  1. Nov 2025
    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript describes evidence for a role for the Nuclear distribution C dynein complex regulator (NudC) in ribosome biogenesis (RiBi) independent of its role in microtubule-associated dynein function.

      Evidence: NudC was picked up in a screen for genes affecting ecdysteroid biosynthesis, a process that occurs in the prothoracic gland (PG; an endocrine organ). In the absence of ecdysone, larvae fail to pupate. Consistent with this finding, the authors find that prothoracic RNAi knockdown of NudC results in a failure in pupation and a decrease in total PG size. They also show defects in polytene chromosome architecture and a mild decrease in overall DNA content. They then turn to the salivary gland (SG) to further characterize the phenotypes associated with NudC knockdown. First, they show that an endogenously tagged version of NudC is abundant in the cytosol and has very weak nuclear staining in the region of the nucleolus (marked by the very low levels of DAPI staining). Knockdown of NudC using RNAi results in reduced NudC-GFP staining, a reduction in SG size, and a reduction in nuclear size. They also find that the SG polytene chromosomes are abnormal and that the production of a SG glue protein as measured by Sgs3-GFP levels and electron dense secretory granules is significantly reduced with NudC knockdown. Interestingly, they also observe the presence of abundant virus-like particles in the nucleus (these structures are thought to originate from retrotransposons and are an indicator of stress). Consistent with increased cellular stress, the authors show activation of JNK signalling. Ultrastructural analysis reveals an abnormally organized ER with an apparent loss of ER-associated ribosomes. They do see other electron dense structures in the cytosol, which they provide evidence (see below) of being P-bodies (structures associated with mRNA). They show that, consistent with a decrease in ribosomes, protein translation is reduced. This is supported by FISH experiments where they show significant decreases in ribosomal RNA (rRNA) transcript levels and decreased translation. 
      Seeing the significant decreases in rRNA levels prompted them to look at overall changes in gene expression, where they discovered that expression of ribosomal protein genes, as well as of other genes involved in ribosome biogenesis (RiBi), is upregulated with knockdown of NudC. They confirm the changes in mRNA for two genes by showing that levels of the corresponding proteins are also upregulated, based on immunostaining of SG cells in which NudC is knocked down. Linking NudC function to a response to defects in RiBi, they show that SG knockdown of several ribosomal biogenesis factors (RBFs) causes similar chromosome structural defects and results in an increase in expression of ribosomal protein genes and of NudC itself. Finally, they show that knockdown of genes encoding proteins linked to NudC function in microtubule dynamics does not produce any of the same phenotypes as knockdown of NudC and RBFs. Altogether, their data support a moonlighting function for NudC in ribosome biogenesis. Moreover, defects in RiBi wherein ribosomal RNAs are decreased seem to result in compensatory changes where both RBFs and ribosomal protein genes are upregulated.

      Major issues:

      The title is a bit problematic since they haven't shown that NudC doesn't also affect normal mitotic cells - they only look at polyploid cells, but that doesn't mean normal mitotic cells are not also affected.

      Also, the authors show that two different RNAi lines for NudC give the same defects - it would be good to know if the RNAi lines target the same or different sequences in the NudC transcripts. Alternatively, it would be equally good to show that trans-allelic combinations of NudC mutants have the same defects in the prothoracic glands and the salivary glands as the RNAi. Instead, they examine only overall body size, developmental delays and lethality in the trans-hetero allelic NudC mutants.

      Results: Lines 261 - 266. Seeing electron dense structures in TEMs and seeing increased Me31B staining by confocal imaging in the cytoplasm is insufficient evidence that the electron dense structures are P-bodies. They could be the P-bodies but they could also be aggregated ribosomes; there is insufficient evidence to "confirm" that they are P-bodies - maybe just say "suggests".

      It would be quite helpful to characterize the "5 blob" and "shortened polytene chromosome arm" defects shown in Figure 2 and Figure 6. Are these partially polytenized chromosomes or are large sections of the chromosomes missing or just underreplicated? What do the chromosomes look like if you lyse the nuclei, spread the chromosomes and stain with DAPI or Hoechst - this is a pretty standard practice and would reveal much more about the structure of the polytene chromosomes.

      Minor points:

      Abstract, lines 28 - 31. I think this gene has been identified before. The authors probably want to say they have discovered a role for this gene in RiBi.

      Introduction, line 66. The protein is imported into the nucleus, where it localizes to the nucleolus - technically the protein is not imported into the nucleolus.

      Introduction, line 70. To be comprehensive in the description of ribosome biogenesis, the authors may want to mention that the 40S and 60S subunits are then exported from the nucleus and form the 80S ribosome in the cytoplasm during translation.

      Introduction, line 98. May want to cite paper showing that Minute mutations turn out to be mutations in individual ribosomal protein genes.

      Results, lines 285 to 298. In situs with multiple probes that detect all parts of both the pre-rRNA and processed rRNA indicate that all are down in the SG in NudC knockdowns, but that while the 18S and 28S rRNAs are down, the internal transcribed spacers go up. Can the authors explain or hypothesize how this could happen?

      Results, line 292. Since they didn't knock down NudC in the fat body cells in this experiment, this comment seems irrelevant.

      Discussion, line 468. I don't think the authors have provided evidence of DNA damage. With the experiments they have shown, the chromosomes look abnormal - not clear what is abnormal.

      Figure 6A. Hoechst is misspelled.

      Referee cross-commenting

      I think the other reviewers have valid criticisms. I think the most critical issues to sort out are (1) what is wrong with the chromosomes, (2) whether diploid tissues are also affected, and (3) whether the RiBi phenotypes are a primary or secondary consequence of NudC loss. I'm not sure how easy it is to do ribosomal profiling on tissues dissected from larvae, as the third reviewer is suggesting.

      Significance

      It is a novel discovery that a protein regulating microtubule dynamics is moonlighting, presumably in the nucleolus, to regulate rRNA synthesis or stabilization. A little information regarding mechanism of action would make this a much more exciting paper - how does it do it? Right now, it is unclear whether rRNA synthesis or maintenance is being regulated and there are no hypotheses regarding how this protein localizes to nucleoli and exactly what it is doing there. Is it regulating all RNA Pol I-dependent transcription? Is it involved in processing or stabilizing rRNAs? The description of the chromosomal defects also falls short of satisfying. As is, this paper is probably of most interest to those who study ribosome biogenesis - an important topic, but without more mechanistic insight, not so interesting to a more general audience.

      My expertise

      I am an experienced Drosophila biologist who is familiar with the system and who fully understands all of the experiments presented in this manuscript and the relevance of the findings.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      *We thank the reviewers for their valuable comments. A common suggestion by all reviewers was that the manuscript would benefit from restructuring. Following their recommendation, we have restructured the manuscript to improve its readability.*

      2. Point-by-point description of the revisions

      __Reviewer #1 (Evidence, reproducibility and clarity (Required)): __ The paper from Louka et al. studies the function of Cep104 during the development of Xenopus embryos. They perform overexpression and knock down experiments and address the consequences on neural tube closure, on ciliogenesis, and MT stability and on apical intercalation. There is a lot of data presented on a wide range of topics. While the data on MTs tracks reasonably well with other reports on Cep104, there are some concerns regarding the quality of some of the data and the interpretations based on the experimental results.

      Specific Points: It is difficult to assess the effect on apical constriction with the data provided. Please show zoomed in higher mag images. Also this should be coupled with a quantification of cell number and proliferation rates, as it is possible that Cep104 mildly affects proliferation / cell division which could affect cell size. Overall this experiment is not really addressing apical constriction since there is no before and after data. Lots of things could affect apical surface area, most notably proliferation rates which one might predict would be affected by subtle changes to MT dynamics.

      __Response: __Following the reviewer's recommendation we now show zoomed in higher magnification images to more clearly demonstrate the larger cell surface area in the morpholino injected neural plate compared to the control non-injected side in the same embryo. We agree with the reviewer that defects in cell proliferation could affect the cell size. If the effect of Cep104 on the cell surface area is caused by defects in cell proliferation, then we would expect this phenotype to persist in other tissues such as the ectoderm. However, we show that this phenotype is specific to the neural plate. On the other hand, if the cell surface area defect is caused by defects in apical constriction, we would expect this phenotype to be stage specific. Following the reviewer's recommendation, we compared the surface area of neuroectoderm cells before and after extensive apical constriction takes place. The new data is shown in Figure S2. Our results show no difference in the surface area of neuroectoderm cells between control tracer injected and morpholino injected neuroepithelial cells at stage 13, before extensive apical constriction, whereas significant differences are observed in stage 15 embryos, during which cells undergo apical constriction. This data strengthens our conclusion that downregulation of Cep104 affects apical constriction.

      "This defect was rescued with expression of exogenous human CEP104-GFP mRNA (300pg mRNA) (Figure 1D-E)." This was partially rescued as the control and the rescue are significantly different.

      __Response: __We thank the reviewer for this important clarification. We edited the text to more clearly reflect our data.

      I am unclear what is being depicted in Figure 1F and G. What is the intense red staining? Is that the blastopore? Which would imply that the stage of analysis is quite different between C and F which is concerning. The same stages should be used.

      __Response: __This is an image of the anterior most region of a stage 15 embryo. Occasionally some embryos do display intense phalloidin staining at the neural plate. We replaced the image with a clearer one and moved this data to Figure S2C.

      S1A has a boxed region as if there was going to be a zoomed in image, but there is not. It would be nice to see it zoomed in. While the localization is indeed at the base and tips of cilia the base looks too dispersed and big to be the basal body?

      __Response: __Following the reviewer's recommendation we now show a zoomed in image of a primary cilium. The boxed area in figure S2A shows the cilium that was used to generate the fluorescence intensity profile plot shown in S2B. The Cep104 signal at the basal body is much stronger compared to the ciliary tip signal. Exposure that allows simultaneous detection of both the base and the tip signal results in overexposure of the signal at the base. This is consistent with observations in primary cilia in cell culture (please refer to Figure 4 in Frikstad et al. 2019 and Figure 3 in Yamazoe et al 2020).

      In other systems the depletion of Cep104 decreases primary cilia length. While the authors claim that neural tube cilia are normal there is no quantification to support that and the provided image is hard to assess.

      __Response: __Following the reviewer's recommendation we now show quantifications of the length of floor plate cilia (Figure S3C). Floor plate cilia are longer than the cilia found elsewhere in the neural tube. This inherent variability in the length of cilia will likely prevent the detection of small changes in the cilium length elicited by downregulation of Cep104. Therefore, we chose to examine the length of floor plate cilia only, in control and morpholino injected cells. Our results show that downregulation of Cep104 leads to the formation of shorter floor plate cilia which is in agreement with published data in other systems.

      While the authors claim broad expression in humans and MO effects in cells without cilia, there is little data supporting the expression of Cep104 in the Xenopus cells being assayed (e.g. goblet cells).

      __Response: __We agree with the reviewer that there is little evidence supporting the expression of Cep104 in Xenopus goblet cells. Cep104 is a very low abundance protein and thus very difficult to detect at endogenous levels. For example, Ryniawec et al. (2023) raised an antibody against Drosophila Cep104 that failed to detect the native (endogenous) protein via western blot or immunofluorescence, but successfully recognized the overexpressed (transgenic) Cep104. A proteomic study by Peshkin et al. 2019 showed that Cep104 levels remain relatively constant throughout Xenopus development, suggesting that this protein is expressed ubiquitously. This data is shown in Figure 4 where we plot the relative expression levels of Cep104 along with two motile cilia specific genes: hydin and RSPH9.

      The data in Figure 2 regarding the explants is difficult to understand and I think missing some key data. The text refers to the level of Gli increasing in the BF injected explants compared to uninjected explants, but the presentation of that is odd as the levels are normalized against uninjected rather than directly compared. And there are no stats for this key experiment. However, I think a bigger concern is the lack of information regarding the presence of cilia. While elongation and Sox2 expression are important they don't address if this tissue is similar to the neural tube in terms of cilia which is key to the interpretations.

      __Response: __Following the reviewer's recommendation we changed the presentation of this data. GLI1 levels are now normalized to XBF2 injected explants. The results are the same: Gli1 levels are 25% lower in morphant XBF2 explants (t-test, p […]).

      We understand the reviewer's concern regarding the presence of cilia in the explants. To our knowledge there are currently no reports on the presence of cilia in the neural ectoderm in Xenopus. We have made several attempts to determine if cilia are present in this tissue during neurulation. However, we have not been able to detect cilia based on immunofluorescence staining for acetylated tubulin and Arl13b in the neural ectoderm. We conclude from this experiment that downregulation of Cep104 negatively affects hedgehog signaling and it remains to be addressed whether this is due to defects in primary cilia.

      The localization of Cep104 GFP in the epidermis and the neuroepithelium does not look similar as stated. One does not really see the punctate pattern in the neuroectoderm.

      Response: We thank the reviewer for pointing this out. To more clearly present this data we now show a plot of the fluorescence profile of Cep104-GFP along cell-cell junctions to demonstrate the punctate localization in the neuroepithelium.

      The experiments linking Cep104 to the tips of paused MTs is not particularly convincing. The depolymerization of MTs with nocodazole, will decrease all MTs as well as MT trafficking which could affect Cep104. Comparing this experiment with taxol treatment to stabilize MTs (and decrease dynamics) would be more convincing. Plus the image provided does not support the claim that the leftover EMTB is marked with Cep104.

      __Response: __Following the reviewer's recommendation we have examined the effect of taxol on the density of Cep104 apical puncta. We injected embryos with CEP104-GFP and EMTB-scarlet, exposed them to 20 μM taxol and imaged them live at stage 38. Embryos not treated with taxol served as the control. As shown in Figure S4, treatment with taxol led to an increase in the density of Cep104 puncta. This further supports our conclusion that Cep104 localizes to the ends of stable or paused microtubules. We also revised Figure 5 to more clearly show that Cep104 remains associated with the ends of nocodazole resistant EMTB labeled microtubules.

      The data in Figure 6 is very difficult to interpret / believe. The quantified effects on MTs are pretty subtle (which is fine...that is why you quantify), but the massive experimental variability questions the meaningfulness of those quantifications. In Fig 6B there are cells with lots of MTs right next to cells with no MTs and both have similar expression levels of Cep104. The staining just doesn't look consistent enough to accurately quantify. Also the effect of nocodazole on MT stability is quite rapid, on the order of seconds to minutes; it is unclear what ON treatment with nocodazole would even be measuring since in that time there would be lots of secondary effects.

      __Response: __We thank the reviewer for this comment. Some cells in the epidermis lack apical microtubules as the reviewer correctly points out. Cells without strong apical microtubule staining are seen in both control and morpholino injected cells. Here we quantified the number of control and morphant cells per embryo that lack apical microtubules (DMSO treated embryos). Our results show that similar numbers of control and morphant cells per embryo appear to lack apical microtubules. We think that the heterogeneity in tubulin signal is not an artifact of immunofluorescence staining since these cells are adjacent to cells with clear tubulin staining. Although the source of this variability is still unknown, the fact that an equal number of control and morphant cells show this phenotype suggests that this is unlikely to be linked to the injections or drug treatment. Those cells were excluded from the quantifications shown in Figures 6C and 6D. It is possible that these cells are preparing to enter mitosis.

      We think that the reviewer refers to the acute effects of nocodazole seen in cell cultures. However, in Xenopus tadpoles we didn't observe any effect on microtubules after short nocodazole treatment at low temperatures.

      The authors propose that overexpressing Cep104 would lead to stabilized MTs which is a reasonable hypothesis, however, they test this in multiciliated cells that already have a ton of acetylated MTs. If their hypothesis is correct it should lead to an increase in acetylated tubulin in non multiciliated cells which don't have much to begin with. This would be a marked improvement as the side projection quantification seems a little suspect as the analysis requires a precise ROI that eliminates the strong cilia acetylation staining. While I believe that could be done, the image provided looks as if it might cut off some of the apical surface which highlights the challenge.

      __Response: __Following the reviewer's recommendation, we examined the effect of Cep104 overexpression in non-MCCs on Xenopus epidermis. We show in Figure 7 that overexpression of Cep104 leads to a significant increase in the levels of acetylated tubulin in the cytoplasm of non-MCCs. We also show that overexpression of GFP alone did not have an effect on microtubule acetylation (Figure S5A). We moved the data on the cytoplasmic levels of acetylated microtubules in MCCs to figure S5B. We would like to clarify that the ROI to mark the cell body of MCCs was drawn right below the apical phalloidin signal to ensure that no signal derived from motile cilia will be included in the quantifications. A more detailed explanation of the quantification methods is included in this revised manuscript.

      Minor: Overall the color choice of images does not conform to the color blind favorable options that are becoming standard in the field. Also to the extent possible the colors should be consistent (e.g. Fig 4 A Cep104-GFP is green but in B it is red).

      __Response: __We thank the reviewer for this comment. We have changed the color choices in the figures to conform to color-blind-friendly conventions.

      The recent Xenopus Cep104 paper was referenced with two references, and the wording of those two sentences was redundant.

      __Response: __We thank the reviewer for this comment. We edited the text accordingly.


      __Reviewer #2 (Evidence, reproducibility and clarity (Required)): __ This study by Louka et al. investigates the function of Cep104, a protein associated with Joubert syndrome, in Xenopus. Several aspects are studied at different scales. Loss of function of this protein suggests a role in neural tube closure, apical constriction, and HH signaling. Moving on in the study, the authors investigate the localization of Cep104 in the primary cilia of the neural tube before focusing on its localization in multiciliated cells. They then look at the consequences of loss of function on motile cilia and conclude that it plays a role in the length of the distal segment. They then show an association of Cep104 with cytoplasmic microtubules in non-multiciliated cells of the Xenopus epidermis. They then analyze the function of Cep104 on these microtubules and show that loss of Cep104 function increases the speed of EB1 comets. They then look at the impact of loss of function on microtubule stability and finally the impact of gain of function. Finally, they return to the multiciliated cells and describe an intercalation defect that correlates with decreases in acetylated tubulin. I think that certain controls are missing and that the choice of illustrations should be reconsidered (better quality, appropriate zoom). In terms of form, the text is not easy to read and the manuscript would benefit from reformatting to highlight the logical links between the different experiments and avoid a catalog-like effect. I would advise the authors to revise their introduction to make it less disjointed and guide readers toward the questions addressed by the manuscript.

      Response: We thank the reviewer for the constructive criticism. We have revised the introduction to make it easier to read.

      Below are specific comments and remarks: Figure 1: Why is the conclusion a "delay" in neural tube closure? At what stage is this analyzed? Is there a recovery of NT closure at a later stage? A: I would suggest providing control pictures of non-injected and tracer-only injected embryos. B: Statistics are missing on the graph. D: Mention what was injected instead of "+ rescue". Close-up pictures would allow a better appreciation of the differences in surface area.

      Response: We thank the reviewer for this comment. The image shown in Figure 1A is from late neurula embryos, stage 18. We conclude that it is a delay in neural tube closure because the neural tube does close and the embryos develop to tailbud stages. To demonstrate the delay in neural tube closure we now include a time lapse sequence of a neurula stage embryo injected with the morpholino unilaterally which shows that the morpholino injected side moves towards the midline slower compared to the control uninjected side (movie 1). We also included a representative image of the dorsal side of a tailbud embryo injected unilaterally with the CEP104 morpholino to show that the neural tube has closed and the embryos develop to tailbud stages (figure S1D).

      Following the reviewer's recommendation, we also show images of embryos injected unilaterally with the tracer alone (Figure S2), we included the statistical analysis for graph 1D, revised image 1D to show that the embryo is injected with the morpholino and CEP104-GFP and provide close ups to allow for better appreciation of the differences in surface area.

      Figure S1: To illustrate the claim that cilia are not affected, it would be good to show injection of tracer alone and compare to tracer + morpholino. Also, to provide a measure of the cilia size.

      __Response: __Following the reviewer's recommendation we quantified the length of floor plate cilia in the neural tube of control and morpholino injected embryos. As explained in our response to a comment by reviewer 1, the floor plate cilia are longer than the cilia found elsewhere in the neural tube. This inherent variability in the length of cilia will likely prevent the detection of small changes in the cilium length elicited by downregulation of Cep104. Therefore, we chose to examine the length of floor plate cilia only in control and morpholino injected cells. Our results show that downregulation of Cep104 leads to the formation of shorter floor plate cilia which is in agreement with published data in other systems (Figure S3C).

      Figure 2: Please provide pictures to illustrate graph D.


      __Response: __The graph in Figure 2D shows RT-qPCR results for CEP104 in BF2 injected explants and in BF2 plus morpholino injected explants, each compared to non-injected explants. We do not have a working antibody that would allow us to show the downregulation at the protein level.

      Figure 5: "Interestingly, most of the nocodazole-resistant stable microtubules were positive for Cep104 (Figure 5C, arrows)." The variation in density of Cep104-GFP signal is not visible on the pictures provided in C. I would suggest showing higher magnifications. Also, in the DMSO treated picture the Cep104-GFP signal looks really different when compared to the Cep104-GFP signal shown in B. Arrows should be reported on all channels. However, it is not clear what we should see with these arrows. 5C: it seems that in the nocodazole treated condition the Cep104-GFP is at the cilia base in MCCs, which is different from the DMSO control condition. The basal body signal was not seen in Figure 3A, which analyzes the localization of Cep104-GFP in MCCs. Why not comment on this? Is it a phenotype on MCCs?

      __Response: __Following the reviewer's recommendation, we now show higher magnifications of the images shown in Figure 5C. We removed the arrows as most reviewers found them confusing. To demonstrate the presence of Cep104 at the ends of nocodazole resistant EMTB labeled microtubules we show zoomed images and a representative fluorescence intensity profile plot. Figure 5B shows an image of a non-MCC whereas Figure 5C shows a larger area on the tadpole epidermis which includes both MCCs and non-MCCs. We thank the reviewer for pointing out that the localization of Cep104 in 5C looks different from 3A. We do not think this is a phenotype on MCCs. In Figure 3A we imaged only the tips of cilia which is why it looks different from 5C in which we imaged the apical surface of the cells as well. We disagree with the reviewer regarding the comment '5C: it seems that in nocodazole treated condition the Cep104-GFP is at the cilia base in MCCs which is different from the DMSO control condition'. The basal body localization of Cep104 is shown in the DMSO image as well. We hope that it will be clear in this revised figure.

      Figure 6: "Intriguingly, morphant non-MCCs have significantly more mean β-tubulin signal compared to control non-MCCs in embryos treated with DMSO (Figure 6C)." This is impossible to appreciate on the figures. Please specify on the figure what is considered a morphant non-MCC versus a control non-MCC. The membrane-cherry positive cells (supposedly morphant? This has to be clarified) show very heterogeneous tubulin expression. If the point here is to show that microtubules are more sensitive to nocodazole in morphant cells as compared to control, I would suggest showing all conditions on the same graph. At least annotate the graph more for a self-explanatory figure (DMSO, nocodazole).

      __Response: __We agree with the reviewer that it is impossible to appreciate the difference in β-tubulin signal between control and morphant non-MCCs. Based on the quantifications of mean β-tubulin fluorescence intensity there is a 5% difference in the fluorescence intensity between the two groups. Statistical analysis using a t-test shows that, although very small, this difference is statistically significant, which is why we mentioned it in the manuscript. We have removed this statement and data from the revised manuscript because this is a very subtle phenotype, and it is beyond the scope of this experiment.
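
      As a rough illustration of why a difference of about 5% can still reach statistical significance, the sketch below computes a Welch t statistic for two hypothetical groups whose means differ by 5%, once with 10 cells per group and once with 200. The intensity values, the group sizes, the `welch_t` and `make_groups` helpers and the choice of Welch's variant are assumptions made for illustration; they are not the data or the exact test used in the manuscript.

```python
from math import sqrt
from statistics import mean, stdev

def welch_t(a, b):
    """Welch's t statistic for two independent samples (hypothetical helper)."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se

def make_groups(n):
    """Build hypothetical control and morphant intensities (arbitrary units)."""
    offsets = [-10.0, -5.0, 0.0, 5.0, 10.0]
    control = [100.0 + offsets[i % 5] for i in range(n)]
    morphant = [1.05 * x for x in control]  # assumed 5% higher signal
    return control, morphant

control_small, morphant_small = make_groups(10)
control_large, morphant_large = make_groups(200)

t_small = welch_t(morphant_small, control_small)
t_large = welch_t(morphant_large, control_large)

print(f"n=10 per group:  t = {t_small:.2f}")
print(f"n=200 per group: t = {t_large:.2f}")
```

      With the same 5% shift, the statistic grows with the number of cells measured, which is how a subtle difference can be statistically significant yet hard to see in a single image.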

      Following the reviewer's recommendation, we clarify that mem-cherry positive cells contain the morpholino and mem-cherry negative cells are the control cells. We marked the morphant non-MCCs with a white asterisk. To address the heterogeneous tubulin levels we provide quantifications which show that a similar number of control and morphant cells appear to lack microtubules. We think that the heterogeneity in tubulin signal is not an artifact of immunofluorescence staining since these cells are adjacent to cells with clear tubulin staining. Although the source of this variability is still unknown, the fact that an equal number of control and morphant cells show this phenotype suggests that this is unlikely to be linked to the injections or drug treatment. Those cells were excluded from the quantifications shown in Figure 6. It is possible that these cells are preparing to enter mitosis. The reviewer is correct; the point of this experiment is to examine the effect of Cep104 downregulation on the sensitivity of microtubules to nocodazole. To more clearly present the results of this experiment we normalize the β-tubulin fluorescence intensity in morphant cells to that in control cells in the same embryo, and we compare the normalized intensity in DMSO and nocodazole treated embryos.
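
      The normalization described above can be sketched as follows. For each embryo, the mean β-tubulin intensity of morphant cells is divided by the mean intensity of control cells from the same embryo, and the resulting ratios are then compared between DMSO and nocodazole treated embryos. All numbers, the dictionary layout and the `normalized_intensity` helper are hypothetical and only illustrate the calculation; the actual quantification is the one described in the manuscript's methods.

```python
from statistics import mean

def normalized_intensity(morphant, control):
    """Mean morphant signal divided by mean control signal within one embryo."""
    return mean(morphant) / mean(control)

# Hypothetical per-cell mean β-tubulin intensities (arbitrary units), one dict per embryo.
embryos = {
    "DMSO": [
        {"morphant": [100.0, 110.0, 105.0], "control": [100.0, 100.0, 100.0]},
        {"morphant": [95.0, 105.0], "control": [100.0, 100.0]},
    ],
    "nocodazole": [
        {"morphant": [40.0, 50.0], "control": [60.0, 60.0]},
        {"morphant": [30.0, 30.0], "control": [50.0, 50.0]},
    ],
}

# One normalized ratio per embryo, grouped by treatment.
ratios = {
    treatment: [normalized_intensity(e["morphant"], e["control"]) for e in group]
    for treatment, group in embryos.items()
}

for treatment, values in ratios.items():
    print(treatment, [round(v, 2) for v in values], "mean ratio:", round(mean(values), 3))
```

      Because each ratio is computed within a single embryo, embryo-to-embryo differences in staining intensity cancel out before the two treatments are compared.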

      Figure 7: Statistics are missing on Graph B

      __Response: __Following the reviewer's recommendation, we added the statistics on the graph.

      Comment on the text: "Cep104 signal shows the characteristic two dot pattern in motile cilia (Figure 3A) that was also observed in a recent study using Xenopus Cep104 [65] and in the cilia of Tetrahymena [50]. This is in agreement with a recent study showing the characteristic two dot pattern for Xenopus Cep104 as well [66]." Refs 65 and 66 are the same (Hong et al., preprint).

      __Response: __We thank the reviewer for pointing this out. We edited the text to avoid repetition and corrected the references.

      "This data suggests that downregulation of CEP104 affects the stability of cytoplasmic microtubules." I would suggest a more precise conclusion stating how it is affected. More stable? Less stable? This is important for the follow-up demonstration.

      Response: We edited the text according to the reviewer's recommendation to precisely conclude that downregulation of Cep104 makes cytoplasmic microtubules less stable.

      Movies: Please annotate movies 2 and 3 properly so the reader can know what he/she is looking at.


      Response: Following the reviewer's comment, we revised the movie annotations to help the reader know what they are looking at.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript entitled "Ciliary and non-ciliary functions of Cep104 in Xenopus" by Louka et al. investigates roles for the centriole and cilia tip protein Cep104 in Xenopus embryos. The authors show that depletion of Cep104 prevents neural tube closure due to inefficient apical constriction of neural cells and defective hedgehog signaling. Cep104 depletion also resulted in structural and functional ciliary defects in multi-ciliated cells. Surprisingly, the authors discover a role for Cep104 in stabilizing cytoplasmic microtubules in non-ciliated and multi-ciliated cells. Reduced microtubule stability in Cep104-depleted cells correlated with reduced apical intercalation of multi-ciliated cells in the epidermis.

      Overall, I find this manuscript difficult to understand because the experiments lack description of the findings within a normal developmental context and the findings are not developed into a cohesive narrative. I do find the study to be potentially impactful as the authors characterize Cep104 in a novel system (previous peer-reviewed studies have investigated Cep104 in human cell lines, Drosophila, zebrafish, Tetrahymena, and Chlamydomonas) with disease-relevant biology (neural development); however, mechanistic links are not properly explored. Over the course of their investigation, the authors made the novel finding that Cep104 controls the dynamics of cytoplasmic microtubules. However, this is not directly tested and potential pleiotropic effects of the developmental defects caused by Cep104 depletion confound the results.

      Response: We thank the reviewer for their comments. We tried to address this by restructuring the manuscript to describe the results in more detail within a normal developmental context.

      Major Critiques: The developmental context of experiments is not made clear. The authors use different tissues at varying developmental stages to perform experiments. However, these findings are not explored in depth and, therefore, the manuscript does not advance our understanding of Cep104's role in any of the processes explored.

      Response: We thank the reviewer for their comment. We took advantage of different tissues during Xenopus development to understand the cellular and molecular function of this protein in vivo. In this manuscript we show that Cep104 is involved in neural tube closure, likely through its effect on apical constriction. Our data show that Cep104 is important for the stability of cytoplasmic microtubules, and this is further demonstrated through its role in apical intercalation of multiciliated cells, a process known to depend on stable microtubules. Although our data do not advance our understanding of developmental processes such as apical constriction and MCC apical intercalation, they do improve our understanding of how Cep104 impacts cytoplasmic microtubules, which has not yet been addressed in vivo.

      While the potential role of Cep104 in cytoplasmic microtubule regulation is intriguing, the experiments in the manuscript do not directly test this function. Because Cep104 depletion appears to have a profound developmental effect, it is difficult to interpret changes to EB1 velocity as directly attributed to Cep104 function. Additionally, the only evidence for Cep104 localization occurs in cells overexpressing human Cep104. The authors must directly visualize endogenous Cep104 to conclude microtubule or membrane localization, which they can also use to demonstrate Cep104 depletion in the morpholino experiments. Additionally, the assertion that Cep104 is binding plus-ends of cytoplasmic microtubules is not experimentally supported.

      Response: Unfortunately, we cannot directly visualize endogenous Cep104 because there is no commercially available antibody that works in Xenopus. Cep104 is a very low-abundance protein, as highlighted in the study by John M. Ryniawec et al. 2023, who generated an antibody against Drosophila Cep104 that detected the GFP-tagged DmCep104 but failed to detect the endogenous protein. Given that the ciliary and basal body signal of Cep104 represents the cumulative signal from nine microtubules, one can appreciate the difficulty of observing the Cep104 signal on individual microtubules. None of the commercially available Cep104 antibodies that we have tested worked against the Xenopus protein in immunofluorescence or western blot experiments. We agree with the reviewer that we do not experimentally test the binding of Cep104 to the microtubule plus-end. This has been demonstrated by others. Jiang et al. 2012 showed that GFP-Cep104 co-immunoprecipitates with GST-EB1 but not with GST-EB1 that lacks the tail, which contains the SxIP-binding motif. Yamazoe et al. 2020 showed that exogenous Cep104 co-immunoprecipitates with exogenous EB1, whereas Cep104 with a mutated SxIP motif (SKNN) fails to co-immunoprecipitate with EB1. This shows that Cep104 interacts with EB1 through its SxIP motif. In addition, overexpression of Cep104 recruits Cep97 to microtubule tips, suggesting that it acts as a +TIP protein. A recent study by Saunders et al. 2025 showed that, in in vitro microtubule reconstitution assays, Cep104 could not autonomously bind the microtubule plus-end at low concentrations, but in the presence of EB3 it could bind the microtubule plus-end and block microtubule polymerization at the same low concentration. This shows that Cep104 interacts with EB3, localizes to the microtubule plus-end and affects its dynamics in vitro.
      We added this information to the manuscript to show more clearly that the interaction of Cep104 and EB proteins is well documented. We anticipate that this interaction will hold true in all cell types where the two proteins are co-expressed.

      Additional Critiques: Figure S1. I only see the emergence of a shorter product after Cep104 depletion. Should PCR using Exon5-7 still work in successful knockdown? If not, then it is unclear what was quantified to determine Cep104 depletion as morpholino bands appear no different than control.

      Response: We thank the reviewer for this comment. PCR using exon 5-7 will not work when splice blocking by the morpholino takes place. This is a knockdown approach and the efficiency of the morpholino is about 90%. Upon completion of the RT-qPCR cycle the samples were analyzed by gel electrophoresis to demonstrate that 1) alternative splicing took place (see two products with exon 3-7 primers) and 2) the presence of a single product for all primer sets used.

      Figure 1A. Is this an example of an open or closed NTC? Show data used to determine the statement "no difference during convergent extension".

      Response: This is an example of an embryo that was unilaterally injected with the morpholino. The left side is the control non-injected side and the right side is the morpholino-injected side. We added this information to the figure to make it more self-explanatory. In Figure 2 the elongation of the BF2-injected explants is due to convergent extension. The statement "no difference during convergent extension" was removed from the revised manuscript.

      Figure S2C. What does "Does not effect formation of cilia" mean? Does Cep104 depletion not affect number, length, etc.? Show the quantitation used to determine this.

      Response: Following the reviewer's recommendation, we quantified the length of floor plate cilia in control and morpholino-injected embryos. As mentioned in our response to reviewers 1 and 2, floor plate cilia are longer than the cilia found elsewhere in the neural tube. This inherent variability in the length of cilia would likely prevent the detection of small changes in cilium length elicited by downregulation of Cep104. Therefore, we chose to examine the length of floor plate cilia only, in control and morpholino-injected cells. Our results show that downregulation of Cep104 leads to the formation of shorter floor plate cilia, which is in agreement with published data in other systems.

      Figure 5B. Along with strong Cep104 localization to membranes, there also appears to be strong EMTB localization. Is this also present in beta-tubulin immunostaining? Are these localizing to a cortical population of microtubules or to the membrane?

      Response: We thank the reviewer for their comment. The Cep104 puncta at the cell periphery are reduced or lost upon nocodazole treatment; thus we conclude that Cep104 localizes to microtubules and not to the cell membrane (Figure 5C, zoomed images). Of course, we cannot exclude the possibility that microtubules are required to target Cep104 to the plasma membrane. We edited the text to clearly state this conclusion.

      Figure 6C and 6D. These two panels have the same labels. The authors should denote that 6D is in nocodazole-treated explants.

      Response: We thank the reviewer for this comment. We edited this figure to more clearly present the results of this experiment: We normalized the β-tubulin levels in morphant cells to that of control cells in the same embryo (mosaic morphant embryos were used in this experiment). The graph shows the mean normalized β-tubulin levels per embryo treated with DMSO or nocodazole.
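The per-embryo normalization described in this response can be sketched as follows. This is only an illustrative sketch and not the authors' analysis pipeline: it assumes the normalized value for an embryo is the ratio of the mean morphant-cell intensity to the mean control-cell intensity, and the function names, the dictionary layout and all intensity values are hypothetical.

```python
from statistics import mean


def normalized_intensity(morphant, control):
    """Ratio of mean morphant to mean control intensity within one embryo.

    Assumption: a ratio of means; the source does not state the exact formula.
    """
    if not morphant or not control:
        raise ValueError("each embryo needs morphant and control measurements")
    return mean(morphant) / mean(control)


def mean_normalized_by_treatment(embryos):
    """Average the per-embryo ratios separately for each treatment."""
    ratios = {}
    for embryo in embryos:
        ratio = normalized_intensity(embryo["morphant"], embryo["control"])
        ratios.setdefault(embryo["treatment"], []).append(ratio)
    return {treatment: mean(values) for treatment, values in ratios.items()}


# Made-up intensities (arbitrary units), for illustration only.
embryos = [
    {"treatment": "DMSO", "morphant": [120.0, 100.0], "control": [100.0, 100.0]},
    {"treatment": "DMSO", "morphant": [90.0, 110.0], "control": [100.0, 100.0]},
    {"treatment": "nocodazole", "morphant": [30.0, 50.0], "control": [80.0, 80.0]},
]
result = mean_normalized_by_treatment(embryos)
print(result)
```

Normalizing within each mosaic embryo means that embryo-to-embryo differences in staining or imaging affect morphant and control cells alike, so they largely cancel in the ratio before the DMSO and nocodazole groups are compared.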

      Figure 7. What are Cep104 levels at stage 18-19?

      Response: Following the reviewer's comment, we now show the Cep104 protein expression levels during Xenopus development as reported on Xenbase (Figure 4). Cep104 is expressed at low levels from gastrulation to tailbud stages (Figure 4D).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript entitled "Ciliary and non-ciliary functions of Cep104 in Xenopus" by Louka et al. investigates roles for the centriole and cilia tip protein Cep104 in Xenopus embryos. The authors show that depletion of Cep104 prevents neural tube closure due to inefficient apical constriction of neural cells and defective hedgehog signaling. Cep104 depletion also resulted in structural and functional ciliary defects in multi-ciliated cells. Surprisingly, the authors discover a role for Cep104 in stabilizing cytoplasmic microtubules in non-ciliated and multi-ciliated cells. Reduced microtubule stability in Cep104-depleted cells correlated with reduced apical intercalation of multi-ciliated cells in the epidermis.

      Overall, I find this manuscript difficult to understand because the experiments lack description of the findings within a normal developmental context and the findings are not developed into a cohesive narrative. I do find the study to be potentially impactful as the authors characterize Cep104 in a novel system (previous peer-reviewed studies have investigated Cep104 in human cell lines, Drosophila, zebrafish, Tetrahymena, and Chlamydomonas) with disease-relevant biology (neural development); however, mechanistic links are not properly explored. Over the course of their investigation, the authors made the novel finding that Cep104 controls the dynamics of cytoplasmic microtubules. However, this is not directly tested and potential pleiotropic effects of the developmental defects caused by Cep104 depletion confound the results.

      Major Critiques:

      The developmental context of experiments is not made clear. The authors use different tissues at varying developmental stages to perform experiments. However, these findings are not explored in depth and, therefore, the manuscript does not advance our understanding of Cep104's role in any of the processes explored.

      While the potential role of Cep104 in cytoplasmic microtubule regulation is intriguing, the experiments in the manuscript do not directly test this function. Because Cep104 depletion appears to have a profound developmental effect, it is difficult to interpret changes to EB1 velocity as directly attributed to Cep104 function. Additionally, the only evidence for Cep104 localization occurs in cells overexpressing human Cep104. The authors must directly visualize endogenous Cep104 to conclude microtubule or membrane localization, which they can also use to demonstrate Cep104 depletion in the morpholino experiments. Additionally, the assertion that Cep104 is binding plus-ends of cytoplasmic microtubules is not experimentally supported.

      Additional Critiques:

      Figure S1. I only see the emergence of a shorter product after Cep104 depletion. Should PCR using Exon5-7 still work in successful knockdown? If not, then it is unclear what was quantified to determine Cep104 depletion as morpholino bands appear no different than control.

      Figure 1A. Is this an example of an open or closed NTC? Show data used to determine the statement "no difference during convergent extension".

      Figure S2C. What does "Does not effect formation of cilia" mean? Does Cep104 depletion not affect number, length, etc.? Show the quantitation used to determine this.

      Figure 5B. Along with strong Cep104 localization to membranes, there also appears to be strong EMTB localization. Is this also present in beta-tubulin immunostaining? Are these localizing to a cortical population of microtubules or to the membrane?

      Figure 6C and 6D. These two panels have the same labels. The authors should denote that 6D is in nocodazole-treated explants.

      Figure 7. What are Cep104 levels at stage 18-19?

      Significance

      Overall, I find this manuscript difficult to understand because the experiments lack description of the findings within a normal developmental context and the findings are not developed into a cohesive narrative. I do find the study to be potentially impactful as the authors characterize Cep104 in a novel system (previous peer-reviewed studies have investigated Cep104 in human cell lines, Drosophila, zebrafish, Tetrahymena, and Chlamydomonas) with disease-relevant biology (neural development); however, mechanistic links are not properly explored. Over the course of their investigation, the authors made the novel finding that Cep104 controls the dynamics of cytoplasmic microtubules. However, this is not directly tested and potential pleiotropic effects of the developmental defects caused by Cep104 depletion confound the results.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study by Louka et al. investigates the function of Cep104, a protein associated with Joubert syndrome, in Xenopus. Several aspects are studied at different scales. Loss of function of this protein suggests a role in neural tube closure, apical constriction, and HH signaling. Moving on in the study, the authors investigate the localization of Cep104 in the primary cilia of the neural tube before focusing on its localization in multiciliated cells. They then look at the consequences of loss of function on motile cilia and conclude that it plays a role in the length of the distal segment. They then show an association of Cep104 with cytoplasmic microtubules in non-multiciliated cells of the Xenopus epidermis. They then analyze the function of Cep104 on these microtubules and show that loss of Cep104 function increases the speed of EB1 comets. They then looked at the impact of loss of function on microtubule stability and finally the impact of gain of function. Finally, they returned to the multiciliated cells and described an intercalation defect that correlated with decreases in acetylated tubulin. I think that certain controls are missing and that the choice of illustrations should be reconsidered (better quality, appropriate zoom). In terms of form, the text is not easy to read and the manuscript would benefit from reformatting to highlight the logical links between the different experiments and avoid a catalog-like effect. I would advise the authors to revise their introduction to make it less disjointed and guide readers toward the questions addressed by the manuscript.

      Below are specific comments and remarks:

      Figure 1:

      Why is the conclusion a "delay" in neural tube closure? At what stage is this analyzed? Is there a recovery of NT closure at a later stage? A: I would suggest providing control pictures of non-injected and tracer-only injected embryos. B: Statistics are missing on the graph. D: Mention what was injected instead of "+ rescue". Close-up pictures would allow a better appreciation of the differences in surface area.

      Figure S1:

      To illustrate the claim that cilia are not affected, it would be good to show injection of tracer alone and compare to tracer + morpholino. Also, to provide a measure of the cilia size.

      Figure 2:

      Please provide pictures to illustrate graph D.

      Figure 5:

      "Interestingly, most of the nocodazole-resistant stable microtubules were positive for Cep104 (Figure 5C, arrows)." - The variation in density of Cep104-GFP signal is not visible in the pictures provided in C. I would suggest showing higher magnifications. Also, in the DMSO-treated picture the Cep104-GFP signal looks really different from the Cep104-GFP signal shown in B. Arrows should be reported on all channels. However, it is not clear what we should see with these arrows. 5C: it seems that in the nocodazole-treated condition the Cep104-GFP is at the cilia base in MCCs, which is different from the DMSO control condition. The basal body signal was not seen in Figure 3A, which analyzes the localization of Cep104-GFP in MCCs. Why not comment on this? Is it a phenotype on MCCs?

      Figure 6:

      Intriguingly, morphant non-MCCs have significantly more mean β-tubulin signal compared to control non-MCCs in embryos treated with DMSO (Figure 6C). - Impossible to appreciate on the figures. Please specify on the figure what is considered a morphant non-MCC versus a control non-MCC. The membrane-cherry positive cells (supposedly morphant? this has to be clarified) show very heterogeneous tubulin expression.

      If the point here is to show that microtubules are more sensitive to nocodazole in morphant cells than in control cells, I would suggest showing all conditions on the same graph. At least annotate the graph more for a self-explanatory figure (DMSO, nocodazole).

      Figure 7:

      Statistics are missing on Graph B

      Comment on the text:

      "Cep104 signal shows the characteristic two dot pattern in motile cilia (Figure 3A) that was also observed in a recent study using Xenopus Cep104 (ref. 65) and in the cilia of Tetrahymena (ref. 50). This is in agreement with a recent study showing the characteristic two dot pattern for Xenopus Cep104 as well (ref. 66)" - ref 65 and 66b are the same (Hong et al., preprint)

      "This data suggests that downregulation of CEP104 affects the stability of cytoplasmic microtubules." - I would suggest a more precise conclusion by stating how is it affected? More stable? Less stable? Important for the follow-up demonstration.

      Movies:

      Please annotate movies 2 and 3 properly so the reader can know what he/she is looking at.

      Referees cross-commenting

      Similar feeling that reviews are consistent

      Significance

      This study investigates the role of the protein Cep104 in Xenopus. Cep104 is a protein associated with Joubert syndrome, whose role in primary cilia has been extensively documented. While its localization at the tip of motile cilia has also been reported, this study provides functional evidence for the role of Cep104 in motile cilia. In addition, the study looks at the role of Cep104 on non-ciliary microtubules, which is the original aspect of the paper and may ultimately lead to a better understanding of Joubert syndrome. However, I believe that the evidence provided (controls, illustrations) needs to be improved. This paper will be of interest to a specialized audience with an interest in proteins associated with cilia and microtubules.

      I am a cell biologist specialized in the study of multiciliated cells using advanced imaging methods and Xenopus and mice as models. I believe my expertise was a perfect match for this manuscript.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The paper from Louka et al. studies the function of Cep104 during the development of Xenopus embryos. They perform overexpression and knock down experiments and address the consequences on neural tube closure, on ciliogenesis, and MT stability and on apical intercalation. There is a lot of data presented on a wide range of topics. While the data on MTs tracks reasonably well with other reports on Cep104, there are some concerns regarding the quality of some of the data and the interpretations based on the experimental results.

      Specific Points:

      It is difficult to assess the effect on apical constriction with the data provided. Please show zoomed in higher mag images. Also this should be coupled with a quantification of cell number and proliferation rates, as it is possible that Cep104 mildly affects proliferation / cell division which could affect cell size. Overall this experiment is not really addressing apical constriction since there is no before and after data. Lots of things could affect apical surface area, most notably proliferation rates which one might predict would be affected by subtle changes to MT dynamics.

      "This defect was rescued with expression of exogenous human CEP104-GFP mRNA (300pg mRNA) (Figure 1D-E)." This was partially rescued as the control and the rescue are significantly different.

      I am unclear what is being depicted in Figure 1F and G. What is the intense red staining? Is that the blastopore? Which would imply that the stage of analysis is quite different between C and F which is concerning. The same stages should be used.

      S1A has a boxed region as if there was going to be a zoomed in image, but there is not. It would be nice to see it zoomed in. While the localization is indeed at the base and tips of cilia the base looks too dispersed and big to be the basal body?

      In other systems the depletion of Cep104 decreases primary cilia length. While the authors claim that neural tube cilia are normal there is no quantification to support that and the provided image is hard to assess.

      While the authors claim broad expression in humans and MO effects in cells without cilia, there is little data supporting the expression of Cep104 in the Xenopus cells being assayed (e.g. goblet cells).

      The data in Figure 2 regarding the explants is difficult to understand and I think missing some key data. The text refers to the level of Gli increasing in the BF injected explants compared to uninjected explants, but the presentation of that is odd as the levels are normalized against uninjected rather than directly compared. And there are no stats for this key experiment. However, I think a bigger concern is the lack of information regarding the presence of cilia. While elongation and Sox2 expression are important they don't address if this tissue is similar to the neural tube in terms of cilia which is key to the interpretations.

      The localization of Cep104 GFP in the epidermis and the neuroepithelium does not look similar as stated. One does not really see the punctate pattern in the neuroectoderm.

      The experiments linking Cep104 to the tips of paused MTs are not particularly convincing. The depolymerization of MTs with nocodazole will decrease all MTs as well as MT trafficking, which could affect Cep104. Comparing this experiment with taxol treatment to stabilize MTs (and decrease dynamics) would be more convincing. Plus the image provided does not support the claim that the leftover EMTB is marked with Cep104.

      The data in Figure 6 is very difficult to interpret / believe. The quantified effects on MTs are pretty subtle (which is fine...that is why you quantify), but the massive experimental variability questions the meaningfulness of those quantifications. In Fig 6B there are cells with lots of MTs right next to cells with no MTs, and both have similar expression levels of Cep104. The staining just doesn't look consistent enough to accurately quantify. Also, the effect of nocodazole on MT stability is quite rapid, on the order of seconds to minutes; it is unclear what overnight (ON) treatment with nocodazole would even be measuring, since in that time there would be lots of secondary effects.

      The authors propose that overexpressing Cep104 would lead to stabilized MTs, which is a reasonable hypothesis; however, they test this in multiciliated cells that already have a ton of acetylated MTs. If their hypothesis is correct, it should lead to an increase in acetylated tubulin in non-multiciliated cells, which don't have much to begin with. This would be a marked improvement, as the side projection quantification seems a little suspect because the analysis requires a precise ROI that eliminates the strong cilia acetylation staining. While I believe that could be done, the image provided looks as if it might cut off some of the apical surface, which highlights the challenge.

      Minor:

      Overall the color choice of images does not conform to the color blind favorable options that are becoming standard in the field. Also to the extent possible the colors should be consistent (e.g. Fig 4 A Cep104-GFP is green but in B it is red).

      The recent Xenopus Cep104 paper was referenced with two references, and the wording of those two sentences was redundant.

      Referees cross-commenting

      I feel that all three reviews are pretty consistent and I do not have any issues with the other reviews.

      Significance

      Strengths. Cep104 appears to be a hot topic right now as there are several papers on bioRxiv. I suspect that this led to a bit of a rushed submission. The other papers focus mostly on understanding the mechanisms of the ciliary roles of Cep104, which are well established. In other systems the broad phenotypes associated with Cep104 depletion are assumed to be through loss of cilia-mediated HH signaling. This paper proposes a number of non-ciliary roles for Cep104, which given its broad distribution could be relevant. If true, these findings would add considerably to the field. Given that MTs do lots of things other than make cilia, it would not be too surprising for Cep104 to have MT-specific phenotypes as proposed here.

      Weaknesses. The quality of much of the data makes it difficult to assess the claims of broad importance. Key experiments critical to the interpretation of the data are lacking.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Authors’ reply (Ono et al.)

      Review Commons Refereed Preprint #RC-2025-03137

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

      We greatly appreciate the reviewer’s supportive comments. The reviewer has accurately recognized our new findings concerning the collaborative roles of condensin II and cohesin in establishing and maintaining interphase chromosome territories.

      Major point:

      They propose a functional 'handover' from condensin II to cohesin, for the organization of CTs at the M-to-G1 transition. However, the 'handover', i.e. difference in timing of executing their functions, was not experimentally substantiated. Ideally, they can deplete condensin II and cohesin at different times to prove the 'handover'. However, this would require the use of two different degron tags and go beyond the revision of this manuscript. At least, based on the literature, the authors should discuss why they think condensin II and cohesin should work at different timings in the CT organization.

      We take this comment seriously, especially because Reviewer #2 also expressed the same concern. 

      First of all, we must admit that the basic information underlying the “handover” idea was insufficiently explained in the original manuscript. Let us make it clear below:

      • Condensin II binds to chromosomes and is enriched along their axes from anaphase through telophase (Ono et al., 2004; Hirota et al., 2004; Walther et al., 2018).
      • In early G1, condensin II is diffusely distributed within the nucleus and does not bind tightly to chromatin, as shown by detergent extraction experiments (Ono et al., 2013).
      • Cohesin starts binding to chromatin when the cell nucleus reassembles (i.e., during the cytokinesis stage shown in Fig. 1B), apparently replacing condensins I and II (Brunner et al., 2025).
      • Condensin II progressively rebinds to chromatin from S through G2 phase (Ono et al., 2013).

      The cell cycle-dependent changes in chromosome-bound condensin II and cohesin summarized above are illustrated in Fig. 1A. We now realize that Fig. 1B in the original manuscript was inconsistent with Fig. 1A, creating unnecessary confusion, and we sincerely apologize for this. The fluorescence images shown in the original Fig. 1B were captured without detergent extraction prior to fixation, giving the misleading impression that condensin II remained bound to chromatin from cytokinesis through early G1. This was not our intention. To clarify this, we have repeated the experiment with detergent extraction and replaced the original Fig. 1B with a revised panel. Figs. 1A and 1B are now more consistent with each other. Accordingly, we have modified the corresponding sentences as follows:

      Although condensin II remains nuclear throughout interphase, its chromatin binding is weak in G1 and becomes robust from S phase through G2 (Ono et al., 2013). Cohesin, in contrast, replaces condensin II in early G1 (Fig. 1 B) (Abramo et al., 2019; Brunner et al., 2025), and establishes topologically associating domains (TADs) in the G1 nucleus (Schwarzer et al., 2017; Wutz et al., 2017).

      While there is a loose consensus in the field that condensin II is replaced by cohesin during the M-to-G1 transition, it remains controversial whether there is a short window during which neither condensin II nor cohesin binds to chromatin (Abramo et al., 2019), or whether there is a stage in which the two SMC protein complexes “co-occupy” chromatin (Brunner et al., 2025). Our images shown in the revised Fig. 1B cannot clearly distinguish between these two possibilities.

      From a functional point of view, the results of our depletion experiments are more readily explained by the latter possibility. If this is the case, the “interplay” or “cooperation” rather than the “handover” may be a more appropriate term to describe the functional collaboration between condensin II and cohesin during the M-to-G1 transition. For this reason, we have avoided the use of the word “handover” in the revised manuscript. It should be emphasized, however, that given their distinct chromosome-binding kinetics, the cooperation of the two SMC complexes during the M-to-G1 transition is qualitatively different from that observed in G2. Therefore, the central conclusion of the present study remains unchanged.

      For example, a sentence in Abstract has been changed as follows:

      a functional interplay between condensin II and cohesin during the mitosis-to-G1 transition is critical for establishing chromosome territories (CTs) in the newly assembling nucleus.

      While the reviewer suggested one experiment, it is clearly beyond the scope of the current study. It should also be noted that even if such a cell line were available, the proposed application of sequential depletion to cells progressing from mitosis to G1 phase would be technically challenging and unlikely to produce results that could be interpreted with confidence.

      Other points:

      Figure 2E: It seems that the chromosome length without IAA is shorter in Rad21-aid cells than H2-aid cells or H2-aid Rad21-aid cells. How can this be interpreted? This comment is well taken. A related comment was made by Reviewer #3 (Major comment #2). Given the substantial genetic manipulations applied to establish multiple cell lines used in the present study, it is, strictly speaking, not straightforward to compare the -IAA controls between different cell lines. Such variations are most prominently observed in Fig. 2E, although they can also be observed to a lesser extent in other experiments (e.g., Fig. 3E). This issue is inherently associated with all studies using genetically manipulated cell lines and therefore cannot be completely avoided. For this reason, we focus on the differences between -IAA and +IAA within each cell line, rather than comparing the -IAA conditions across different cell lines. In this sense, a sentence in the original manuscript (lines 178-180) was misleading. In the revised manuscript, we have modified the corresponding and subsequent sentences as follows:

      Although cohesin depletion had a marginal effect on the distance between the two site-specific probes (Fig.2, C and E), double depletion did not result in a significant change (Fig.2, D and E), consistent with the partial restoration of centromere dispersion (Fig. 1G).

      In addition, we have added a section entitled “Limitations of the study” at the end of the Discussion to address technical issues that are inevitably associated with the current approach.

      Figure 3: Regarding the CT morphology, could they explain further the difference between 'elongated' and 'cloud-like (expanded)'? Is it possible to quantify the frequency of these morphologies? In the original manuscript, we provided data that quantitatively distinguished between the “elongated” and “cloud-like” phenotypes. Specifically, Fig. 2E shows that the distance between two specific loci (Cen 12 and 12q15) is increased in the elongated phenotype but not in the cloud-like phenotype. In addition, the cloud-like morphology clearly deviated from circularity, as indicated by the circularity index (Fig. 3F). However, because circularity can also decrease in rod-shaped chromosomes, these datasets alone may not be sufficiently convincing, as the reviewer pointed out. We have now included an additional parameter, the aspect ratio, defined as the ratio of an object’s major axis to its minor axis (new Fig. 3F). While this intuitive parameter was altered upon condensin II depletion and double depletion, again, we acknowledge that it is not sufficient to convincingly distinguish between the elongated and cloud-like phenotypes proposed in the original manuscript. For these reasons, in the revised manuscript, we have toned down our statements regarding the differences in CT morphology between the two conditions. Nonetheless, together with the data from Figs. 1 and 2, it is clear that the Rabl configuration observed upon condensin II depletion is further exacerbated in the absence of cohesin. Accordingly, we have modified the main text and the cartoon (Fig 3H) to more accurately depict the observations summarized above.
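      As an illustration of the aspect ratio mentioned in the reply above, the following minimal sketch estimates the major-to-minor axis ratio of a segmented object from the second moments of its pixel coordinates. The function name `aspect_ratio` and the two toy shapes are hypothetical and are not taken from the manuscript; the moment-based ellipse used here is an assumption and may differ in detail from the fit performed by the authors' image-analysis software.

```python
import math

def aspect_ratio(pixels):
    """Major/minor axis ratio of the ellipse sharing the pixel set's second moments."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Eigenvalues of the 2x2 covariance matrix give the squared axis scales.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return math.sqrt((half_trace + root) / (half_trace - root))

# Hypothetical toy objects: a 40 x 10 rod and a symmetric, cross-shaped object.
rod = [(x, y) for x in range(40) for y in range(10)]
cross = [(x, y) for x in range(30) for y in range(30)
         if 10 <= x < 20 or 10 <= y < 20]

print(round(aspect_ratio(rod), 2))    # close to 4 for the rod
print(round(aspect_ratio(cross), 2))  # close to 1 for the symmetric cross
```

      In this toy setting the rod has a high aspect ratio whereas the irregular but symmetric cross does not, even though neither object is circular. Real CT masks are far less idealized, which is consistent with the authors' caution that this parameter alone does not cleanly separate the two phenotypes.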

      Figure 5: How did they assign C, P and D3 for two chromosomes? The assignment seems obvious in some cases, but not in other cases (e.g. in the image of H2-AID#2 +IAA, two D3s can be connected to two Ps in the other way). They may have avoided line crossing between two C-P-D3 assignments, but can this be justified when the CT might be disorganized e.g. by condensin II depletion? This comment is well taken. As the reviewer suspected, we avoided line crossing between two sets of assignments. Whenever there was ambiguity, such images were excluded from the analysis. Because most chromosome territories derived from two homologous chromosomes are well separated even under the depleted conditions as shown in Fig. 6C, we did not encounter major difficulties in making assignments based on the criteria described above. We therefore remain confident that our conclusion is valid.

      That said, we acknowledge that our assignments of the FISH images may not be entirely objective. We have added this point to the “Limitations of the study” section at the end of the Discussion.

      Figure 6F: The mean is not indicated on the right-hand side graph, in contrast to other similar graphs. Is this an error? We apologize for having caused this confusion. First, we would like to clarify that the right panel of Fig. 6F should be interpreted together with the left panel, unlike the seemingly similar plots shown in Figs. 6G and 6H. In the left panel of Fig. 6F, the percentages of CTs that contact the nucleolus are shown in grey, whereas those that do not are shown in white. All CTs classified in the “non-contact” population (white) have a value of zero in the right panel, represented by the bars at 0 (i.e., each bar corresponds to a collection of dots having a zero value). In contrast, each CT in the “contact” population (grey) has a unique contact ratio value in the right panel. Because the right panel consists of two distinct groups, we reasoned that placing mean or median bars would not be appropriate. This was why no mean or median bars were shown in the right panel (the same is true for Fig. S5 A and B).

      That said, for the reviewer’s reference, we have placed median bars in the right panel (see below). In the six cases of H2#2 (-/+IAA), Rad21#2 (-/+IAA), Double#2 (-IAA), and Double#3 (-IAA), the median bars are located at zero (note that in these cases the mean bars [black] completely overlap with the “bars” derived from the data points [blue and magenta]). In the two cases of Double#2 (+IAA) and Double#3 (+IAA), they are placed at values of ~0.15. Statistically significant differences between -IAA and +IAA are observed only in Double#2 and Double#3, as indicated by the P-value shown on the top of the panel. Thus, we are confident in our conclusion that CTs undergo severe deformation in the absence of both condensin II and cohesin.

      Figure S1A: The two FACS profiles for Double-AID #3 Release-2 may be mixed up between -IAA and +IAA. The reviewer is right. This inadvertent error has been corrected.

      The method section explains that 'circularity' shows 'how closely the shape of an object approximates a perfect circle (with a value of 1 indicating a perfect circle), calculated from the segmented regions'. It would be helpful to provide further methodological details about it. We have added further explanations regarding the circularity in Materials and Methods together with a citation (two added sentences are underlined below):

      To analyze the morphology of nuclei, CTs, and nucleoli, we measured “circularity,” a morphological index that quantifies how closely the shape of an object approximates a perfect circle (value = 1). Circularity was defined as 4π × Area/Perimeter², where both the area and perimeter of each segmented object were obtained using ImageJ. This index ranges from 0 to 1, with values closer to 1 representing more circular objects and lower values corresponding to elongated or irregular shapes (Chen et al., 2017).

      Chen, B., Y. Wang, S. Berretta and O. Ghita. 2017. Poly Aryl Ether Ketones (PAEKs) and carbon-reinforced PAEK powders for laser sintering. J Mater Sci 52:6004-6019.
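      The circularity definition quoted above can be sketched numerically as follows. This is only an illustration under stated assumptions: objects are represented as hypothetical polygon outlines, and the helper names `polygon_area_perimeter` and `circularity` are invented here. The authors obtained area and perimeter from segmented regions in ImageJ, not from this code.

```python
import math

def polygon_area_perimeter(vertices):
    """Area (shoelace formula) and perimeter of a closed polygon outline."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2, perimeter

def circularity(vertices):
    """Circularity = 4 * pi * Area / Perimeter**2 (1 for a perfect circle)."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4 * math.pi * area / perimeter ** 2

# Hypothetical outlines: a near-circle (360-gon), a unit square, and a 4 x 1 rod.
circle = [(math.cos(math.radians(k)), math.sin(math.radians(k))) for k in range(360)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rod = [(0, 0), (4, 0), (4, 1), (0, 1)]

for name, shape in [("circle", circle), ("square", square), ("rod", rod)]:
    print(name, round(circularity(shape), 3))
```

      The near-circle scores close to 1, the square scores π/4 (about 0.785), and the 4 × 1 rod scores about 0.503. The index falls for elongated shapes as well as for irregular ones, so a low value does not by itself indicate which kind of deviation is present.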

      Reviewer #1 (Significance (Required)):

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

      See our reply above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Ono et al use a variety of imaging and genetic (AID) depletion approaches to examine the roles of condensin II and cohesin in the reformation of interphase genome architecture in human HCT116 cells. Consistent with previous literature, they find that condensin II is required for CENP-A dispersion in late mitosis/early G1. Using in situ FISH at the centromere/q arm of chromosome 12 they then establish that condensin II removal causes lengthwise elongation of chromosomes that, interestingly, can be suppressed by cohesin removal. To better understand changes in whole-chromosome morphology, they then use whole chromosome painting to examine chromosomes 18 and 19. In the absence of condensin II, cells effectively fail to reorganise their chromosomes from rod-like structures into spherical chromosome territories (which may explain why CENP-A dispersion is suppressed). Cohesin is not required for spherical CT formation, suggesting condensin II is the major initial driver of interphase genome structure. Double depletion results in complete disorganisation of chromatin, leading the authors to conclude that a typical cell cycle requires orderly 'handover' from the mitotic to interphase genome organising machinery. The authors then move on to G2 phase, where they use a variety of different FISH probes to assess alterations in chromosome structure at different scales. They thereby establish that perturbation of cohesin or condensin II influences local and longer range chromosome structure, respectively. The effects of condensin II depletion become apparent at a genomic distance of 20 Mb, but are negligible either below or above. The authors repeat the G1 depletion experiment in G2 and now find that condensin II and cohesin are individually dispensable for CT organisation, but that dual depletion causes CT collapse. This rather implies that there is cooperation rather than handover per se.
Overall this study is a broadly informative multiscale investigation of the roles of SMC complexes in organising the genome of postmitotic cells, and solidifies a potential relationship between condensin II and cohesin in coordinating interphase genome structure. The deeper investigation of the roles of condensin II in establishing chromosome territories and intermediate range chromosome structure in particular is a valuable and important contribution, especially given our incomplete understanding of what functions this complex performs during interphase.

      We sincerely appreciate the reviewer’s supportive comments. The reviewer has correctly acknowledged both the current gaps in our understanding of the role of condensin II in interphase chromosome organization and our new findings on the collaborative roles of condensin II and cohesin in establishing and maintaining interphase chromosome territories.

      Major comments:

      In general the claims and conclusions of the manuscript are well supported by multiscale FISH labelling. An important absent control is western blotting to confirm protein depletion levels. Currently only fluorescence is used as a readout for the efficiency of the AID depletion, and we know from prior literature that even small residual quantities of SMC complexes are quite effective in organising chromatin. I would consider a western blot a fairly straightforward and important technical control.

      We would like to explain why we used immunofluorescence measurements to evaluate the efficiency of depletion. In our current protocol for synchronizing at the M-to-G1 transition, ~60% of control and H2-depleted cells, and ~30% of Rad21-depleted and co-depleted cells, are successfully synchronized in G1 phase. The apparently lower synchronization efficiency in the latter two groups is attributable to the well-documented mitotic delay caused by cohesin depletion. From these synchronized populations, early G1 cells were selected based on their characteristic morphologies (see the legend of Fig. 1C). In this way, we analyzed an early G1 cell population that had completed mitosis without chromosome segregation defects. We acknowledge that this represents a technically challenging aspect of M-to-G1 synchronization in HCT116 cells, whose synchronization efficiency is limited compared with that of HeLa cells. Nevertheless, this approach constitutes the most practical strategy currently available. Hence, immunofluorescence provides the only feasible means to evaluate depletion efficiency under these conditions.

      Although immunoblotting can, in principle, be applied to G2-arrested cell populations, we do not believe that information obtained from such experiments would affect the main conclusions of the current study. Please note that we carefully designed and performed all experiments with appropriate controls: H2 depletion, RAD21 depletion, and double depletion, with outcomes confirmed using independent cell lines (Double-AID#2 and Double-AID#3) whenever deemed necessary.

      We fully acknowledge the technical limitations associated with the AID-mediated depletion techniques, which are now described in the section entitled “Limitations of the study” at the end of the Discussion. Nevertheless, we emphasize that these limitations do not compromise the validity of our findings.

      I find the point on handover as a mechanism for maintaining CT architecture somewhat ambiguous, because the authors find that the dependence simply switches from condensin II to both condensin II and cohesin, between G1 and G2. To me this implies augmented cooperation rather than handover. I have two further suggestions, both of which I would strongly recommend but would consider desirable but 'optional' according to review commons guidelines.

      First of all, we would like to clarify a possible misunderstanding regarding the phrase “handover as a mechanism for maintaining CT architecture somewhat ambiguous”. In the original manuscript, we proposed handover as a mechanism for establishing G1 chromosome territories, not for maintaining CTs.

      That said, we take this comment very seriously, especially because Reviewer #1 also expressed the same concern. Please see our reply to Reviewer #1 (Major point).

      In brief, we agree with the reviewer that the word “handover” may not be appropriate to describe the functional relationship between condensin II and cohesin during the M-to-G1 transition. In the revised manuscript, we have avoided the use of the word “handover”, replacing it with “interplay”. It should be emphasized, however, that given their distinct chromosome-binding kinetics, the cooperation of the two SMC complexes during the M-to-G1 transition is qualitatively different from that observed in G2. Therefore, the central conclusion of the present study remains unchanged.

      For example, a sentence in Abstract has been changed as follows:

      a functional interplay between condensin II and cohesin during the mitosis-to-G1 transition is critical for establishing chromosome territories (CTs) in the newly assembling nucleus.

      Firstly, the depletions are performed at different stages of the cell cycle but have different outcomes. The authors suggest this is because handover is already complete, but an alternative possibility is that the phenotype is masked by other changes in chromosome structure (e.g. duplication/catenation). I would be very curious to see, for example, how the outcome of this experiment would change if the authors were to repeat the depletions in the presence of a topoisomerase II inhibitor.

      The reviewer’s suggestion here is somewhat vague, and it is unclear to us what rationale underlies the proposed experiment or what meaningful outcomes could be anticipated. Does the reviewer suggest that we perform topo II inhibitor experiments both during the M-to-G1 transition and in G2 phase, and then compare the outcomes between the two conditions?

      For the M-to-G1 transition, Hildebrand et al. (2024) have already reported such experiments. They used a topo II inhibitor to provide evidence that mitotic chromatids are self-entangled and that the removal of these mitotic entanglements is required to establish a normal interphase nucleus. Our own preliminary experiments (not presented in the current manuscript) showed that ICRF treatment of cells undergoing the M-to-G1 transition did not affect post-mitotic centromere dispersion. The same treatment also had little effect on the suppression of centromere dispersion observed in condensin II-depleted cells.

      Under G2-arrested conditions, because chromosome territories are largely individualized, we would expect topo II inhibition to affect only the extent of sister catenation, which is not the focus of our current study. We anticipate that inhibiting topo II in G2 would have only a marginal effect, if any, on the maintenance of chromosome territories detectable by our current FISH approaches.

      In any case, we consider the suggested experiment to be beyond the scope of the present manuscript, which focuses on the collaborative roles of condensin II and cohesin as revealed by multi-scale FISH analyses.

      Secondly, if the author's claim of handover is correct then one (not exclusive) possibility is that there is a relationship between condensin II and cohesin loading onto chromatin. There does seem to be a modest co-dependence (e.g. fig S4 and S7), could the authors comment on this?

      First of all, we wish to point out the reviewer’s confusion between the G2 experiments and the M-to-G1 experiments. Figs. S4 and S7 concern experiments using G2-arrested cells, not M-to-G1 cells in which a possible handover mechanism is discussed. Based on Fig. 1, in which the extent of depletion in M-to-G1 cells was tested, no evidence of “co-dependence” between H2 depletion and RAD21 depletion was observed.

      That said, as the reviewer correctly points out, we acknowledge the presence of marginal yet statistically significant reductions in the RAD21 signal upon H2 depletion (and vice versa) in G2-arrested cells (Figs. S4 and S7).

      Another control experiment here would be to treat fully WT cells with IAA and test whether non-AID labelled H2 or RAD21 dip in intensity. If they do not, then perhaps there's a causal relationship between condensin II and cohesin levels?

      Following the reviewer’s suggestion, we tested whether IAA treatment causes an unintended decrease in the H2 or RAD21 signals in G2-arrested cells, and found that this is not the case (see the attached figure below).

      Thus, these data indicate that there is a modest functional interdependence between condensin II and cohesin in G2-arrested cells. For instance, condensin II depletion may modestly destabilize chromatin-bound cohesin (and vice versa). However, we note that these effects are minor and do not affect the overall conclusions of the study. In the revised manuscript, we have described these potentially interesting observations briefly as a note in the corresponding figure legends (Fig. S4).

      I recognise this is something considered in Brunner et al 2025 (JCB), but in their case they depleted SMC4 (so all condensins are lost or at least dismantled). Might bear further investigation.

      Methods:

      Data and methods are described in reasonable detail, and a decent number of replicates/statistical analyses have been. Documentation of the cell lines used could be improved. The actual cell line is not mentioned once in the manuscript. Although it is referenced, I'd recommend including the identity of the cell line (HCT116) in the main text when the cells are introduced and also in the relevant supplementary tables. Will make it easier for readers to contextualise the findings.

      We apologize for the omission of important information regarding the parental cell line used in the current study. The information has been added to Materials and Methods as well as the resource table.

      Minor comments:

      Overall the manuscript is well-written and well presented. In the introduction it is suggested that no experiment has established a causal relationship between human condensin II and chromosome territories, but this is not correct, Hoencamp et al 2021 (cell) observed loss of CTs after condensin II depletion. Although that manuscript did not investigate it in as much detail as the present study, the fundamental relationship was previously established, so I would encourage the authors to revise this statement.

      We are somewhat puzzled by this comment. In the original manuscript, we explicitly cited Hoencamp et al (2021) in support of the following sentences:

      (Lines 78-83 in the original manuscript)

      Moreover, high-throughput chromosome conformation capture (Hi-C) analysis revealed that, under such conditions, chromosomes retain a parallel arrangement of their arms, reminiscent of the so-called Rabl configuration (Hoencamp et al., 2021). These findings indicate that the loss or impairment of condensin II during mitosis results in defects in post-mitotic chromosome organization.

      That said, to make the sentences even more precise, we have made the following revision in the manuscript.

      (Lines 78-82 in the revised manuscript)

      Moreover, high-throughput chromosome conformation capture (Hi-C) analysis revealed that, under such conditions, chromosomes retain a parallel arrangement of their arms, reminiscent of the so-called Rabl configuration (Hoencamp et al., 2021). These findings, together with cytological analyses of centromere distributions, indicate that the loss or impairment of condensin II during mitosis results in defects in post-mitotic chromosome organization.

      The following statement was intended to explain our current understanding of the maintenance of chromosome territories. Because Hoencamp et al (2021) did not address the maintenance of CTs, we have kept this sentence unchanged.

      (Lines 100-102 in the original manuscript)

      Despite these findings, there is currently no evidence that either condensin II, cohesin, or their combined action contributes to the maintenance of CT morphology in mammalian interphase cells (Cremer et al., 2020).

      Reviewer #2 (Significance (Required)):

      General assessment:

      Strengths: the multiscale investigation of genome architecture at different stages of interphase allow the authors to present convincing and well-analysed data that provide meaningful insight into local and global chromosome organisation across different scales.

      Limitations:

      As suggested in major comments.

      Advance:

      Although the role of condensin II in generating chromosome territories, and the roles of cohesin in interphase genome architecture are established, the interplay of the complexes and the stage specific roles of condensin II have not been investigated in human cells to the level presented here. This study provides meaningful new insight in particular into the role of condensin II in global genome organisation during interphase, which is much less well understood compared to its participation in mitosis.

      Audience:

      Will contribute meaningfully and be of interest to the general community of researchers investigating genome organisation and function at all stages of the cell cycle. Primary audience will be cell biologists, geneticists and structural biochemists. Importance of genome organisation in cell/organismal biology is such that within this grouping it will probably be of general interest.

      My expertise is in genome organization by SMCs and chromosome segregation.

      We appreciate the reviewer’s supportive comments. As the reviewer fully acknowledges, this study is the first systematic survey of the collaborative role of condensin II and cohesin in establishing and maintaining interphase chromosome territories. In particular, multi-scale FISH analyses have enabled us to clarify how the two SMC protein complexes contribute to the maintenance of G2 chromosome territories through their actions at different genomic scales. As the reviewer notes, we believe that the current study will appeal to a broad readership in cell and chromosome biology. The limitations of the current study mentioned by the reviewer are addressed in our reply above.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The manuscript “Condensin II collaborates with cohesin to establish and maintain interphase chromosome territories" investigates how condensin II and cohesin contribute to chromosome organization during the M-to-G1 transition and in G2 phase using published auxin-inducible degron (AID) cell lines which render the respective protein complexes nonfunctional after auxin addition. In this study, a novel degron cell line was established that enables the simultaneous depletion of both protein complexes, thereby facilitating the investigation of synergistic effects between the two SMC proteins. The chromosome architecture is studied using fluorescence in situ hybridization (FISH) and light microscopy. The authors reproduce a number of already published data and also show that double depletion causes during the M-to-G1 transition defects on chromosome territories, producing expanded, irregular shapes that obscure condensin II-specific phenotypes. Findings in G2 cells point to a new role of condensin II for chromosome conformation at a scale of ~20Mb. Although individual depletion has minimal effects on large-scale CT morphology in G2, combined loss of both complexes produces marked structural abnormalities, including irregular crescent-shaped CTs displaced toward the nucleolus and increased nucleolus-CT contact. The authors propose that condensin II and cohesin act sequentially and complementarily to ensure proper post-mitotic CT formation and maintain chromosome architecture across genomic scales.

      We greatly appreciate the reviewer’s supportive comments. The reviewer has accurately recognized our new findings concerning the collaborative roles of condensin II and cohesin in the establishment and maintenance of interphase chromosome territories.

      Concerns about statistics:

      • The authors provide the information on how many cells are analyzed but not the number of independent experiments. My concern is that there might be variations in synchronization of the cell population and in the subsequent preparation (FISH) affecting the final result. We appreciate the reviewer’s important comment regarding the biological reproducibility of our experiments. As the reviewer correctly points out, variations in cell-cycle synchronization and FISH sample preparation can occur across experiments. To address this concern, we repeated the key experiments supporting our main conclusions (Figs. 3 and 6) two additional times, resulting in three independent biological replicates in total. All replicate experiments reproduced the major observations from the original analyses. These results further substantiated our original conclusion, despite the inevitable variability arising from cell synchronization or sample preparation in this type of experiment. In the revised manuscript, we have now explicitly indicated the number of biological replicates in the corresponding figures.

      The analyses of chromosome-arm conformation shown in Fig. 5 were already performed in three independent rounds of experiments, as noted in the original submission. In addition, similar results were already obtained in other analyses reported in the manuscript. For example, centromere dispersion was quantified using an alternative centromere detection method (related to Fig. 1), and distances between specific chromosomal sites were measured using different locus-specific probes (related to Figs. 2 and 4). In both cases, the results were consistent with those presented in the manuscript.

      • Statistically the authors analyze the effect of cells with induced degron vs. vehicle control (non-induced). However, the biologically relevant question is whether the data differ between cell lines when the degron system is induced. This is not tested here (cf. major concern 2 and 3). See our reply to major concerns 2 and 3.

      • Some journals ask for blinded analysis of the data which might make sense here as manual steps are involved in the data analysis (e.g. line 626 / 627 the convex hull of the signals was manually delineated, line 635 / 636 Chromosome segmentation in FISH images was performed using individual thresholding). However personally I have no doubts on the correctness of the work. We thank the reviewer for pointing out that some steps in our data analysis were performed manually, such as delineating the convex hull of signals and segmenting chromosomes in FISH and IF images using individual thresholds. These manual steps were necessary because signal intensities vary among cells and chromosomes, making fully automated segmentation unreliable. To ensure objectivity, we confirmed that the results were consistent across two independently established double-depletion cell lines, which produced essentially identical findings. In addition, we repeated the key experiments underpinning our main conclusions (Figs. 3 and 6) two additional times, and the results were fully consistent with the original analyses. Therefore, we are confident that our current data analysis approach does not compromise the validity of our conclusions. Finally, we appreciate the reviewer’s kind remark that there is no doubt regarding the correctness of our work.

      Major concerns:

      • Degron induction appears to delay in Rad21-AID#1 and Double-AID#1 cells the transition from M to G1, as shown in Fig. S1. After auxin treatment, more cells exhibit a G2 phenotype than in an untreated population. What are the implications of this for the interpretation of the experiments? In our protocol shown in Fig. 1C, cells were released into mitosis after G2 arrest, and IAA was added 30 min after release. It is well established that cohesin depletion causes a prometaphase delay due to spindle checkpoint activation (e.g., Vass et al, 2003, Curr Biol; Toyoda and Yanagida, 2006, MBoC; Peters et al, 2008, Genes Dev), which explains why cells with 4C DNA content accumulated, as judged by FACS (Fig. S1). The same was true for doubly depleted cells. However, a fraction of cells that escaped this delay progressed through mitosis and entered the G1 phase of the next cell cycle. We selected these early G1 cells and used them for downstream analyses. This experimental procedure was explicitly described in the legends of Fig. 1C and Fig. S1A as follows:

      (Lines 934-937; Legend of Fig. 1C)

      From the synchronized populations, early G1 cells were selected based on their characteristic morphologies (i.e., pairs of small post-mitotic cells) and subjected to downstream analyses. Based on the measured nuclear sizes (Fig. S2 G), we confirmed that early G1 cells were appropriately selected.

      (Lines 1114-1119; Legend of Fig. S1A)

      In this protocol, ~60% of control and H2-depleted cells, and ~30% of Rad21-depleted and co-depleted cells, were successfully synchronized in G1 phase. The apparently lower synchronization efficiency in the latter two groups is attributable to the well documented mitotic delay caused by cohesin depletion (Hauf et al., 2005; Haarhuis et al., 2013; Perea-Resa et al., 2020). From these synchronized populations, early G1 cells were selected based on their characteristic morphologies (see the legend of Fig. 1 C).

      Thus, using this protocol, we analyzed an early G1 cell population that had completed mitosis without chromosome segregation defects. We acknowledge that this represents a technically challenging aspect of synchronizing cell-cycle progression from M to G1 in HCT116 cells, whose synchronization efficiency is limited compared with that of HeLa cells. Nevertheless, this approach constitutes the most practical strategy currently available.

      • Line 178 "In contrast, cohesin depletion had a smaller effect on the distance between the two site-specific probes compared to condensin II depletion (Fig. 2, C and E)." The data in Fig. 2 E show both a significant effect of H2 and a significant effect of RAD21 depletion. Whether the absolute difference in effect size between the two conditions is truly relevant is difficult to determine, as the distribution of the respective control groups also appears to be different. This comment is well taken. Reviewer #1 has made a comment on the same issue. See our reply to Reviewer #1 (Other points, Figure 2E).

      In brief, in the current study, we should focus on the differences between -IAA and +IAA within each cell line, rather than comparing the -IAA conditions across different cell lines. In this sense, a sentence in the original manuscript (lines 178-180) was misleading. In the revised manuscript, we have modified the corresponding and subsequent sentences as follows:

      Although cohesin depletion had a marginal effect on the distance between the two site-specific probes (Fig.2, C and E), double depletion did not result in a significant change (Fig.2, D and E), consistent with the partial restoration of centromere dispersion (Fig. 1G).

      • In Figures 3, S3 and related text in the manuscript I cannot follow the authors' argumentation, as H2 depletion alone leads to a significant increase in the CT area (Chr. 18, Chr. 19, Chr. 15). Similar to Fig. 2, the authors argue about the different magnitude of the effect (H2 depletion vs double depletion). Here, too, appropriate statistical tests or more suitable parameters describing the effect should be used. I also cannot fully follow the argumentation regarding chromosome elongation, as double depletion in Chr. 18 and Chr. 19 also leads to a significantly reduced circularity. Therefore, the schematic drawing Fig. 3 H (double depletion) seems very suggestive to me. This comment is related to the comment above (Major comment #2). See our reply to Reviewer #1 (Other points, Figure 2E).

      It should be noted that, in Figure 3 (unlike in Figure 2), we did not compare the different magnitudes of the effect observed between H2 depletion and double depletion. Thus, the reviewer’s comment that “Similar to Fig. 2, the authors argue about the different magnitude of the effect (H2 depletion vs double depletion)” does not accurately reflect our description.

      Moreover, while the distance between two specific loci (Fig. 2E) and CT circularity (Fig. 3G) are intuitively related, they represent distinct parameters. It is therefore not unexpected that double depletion resulted in apparently different outcomes for the two measurements, and the reviewer’s counter-argument is not strictly applicable here.

      That said, we agree with the reviewer that our descriptions here need to be clarified.

      The differences between H2 depletion and double depletion are two-fold: (1) centromere dispersion is suppressed upon H2 depletion, but not upon double depletion (Fig 1G); (2) the distance between Cen 12 and 12q15 increased upon H2 depletion, but not upon double depletion (Fig 2E).

      We have decided to remove the “homologous pair overlap” panel (formerly Fig. 3E) from the revised manuscript. Accordingly, the corresponding sentence has been deleted from the main text. Instead, we have added a new panel of “aspect ratio”, defined as the ratio of the major to the minor axis (new Fig. 3F). While this intuitive parameter was altered upon condensin II depletion and double depletion, again, we acknowledge that it is not sufficient to convincingly distinguish between the elongated and cloud-like phenotypes proposed in the original manuscript. For these reasons, in the revised manuscript, we have toned down our statements regarding the differences in CT morphology between the two conditions. Nonetheless, together with the data from Figs. 1 and 2, it is clear that the Rabl configuration observed upon condensin II depletion is further exacerbated in the absence of cohesin. Accordingly, we have modified the main text and the cartoon (Fig 3H) to more accurately depict the observations summarized above.
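      As an illustration of the two shape descriptors discussed above, the following minimal sketch computes an aspect ratio (major/minor axis) and a circularity value for synthetic outlines. It is not the authors' analysis pipeline. The moment-based axis estimate, the circularity definition 4πA/P², and all function names are assumptions made for illustration only.

```python
import math

def aspect_ratio(pixels):
    """Major/minor axis ratio from second moments of (x, y) pixel coordinates."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    syy = sum((y - my) ** 2 for _, y in pixels) / n
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    # Closed-form eigenvalues of the 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return math.sqrt((half_trace + root) / (half_trace - root))

def circularity(polygon):
    """4*pi*area / perimeter**2 for a closed polygon of (x, y) vertices (assumed definition)."""
    twice_area = 0.0
    perimeter = 0.0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        twice_area += x0 * y1 - x1 * y0          # shoelace formula
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return 4 * math.pi * (abs(twice_area) / 2) / perimeter ** 2

def ellipse_pixels(a, b):
    """Hypothetical test mask: integer pixels inside an axis-aligned ellipse."""
    return [(x, y) for x in range(-a, a + 1) for y in range(-b, b + 1)
            if (x / a) ** 2 + (y / b) ** 2 <= 1]

def ellipse_outline(a, b, n=360):
    """Hypothetical test outline: n vertices on an axis-aligned ellipse."""
    return [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
            for k in range(n)]

round_ct = circularity(ellipse_outline(10, 10))
elongated_ct = circularity(ellipse_outline(40, 20))
elongated_ratio = aspect_ratio(ellipse_pixels(40, 20))
```

      In this toy case the elongated outline has a lower circularity and an aspect ratio close to 2, but both descriptors summarize overall elongation only and cannot by themselves separate an elongated shape from an irregular, cloud-like one.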

      • Fig. 5 and accompanying text. I agree with the authors that this is a significant and very interesting effect. However, I believe the sharp bends are in most cases an artifact caused by the maximum intensity projection. I tried to illustrate this effect in two photographs: Reviewer Fig. 1, side view, and Reviewer Fig. 2, same situation top view (https://cloud.bio.lmu.de/index.php/s/77npeEK84towzJZ). As I said, in my opinion, there is a significant and important effect; the authors should simply adjust the description. This comment is well taken. We appreciate the reviewer’s effort to help clarify our original observations. We have therefore added a new section entitled “Limitations of the study” to explicitly describe the constraints of our current approach. That said, as the reviewer also acknowledges, our observations remain valid because all experiments were performed with appropriate controls.
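      The reviewer's geometric point can be sketched numerically. The example below is a toy calculation, not data from the study: it treats a projection along z as simply dropping the z coordinate, and the two arm vectors are hypothetical values chosen to show how a roughly right-angle bend in 3D can look like a sharp hairpin in a 2D projection.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def project_xy(vec):
    """Project along the optical (z) axis by discarding the z coordinate."""
    return vec[:2]

# Two arms of a bent fibre meeting at the origin (hypothetical values).
arm_a = (1.0, 0.1, 1.0)
arm_b = (1.0, -0.1, -1.0)

angle_3d = angle_deg(arm_a, arm_b)                          # about 90 degrees
angle_2d = angle_deg(project_xy(arm_a), project_xy(arm_b))  # about 11 degrees
```

      The opening angle between the arms shrinks from about 90° to about 11° once the z component is discarded, which is why bend sharpness measured on projected images depends on how the structure is oriented relative to the optical axis.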

      Minor concerns:

      • I would like to suggest proactively discussing possible artifacts that may arise from the harsh conditions during FISH sample preparation. We fully agree with the reviewer’s concerns. For FISH sample preparation, we used relatively harsh conditions, including (1) fixation under a hypotonic condition (0.3x PBS), (2) HCl treatment, and (3) a denaturation step. We recognize that these procedures inevitably affect the preservation of the original structure; however, they are unavoidable in the standard FISH protocol. We also acknowledge that our analyses were limited to 2D structures based on projected images, rather than full 3D reconstructions. These technical limitations are now explicitly described in a new section entitled “Limitations of the study”, and the technical details are provided in Materials and Methods.

      • It would be helpful if the authors could provide the original data (microscopic image stacks) for download. We thank the reviewer for this suggestion and understand that providing the original image stacks could be of interest to readers. We agree that if the nuclei were perfectly spherical, as is the case for example in lymphocytes, 3D image stacks would contain much more information than 2D projections. However, as is typical for adherent cultured cells, including the HCT116-derived cells used in this study, the nuclei are flattened due to cell adhesion to the culture dish, with a thickness of only about one-tenth of the nuclear diameter (10–20 μm). Considering also the inevitable loss of structural preservation during FISH sample preparation, we were concerned that presenting 3D images might confuse rather than clarify. We therefore believe that representing the data as 2D projections, while explicitly acknowledging the technical limitations, provides the clearest and most interpretable presentation of our results. These limitations are now described in a new section of the manuscript.

      • The authors use a blind deconvolution algorithm to improve image quality. It might be helpful to test other methods for this purpose (optional). We thank the reviewer for this valuable suggestion and fully agree that it is a valid point. We recognize that alternative image enhancement methods can offer advantages, particularly for smaller structures or when multiple probes are analyzed simultaneously. In our study, however, the focus was on detecting whole chromosome territories (CTs) and specific chromosomal loci, which can be visualized clearly with our current FISH protocol combined with blind deconvolution. We therefore believe that the image quality we obtained is sufficient to support the conclusions of this manuscript.

      Reviewer #3 (Significance (Required)):

      Advance:

      Ono et al. address the important question of how the complex pattern of chromatin is reestablished after mitosis and maintained during interphase. In addition to affinity interactions (1,2), it is known that cohesin plays an important role in the formation and maintenance of chromosome organization in interphase (3). However, current knowledge does not explain all known phenomena. Even with complete loss of cohesin, TAD-like structures can be recognized at the single-cell level (4), and higher structures such as chromosome territories are also retained (5). The function of condensin II during mitosis is another important factor that affects chromosome architecture in the following G1 phase (6). Although condensin II is present in the cell nucleus throughout interphase, very little is known about the role of this protein in this phase of the cell cycle. This is where the present publication comes in, with a new double degron cell line in which essential subunits of cohesin AND condensin can be degraded in a targeted manner. I find the data from the experiments in the G2 phase most interesting, as they suggest a previously unknown involvement of condensin II in the maintenance of larger chromatin structures such as chromosome territories.

      The experiments regarding the M-G1 transition are less interesting to me, as it is known that condensin II deficiency in mitosis leads to elongated chromosomes (Rabl configuration)(6), and therefore the double degradation of condensin II and cohesin describes the effects of cohesin on an artificially disturbed chromosome structure.

      For further clarification, we provide below a table summarizing previous studies relevant to the present work. We wish to emphasize three novel aspects of the present study. First, newly established cell lines designed for double depletion enabled us to address questions that had remained inaccessible in earlier studies. Second, to our knowledge, no study has previously reported condensin II depletion, cohesin depletion and double depletion in G2-arrested cells. Third, the present study represents the first systematic comparison of two different stages of the cell cycle using multiscale FISH under distinct depletion conditions. Although the M-to-G1 part of the present study partially overlaps with previous work, it serves as an important prelude to the subsequent investigations. We are confident that the reviewer will also acknowledge this point.

      | cell cycle | cond II depletion | cohesin depletion | double depletion |
      |---|---|---|---|
      | M-to-G1 | Hoencamp et al (2021); Abramo et al (2019); Brunner et al (2025); this study | Schwarzer et al (2017); Wutz et al (2017); this study | this study |
      | G2 | this study | this study | this study |

      Hoencamp et al (2021): Hi-C and imaging (CENP-A distribution)

      Abramo et al (2019): Hi-C and imaging

      Brunner et al (2025): mostly imaging (chromatin tracing)

      Schwarzer et al (2017); Wutz et al (2017): Hi-C

      this study: imaging (multi-scale FISH)

      General limitations:

      (1) Single cell imaging of chromatin structure typically shows only minor effects which are often obscured by the high (biological) variability. This holds also true for the current manuscript (cf. major concern 2 and 3).

      See our reply above.

      (2) A common concern are artefacts introduced by the harsh conditions of conventional FISH protocols (7). The authors use a method in which the cells are completely dehydrated, which probably leads to shrinking artifacts. However, differences between samples stained using the same FISH protocol are most likely due to experimental variation and not an artefact (cf. minor concern 1).

      See our reply above.

      (3) The anisotropic optical resolution (x-, y- vs. z-) of widefield microscopy (and most other light microscopic techniques) might lead to misinterpretation of the imaged 3D structures. This seems to be the case in the current study (cf. major concern 4). See our reply above.

      (4) In the present study, the cell cycle was synchronized. This requires the use of inhibitors such as the CDK1 inhibitor RO-3306. However, CDK1 has many very different functions (8), so unexpected effects on the experiments cannot be ruled out. The current approaches involving FISH inevitably require cell cycle synchronization. We believe that the use of the CDK1 inhibitor RO-3306 to arrest the cell cycle at G2 is a reasonable choice, although we cannot rule out unexpected effects arising from the use of the drug. This issue has now been addressed in the new section entitled “Limitations of the study”.

      Audience:

      The spatial arrangement of genomic elements in the nucleus and their (temporal) dynamics are of high general relevance, as they are important for answering fundamental questions, for example, in epigenetics or tumor biology (9,10). The manuscript from Ono et al. addresses specific questions, so its intended readership is more likely to be specialists in the field.

      We are confident that, given the increasing interest in the 3D genome and its role in regulating diverse biological functions, the current manuscript will attract the broad readership of leading journals in cell biology.

      About the reviewer:

      By training I'm a biologist with strong background in fluorescence microscopy and fluorescence in situ hybridization. In recent years, I have been involved in research on the 3D organization of the cell nucleus, chromatin organization, and promoter-enhancer interactions.

      We greatly appreciate the reviewer’s constructive comments on both the technical strengths and limitations of our fluorescence imaging approaches, which have been very helpful in revising the manuscript. As mentioned above, we have decided to add a special paragraph entitled “Limitations of the study” at the end of the Discussion section to discuss these issues.

      All questions regarding the statistics of angularly distributed data are beyond my expertise. The authors do not correct their statistical analyses for "multiple testing". Whether this is necessary, I cannot judge.

      We thank the reviewer for raising this important point. In our study, the primary comparisons were made between -IAA and +IAA conditions within the same cell line. Accordingly, the figures report P-values for these pairwise comparisons.

      For the distance measurements, statistical evaluations were performed in PRISM using ANOVA (Kruskal–Wallis test), and the P-values shown in the figures are based on these analyses (Fig. 1, G and H; Fig. 2 E; Fig. 3 F and G; Fig. 4 F; Fig. 6 F [right]–H; Fig. S2 B and G; Fig. S3 D and H; Fig. S5 A [right] and B [right]; Fig. S8 B). While the manuscript focuses on pairwise comparisons between -IAA and +IAA conditions within the same cell line, we also considered potential differences across cell lines as part of the same ANOVA framework, thereby ensuring that multiple testing was properly addressed. Because cell line differences are not the focus of the present study, the corresponding results are not shown.
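      For readers unfamiliar with what a multiple-testing correction does to a set of pairwise P-values, the sketch below implements the Holm step-down adjustment. This is a generic illustration only: it is not stated to be the procedure applied in PRISM for this study, and the input P-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P-values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw P-values from four pairwise comparisons.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = holm_adjust(raw_p)
```

      With these inputs, two of the four comparisons remain below 0.05 after adjustment, which shows how a family-wise correction can change which pairwise differences are called significant.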

      For the angular distribution analyses, we compared -IAA and +IAA conditions within the same cell line using the Mardia–Watson–Wheeler test; these analyses do not involve multiple testing (circular scatter plots; Fig. 5 C–E and Fig. S6 B, C, and E–H). In addition, to determine whether angular distributions exhibited directional bias under each condition, we applied the Rayleigh test to each dataset individually (Fig. 5 F and Fig. S6 I). As these tests were performed on a single condition, they are also not subject to the problem of multiple testing. Collectively, we consider that the statistical analyses presented in our manuscript appropriately account for potential multiple testing issues, and we remain confident in the robustness of the results.
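      The Rayleigh test mentioned above asks whether a set of angles shows a preferred direction. The following is a minimal sketch, assuming angles in radians and using a widely used closed-form approximation for the P-value; it is not the authors' code, and the two example datasets are synthetic.

```python
import math

def rayleigh_test(angles_rad):
    """Return (mean resultant length, Rayleigh Z, approximate P-value)."""
    n = len(angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    s = sum(math.sin(a) for a in angles_rad)
    resultant = math.hypot(c, s)        # length of the summed unit vectors
    mean_length = resultant / n         # 0 = no directional bias, 1 = identical angles
    z = n * mean_length ** 2
    # Closed-form approximation to the P-value (an assumption of this sketch).
    p = math.exp(math.sqrt(1 + 4 * n + 4 * (n ** 2 - resultant ** 2)) - (1 + 2 * n))
    return mean_length, z, min(1.0, p)

# Synthetic data: evenly spread angles versus angles clustered around zero.
uniform_angles = [2 * math.pi * k / 24 for k in range(24)]
clustered_angles = [0.1 * k for k in range(-5, 6)]

r_uniform, z_uniform, p_uniform = rayleigh_test(uniform_angles)
r_clustered, z_clustered, p_clustered = rayleigh_test(clustered_angles)
```

      Evenly spread angles give a mean resultant length near 0 and a P-value near 1, whereas the clustered set gives a length near 1 and a very small P-value. Because the test is run separately on each dataset, each result addresses a single null hypothesis of uniformity.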

      Literature

      1. Falk, M., Feodorova, Y., Naumova, N., Imakaev, M., Lajoie, B.R., Leonhardt, H., Joffe, B., Dekker, J., Fudenberg, G., Solovei, I. et al. (2019) Heterochromatin drives compartmentalization of inverted and conventional nuclei. Nature, 570, 395-399.
      2. Mirny, L.A., Imakaev, M. and Abdennur, N. (2019) Two major mechanisms of chromosome organization. Curr Opin Cell Biol, 58, 142-152.
      3. Rao, S.S.P., Huang, S.C., Glenn St Hilaire, B., Engreitz, J.M., Perez, E.M., Kieffer-Kwon, K.R., Sanborn, A.L., Johnstone, S.E., Bascom, G.D., Bochkov, I.D. et al. (2017) Cohesin Loss Eliminates All Loop Domains. Cell, 171, 305-320 e324.
      4. Bintu, B., Mateo, L.J., Su, J.H., Sinnott-Armstrong, N.A., Parker, M., Kinrot, S., Yamaya, K., Boettiger, A.N. and Zhuang, X. (2018) Super-resolution chromatin tracing reveals domains and cooperative interactions in single cells. Science, 362.
      5. Cremer, M., Brandstetter, K., Maiser, A., Rao, S.S.P., Schmid, V.J., Guirao-Ortiz, M., Mitra, N., Mamberti, S., Klein, K.N., Gilbert, D.M. et al. (2020) Cohesin depleted cells rebuild functional nuclear compartments after endomitosis. Nat Commun, 11, 6146.
      6. Hoencamp, C., Dudchenko, O., Elbatsh, A.M.O., Brahmachari, S., Raaijmakers, J.A., van Schaik, T., Sedeno Cacciatore, A., Contessoto, V.G., van Heesbeen, R., van den Broek, B. et al. (2021) 3D genomics across the tree of life reveals condensin II as a determinant of architecture type. Science, 372, 984-989.
      7. Beckwith, K.S., Ødegård-Fougner, Ø., Morero, N.R., Barton, C., Schueder, F., Tang, W., Alexander, S., Peters, J.-M., Jungmann, R., Birney, E. et al. (2023) Nanoscale 3D DNA tracing in single human cells visualizes loop extrusion directly in situ. BioRxiv https://doi.org/10.1101/2021.04.12.439407.
      8. Massacci, G., Perfetto, L. and Sacco, F. (2023) The Cyclin-dependent kinase 1: more than a cell cycle regulator. Br J Cancer, 129, 1707-1716.
      9. Bonev, B. and Cavalli, G. (2016) Organization and function of the 3D genome. Nat Rev Genet, 17, 661-678.
      10. Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O'Shea, C.C., Park, P.J., Ren, B. et al. (2017) The 4D nucleome project. Nature, 549, 219-226.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript “Condensin II collaborates with cohesin to establish and maintain interphase chromosome territories” investigates how condensin II and cohesin contribute to chromosome organization during the M-to-G1 transition and in G2 phase using published auxin-inducible degron (AID) cell lines which render the respective protein complexes nonfunctional after auxin addition. In this study, a novel degron cell line was established that enables the simultaneous depletion of both protein complexes, thereby facilitating the investigation of synergistic effects between the two SMC proteins. The chromosome architecture is studied using fluorescence in situ hybridization (FISH) and light microscopy. The authors reproduce a number of already published findings and also show that double depletion during the M-to-G1 transition causes defects in chromosome territories, producing expanded, irregular shapes that obscure condensin II-specific phenotypes. Findings in G2 cells point to a new role of condensin II for chromosome conformation at a scale of ~20Mb. Although individual depletion has minimal effects on large-scale CT morphology in G2, combined loss of both complexes produces marked structural abnormalities, including irregular crescent-shaped CTs displaced toward the nucleolus and increased nucleolus-CT contact. The authors propose that condensin II and cohesin act sequentially and complementarily to ensure proper post-mitotic CT formation and maintain chromosome architecture across genomic scales.

      Concerns about statistics:

      (1) The authors provide the information on how many cells are analyzed but not the number of independent experiments. My concern is that there might be variations in synchronization of the cell population and in the subsequent preparation (FISH) affecting the final result.

      (2) Statistically the authors analyze the effect of cells with induced degron vs. vehicle control (non-induced). However, the biologically relevant question is whether the data differ between cell lines when the degron system is induced. This is not tested here (cf. major concern 2 and 3).

      (3) Some journals ask for blinded analysis of the data, which might make sense here as manual steps are involved in the data analysis (e.g. line 626 / 627 the convex hull of the signals was manually delineated, line 635 / 636 Chromosome segmentation in FISH images was performed using individual thresholding). However, personally I have no doubts on the correctness of the work.

      Major concerns:

      (1) Degron induction appears to delay the transition from M to G1 in Rad21-AID#1 and Double-AID#1 cells, as shown in Fig. S1. After auxin treatment, more cells exhibit a G2 phenotype than in an untreated population. What are the implications of this for the interpretation of the experiments?

      (2) Line 178 "In contrast, cohesin depletion had a smaller effect on the distance between the two site-specific probes compared to condensin II depletion (Fig. 2, C and E)." The data in Fig. 2 E show both a significant effect of H2 and a significant effect of RAD21 depletion. Whether the absolute difference in effect size between the two conditions is truly relevant is difficult to determine, as the distribution of the respective control groups also appears to be different.

      (3) In Figures 3, S3 and related text in the manuscript I cannot follow the authors' argumentation, as H2 depletion alone leads to a significant increase in the CT area (Chr. 18, Chr. 19, Chr. 15). Similar to Fig. 2, the authors argue about the different magnitude of the effect (H2 depletion vs double depletion). Here, too, appropriate statistical tests or more suitable parameters describing the effect should be used. I also cannot fully follow the argumentation regarding chromosome elongation, as double depletion in Chr. 18 and Chr. 19 also leads to a significantly reduced circularity. Therefore, the schematic drawing Fig. 3 H (double depletion) seems very suggestive to me.

      (4) Fig. 5 and accompanying text. I agree with the authors that this is a significant and very interesting effect. However, I believe the sharp bends are in most cases an artifact caused by the maximum intensity projection. I tried to illustrate this effect in two photographs: Reviewer Fig. 1, side view, and Reviewer Fig. 2, same situation top view (https://cloud.bio.lmu.de/index.php/s/77npeEK84towzJZ). As I said, in my opinion, there is a significant and important effect; the authors should simply adjust the description.

      Minor concerns:

      (1) I would like to suggest proactively discussing possible artifacts that may arise from the harsh conditions during FISH sample preparation.

      (2) It would be helpful if the authors could provide the original data (microscopic image stacks) for download.

      (3) The authors use a blind deconvolution algorithm to improve image quality. It might be helpful to test other methods for this purpose (optional).

      Significance

      Advance:

      Ono et al. address the important question of how the complex pattern of chromatin is reestablished after mitosis and maintained during interphase. In addition to affinity interactions (1,2), it is known that cohesin plays an important role in the formation and maintenance of chromosome organization in interphase (3). However, current knowledge does not explain all known phenomena. Even with complete loss of cohesin, TAD-like structures can be recognized at the single-cell level (4), and higher structures such as chromosome territories are also retained (5). The function of condensin II during mitosis is another important factor that affects chromosome architecture in the following G1 phase (6). Although condensin II is present in the cell nucleus throughout interphase, very little is known about the role of this protein in this phase of the cell cycle. This is where the present publication comes in, with a new double degron cell line in which essential subunits of cohesin AND condensin can be degraded in a targeted manner. I find the data from the experiments in the G2 phase most interesting, as they suggest a previously unknown involvement of condensin II in the maintenance of larger chromatin structures such as chromosome territories. The experiments regarding the M-G1 transition are less interesting to me, as it is known that condensin II deficiency in mitosis leads to elongated chromosomes (Rabl configuration)(6), and therefore the double degradation of condensin II and cohesin describes the effects of cohesin on an artificially disturbed chromosome structure.

      General limitations:

      (1) Single cell imaging of chromatin structure typically shows only minor effects which are often obscured by the high (biological) variability. This holds also true for the current manuscript (cf. major concern 2 and 3).

      (2) A common concern are artefacts introduced by the harsh conditions of conventional FISH protocols (7). The authors use a method in which the cells are completely dehydrated, which probably leads to shrinking artifacts. However, differences between samples stained using the same FISH protocol are most likely due to experimental variation and not an artefact (cf. minor concern 1).

      (3) The anisotropic optical resolution (x-, y- vs. z-) of widefield microscopy (and most other light microscopic techniques) might lead to misinterpretation of the imaged 3D structures. This seems to be the case in the current study (cf. major concern 4).

      (4) In the present study, the cell cycle was synchronized. This requires the use of inhibitors such as the CDK1 inhibitor RO-3306. However, CDK1 has many very different functions (8), so unexpected effects on the experiments cannot be ruled out.

      Audience:

      The spatial arrangement of genomic elements in the nucleus and their (temporal) dynamics are of high general relevance, as they are important for answering fundamental questions, for example, in epigenetics or tumor biology (9,10). The manuscript from Ono et al. addresses specific questions, so its intended readership is more likely to be specialists in the field.

      About the reviewer: By training I'm a biologist with strong background in fluorescence microscopy and fluorescence in situ hybridization. In recent years, I have been involved in research on the 3D organization of the cell nucleus, chromatin organization, and promoter-enhancer interactions.

      All questions regarding the statistics of angularly distributed data are beyond my expertise. The authors do not correct their statistical analyses for "multiple testing". Whether this is necessary, I cannot judge.

      Literature

      1. Falk, M., Feodorova, Y., Naumova, N., Imakaev, M., Lajoie, B.R., Leonhardt, H., Joffe, B., Dekker, J., Fudenberg, G., Solovei, I. et al. (2019) Heterochromatin drives compartmentalization of inverted and conventional nuclei. Nature, 570, 395-399.
      2. Mirny, L.A., Imakaev, M. and Abdennur, N. (2019) Two major mechanisms of chromosome organization. Curr Opin Cell Biol, 58, 142-152.
      3. Rao, S.S.P., Huang, S.C., Glenn St Hilaire, B., Engreitz, J.M., Perez, E.M., Kieffer-Kwon, K.R., Sanborn, A.L., Johnstone, S.E., Bascom, G.D., Bochkov, I.D. et al. (2017) Cohesin Loss Eliminates All Loop Domains. Cell, 171, 305-320 e324.
      4. Bintu, B., Mateo, L.J., Su, J.H., Sinnott-Armstrong, N.A., Parker, M., Kinrot, S., Yamaya, K., Boettiger, A.N. and Zhuang, X. (2018) Super-resolution chromatin tracing reveals domains and cooperative interactions in single cells. Science, 362.
      5. Cremer, M., Brandstetter, K., Maiser, A., Rao, S.S.P., Schmid, V.J., Guirao-Ortiz, M., Mitra, N., Mamberti, S., Klein, K.N., Gilbert, D.M. et al. (2020) Cohesin depleted cells rebuild functional nuclear compartments after endomitosis. Nat Commun, 11, 6146.
      6. Hoencamp, C., Dudchenko, O., Elbatsh, A.M.O., Brahmachari, S., Raaijmakers, J.A., van Schaik, T., Sedeno Cacciatore, A., Contessoto, V.G., van Heesbeen, R., van den Broek, B. et al. (2021) 3D genomics across the tree of life reveals condensin II as a determinant of architecture type. Science, 372, 984-989.
      7. Beckwith, K.S., Ødegård-Fougner, Ø., Morero, N.R., Barton, C., Schueder, F., Tang, W., Alexander, S., Peters, J.-M., Jungmann, R., Birney, E. et al. (2023) Nanoscale 3D DNA tracing in single human cells visualizes loop extrusion directly in situ. BioRxiv https://doi.org/10.1101/2021.04.12.439407.
      8. Massacci, G., Perfetto, L. and Sacco, F. (2023) The Cyclin-dependent kinase 1: more than a cell cycle regulator. Br J Cancer, 129, 1707-1716.
      9. Bonev, B. and Cavalli, G. (2016) Organization and function of the 3D genome. Nat Rev Genet, 17, 661-678.
      10. Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O'Shea, C.C., Park, P.J., Ren, B. et al. (2017) The 4D nucleome project. Nature, 549, 219-226.
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      • Ono et al use a variety of imaging and genetic (AID) depletion approaches to examine the roles of condensin II and cohesin in the reformation of interphase genome architecture in human HCT116 cells. Consistent with previous literature, they find that condensin II is required for CENP-A dispersion in late mitosis/early G1. Using in situ FISH at the centromere/q arm of chromosome 12 they then establish that condensin II removal causes lengthwise elongation of chromosomes that, interestingly, can be suppressed by cohesin removal. To better understand changes in whole-chromosome morphology, they then use whole chromosome painting to examine chromosomes 18 and 19. In the absence of condensin II, cells effectively fail to reorganise their chromosomes from rod-like structures into spherical chromosome territories (which may explain why CENP-A dispersion is suppressed). Cohesin is not required for spherical CT formation, suggesting condensin II is the major initial driver of interphase genome structure. Double depletion results in complete disorganisation of chromatin, leading the authors to conclude that a typical cell cycle requires orderly 'handover' from the mitotic to interphase genome organising machinery.

      • The authors then move on to G2 phase, where they use a variety of different FISH probes to assess alterations in chromosome structure at different scales. They thereby establish that perturbation of cohesin or condensin II influences local and longer range chromosome structure, respectively. The effects of condensin II depletion become apparent at a genomic distance of 20 Mb, but are negligible either below or above. The authors repeat the G1 depletion experiment in G2 and now find that condensin II and cohesin are individually dispensable for CT organisation, but that dual depletion causes CT collapse. This rather implies that there is cooperation rather than handover per se.

      • Overall this study is a broadly informative multiscale investigation of the roles of SMC complexes in organising the genome of postmitotic cells, and solidifies a potential relationship between condensin II and cohesin in coordinating interphase genome structure. The deeper investigation of the roles of condensin II in establishing chromosome territories and intermediate range chromosome structure in particular is a valuable and important contribution, especially given our incomplete understanding of what functions this complex performs during interphase.

      Major comments:

      • In general the claims and conclusions of the manuscript are well supported by multiscale FISH labelling. An important absent control is western blotting to confirm protein depletion levels. Currently only fluorescence is used as a readout for the efficiency of the AID depletion, and we know from prior literature that even small residual quantities of SMC complexes are quite effective in organising chromatin. I would consider a western blot a fairly straightforward and important technical control.

      • I find the point on handover as a mechanism for maintaining CT architecture somewhat ambiguous, because the authors find that the dependence simply switches from condensin II to both condensin II and cohesin, between G1 and G2. To me this implies augmented cooperation rather than handover.

      • I have two further suggestions, both of which I would strongly recommend but which I consider 'optional' according to Review Commons guidelines.

      Firstly, the depletions are performed at different stages of the cell cycle but have different outcomes. The authors suggest this is because handover is already complete, but an alternative possibility is that the phenotype is masked by other changes in chromosome structure (e.g. duplication/catenation). I would be very curious to see, for example, how the outcome of this experiment would change if the authors were to repeat the depletions in the presence of a topoisomerase II inhibitor.

      Secondly, if the authors' claim of handover is correct then one (not exclusive) possibility is that there is a relationship between condensin II and cohesin loading onto chromatin. There does seem to be a modest co-dependence (e.g. fig S4 and S7); could the authors comment on this? Another control experiment here would be to treat fully WT cells with IAA and test whether non-AID labelled H2 or RAD21 dip in intensity. If they do not, then perhaps there's a causal relationship between condensin II and cohesin levels?

      • I recognise this is something considered in Brunner et al 2025 (JCB), but in their case they depleted SMC4 (so all condensins are lost or at least dismantled). Might bear further investigation.

      Methods:

      Data and methods are described in reasonable detail, and a decent number of replicates/statistical analyses have been performed. Documentation of the cell lines used could be improved. The actual cell line is not mentioned once in the manuscript. Although it is referenced, I'd recommend including the identity of the cell line (HCT116) in the main text when the cells are introduced and also in the relevant supplementary tables. This will make it easier for readers to contextualise the findings.

      Minor comments:

      Overall the manuscript is well-written and well presented. In the introduction it is suggested that no experiment has established a causal relationship between human condensin II and chromosome territories, but this is not correct: Hoencamp et al 2021 (Cell) observed loss of CTs after condensin II depletion. Although that manuscript did not investigate it in as much detail as the present study, the fundamental relationship was previously established, so I would encourage the authors to revise this statement.

      Significance

      General assessment: Strengths: the multiscale investigation of genome architecture at different stages of interphase allows the authors to present convincing and well-analysed data that provide meaningful insight into local and global chromosome organisation across different scales. Limitations: As suggested in major comments.

      Advance: Although the role of condensin II in generating chromosome territories, and the roles of cohesin in interphase genome architecture are established, the interplay of the complexes and the stage specific roles of condensin II have not been investigated in human cells to the level presented here. This study provides meaningful new insight in particular into the role of condensin II in global genome organisation during interphase, which is much less well understood compared to its participation in mitosis.

      Audience: Will contribute meaningfully and be of interest to the general community of researchers investigating genome organisation and function at all stages of the cell cycle. Primary audience will be cell biologists, geneticists and structural biochemists. Importance of genome organisation in cell/organismal biology is such that within this grouping it will probably be of general interest.

      My expertise is in genome organization by SMCs and chromosome segregation.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

      Major point:

      They propose a functional 'handover' from condensin II to cohesin, for the organization of CTs at the M-to-G1 transition. However, the 'handover', i.e. difference in timing of executing their functions, was not experimentally substantiated. Ideally, they can deplete condensin II and cohesin at different times to prove the 'handover'. However, this would require the use of two different degron tags and go beyond the revision of this manuscript. At least, based on the literature, the authors should discuss why they think condensin II and cohesin should work at different timings in the CT organization.

      Other points:

      • Figure 2E: It seems that the chromosome length without IAA is shorter in Rad21-aid cells than H2-aid cells or H2-aid Rad21-aid cells. How can this be interpreted?

      • Figure 3: Regarding the CT morphology, could they explain further the difference between 'elongated' and 'cloud-like (expanded)'? Is it possible to quantify the frequency of these morphologies?

      • Figure 5: How did they assign C, P and D3 for two chromosomes? The assignment seems obvious in some cases, but not in other cases (e.g. in the image of H2-AID#2 +IAA, two D3s can be connected to two Ps in the other way). They may have avoided line crossing between two C-P-D3 assignments, but can this be justified when the CT might be disorganized e.g. by condensin II depletion?

      • Figure 6F: The mean is not indicated on the right-hand side graph, in contrast to other similar graphs. Is this an error?

      • Figure S1A: The two FACS profiles for Double-AID #3 Release-2 may be mixed up between -IAA and +IAA.

      • The method section explains that 'circularity' shows 'how closely the shape of an object approximates a perfect circle (with a value of 1 indicating a perfect circle), calculated from the segmented regions'. It would be helpful to provide further methodological details about it.
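
      For reference, the kind of detail requested could look like the following minimal sketch. It assumes the common definition circularity = 4π × area / perimeter² (the one used by ImageJ/Fiji); the manuscript's own formula is not quoted here, so this is an assumption, and the polygon outlines below are hypothetical shapes rather than segmented data.

      ```python
      import math

      def polygon_area_perimeter(points):
          """Return (area, perimeter) of a closed polygon given as (x, y) vertices."""
          twice_area = 0.0
          perimeter = 0.0
          n = len(points)
          for i in range(n):
              x0, y0 = points[i]
              x1, y1 = points[(i + 1) % n]
              twice_area += x0 * y1 - x1 * y0          # shoelace formula term
              perimeter += math.hypot(x1 - x0, y1 - y0)
          return abs(twice_area) / 2.0, perimeter

      def circularity(points):
          """Assumed definition: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
          area, perimeter = polygon_area_perimeter(points)
          return 4.0 * math.pi * area / perimeter ** 2

      def regular_polygon(n, radius=1.0):
          """Hypothetical near-circular outline with n vertices."""
          return [(radius * math.cos(2 * math.pi * k / n),
                   radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

      near_circle = regular_polygon(360)               # compact, territory-like shape
      square = [(0, 0), (1, 0), (1, 1), (0, 1)]
      rod = [(0, 0), (10, 0), (10, 1), (0, 1)]         # elongated, rod-like shape

      for name, shape in [("near_circle", near_circle), ("square", square), ("rod", rod)]:
          print(name, round(circularity(shape), 3))
      ```

      With this definition an elongated outline scores lower than a compact one, but the value also depends on how the perimeter of a pixelated mask is estimated, which is why the segmentation and perimeter method are worth stating.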

      Significance

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Summary: In this manuscript, Turner AH. et al. examined viral replication in cells depleted of the Rab11B small GTPase, which is a paralogue of Rab11A. It has been reported that Rab11A is responsible for the intracellular transport of viral RNP via recycling endosomes. The authors showed that Rab11B knockdown reduced the viral protein expression and viral titer. This may be caused by reduced attachment of viral particles on Rab11B knockdown cells.

      Major comments:

      Comment 1 Fig 2-4: The authors should provide Western blot results with equal amount of loading control (GAPDH). The bands shown in these figures lack quantifiability and are not reliable as data.

      We have rerun these western blots with more equal loading, and included a second loading control (beta-actin) in addition to the GAPDH. These blots can be seen in new Figures 2 and 3, and the quantification against both GAPDH (Figure 2/3) as well as actin (Fig S2) is now included. We have also included additional biological replicates for Fig 2 B-D. These additional experiments have strengthened our conclusion that Rab11B is required for efficient protein production in cells infected with recent H3N2, but not H1N1, isolates.

      Comment 2 Fig 2-4: Why are the results different between Rab11B knockdown alone and Rab11A/B double knockdown? If the authors' claims are correct, the results of Rab11B knockdown should be reproducible in Rab11A/B double knockdown cells.

      Prior literature indicates that the Rab11A and Rab11B isoforms can play opposing roles in the trafficking of some cargos (ie, with one isoform transporting a molecule to the cell surface, while the other isoform takes it off again). In this scenario, it is possible that removing both 'halves' of the trafficking loop can ablate a phenotype. However, since our double knockdown used half the amount of siRNA for each isoform (for the same total amount), it is also possible this observation is simply the result of less efficient knockdown. In order to distinguish between these possibilities we depleted Rab11A or Rab11B individually, with this same 'half dose' of siRNA (see new Figure S3). We observed that Rab11B was still robustly required for H3N2 viral protein production. These results suggest that Rab11A and Rab11B could be playing mutually opposing roles in this case, which is consistent with prior Rab11 literature.

      Comment 3 Fig 6: For better understanding, please provide a schematic illustration of experimental setting.

      We have added a new graphical overview to this figure (see new Figure 6A).

      Comment 4: It is necessary to test other siRNA sequences or perform a rescue experiment by expressing an siRNA-resistant clone in the knockdown cells. There seems to be an activation of host defense system, such as IFN pathways.

      In order to rule out the possibility of off-target effects we created a novel cell line that inducibly expresses a Rab11B shRNA sequence (see new Fig 4). This knockdown strategy used a completely different method (shRNA delivered by lentiviral vector vs transient transfection of siRNA), in a different cellular background (H441 "club like" cells vs A549 lung adenocarcinoma). This new depletion strategy showed that the Rab11B dependent H3N2 protein production phenotype is seen across multiple knockdown strategies and cellular backgrounds.

      **Referees cross-commenting**

      I agree with other reviewers' comments in part.

      Reviewer #1 (Significance (Required)):

      The authors propose a novel role for Rab11B in modulating attachment pathway of H3N2 influenza A virus by unknown mechanism. Although previous studies focus on the function of Rab11A on endocytic transport, the function and specificity of Rab11B has remained less clear. The findings may be of interest to a broad audience, including researchers in cell biology, immunology, and host-pathogen interactions. However, the study remains at a superficial level of analysis and does not lead to a deeper understanding of the underlying mechanisms.

      We agree with the reviewer that a strength of this manuscript is its multi-disciplinary nature, particularly with regard to advances in our understanding of Rab11B function. We have added a significant number of experiments and new figures to bolster the rigor and reproducibility of our findings. We have also added a new figure (Fig 7) that uses reverse genetics to map the Rab11B phenotype to the HA gene of the H3N2 isolate under study. By creating '7+1' reassortant viruses with the H3 HA or the N2 NA on a PR8 (H1N1) background (see Fig 7E-H) we were able to demonstrate that Rab11B is acting specifically on one of the HA-mediated entry steps. This provides additional mechanistic insight, by mapping the Rab11B-phenotype to a step at or prior to fusion. Fundamentally, we believe the novelty and rigor of our observation that recent H3N2 viruses enter through a different route than H1N1 isolates is worthy of observation in this updated form, so that the field can begin follow up studies.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Summary: The authors compare the effect of RAB11A and RAB11B knockdown on replication of contemporary H1N1 and H3N2 influenza A virus strains in A549 cells (human lung epithelial cells). They find a reduction in viral protein expression for tested H3N2 but not for H1N1 isolates. Mechanistically they suggest that RAB11A affects virion attachment to the cell surface.

      Major comments: The provided data do not conclusively support the suggested mechanism of action and essential controls are missing to substantiate the authors' claims: • Knockdown efficacy has to be confirmed at the protein level, showing reduced levels of RAB11A and B by Western blot. This is a standard in the field. Off-target effects cannot be avoided by RNAi approaches and are usually ruled out by using multiple siRNAs or by complementing the targeted protein in trans.

      We have verified knockdown efficacy at the protein level in new Fig 1A/B. However, due to the high degree of protein level conservation between Rab11A and Rab11B it is very difficult to develop isoform specific antibodies, and we were unable to obtain a Rab11B-specific antibody that can detect endogenous protein (despite testing 6 commercially available antibodies for specificity). Using an antibody that detects both 11A and 11B (Fig1A) we were able to observe very slight changes in the molecular weight of the Rab11 band(s) detected upon knockdown of 11A vs 11B (suggestive of the two isoforms running as a dimer, with Rab11A the lower band and Rab11B the upper band). Cells depleted of both isoforms simultaneously showed a near complete loss of signal. Using a Rab11A antibody (that we confirmed as specific) we were able to observe loss of the Rab11A signal in both the 11A and 11A+B knockdowns (Fig 1B).

      • Viral titers should be presented as absolute titers not as % (here the labelling is actually misleading in all graphs indicating pfu/ml)

      This data is now shown in new Figure S1, where it is clear that the trends remain consistent across biological replicates. The axis labels of Fig 1D/E and Fig 3A have been corrected as requested to make clear we are normalizing to account for experiment-to-experiment variation in peak titer.

      • Reduction of viral protein expression goes hand in hand with a reduction in GAPDH. While this is accounted for in the quantification a general block of protein expression cannot be ruled out since the stability of house keeper proteins and viral proteins might be different. Testing multiple house keeping proteins could overcome this issue.

      We have included a second loading control (beta-actin) in addition to the GAPDH for new Figure 2 and 3. The quantification of viral protein production compared to beta actin is now included in new Fig S2. We have also included additional biological replicates for Fig 2 B-D. These additional experiments have strengthened our conclusion that Rab11B is required for efficient protein production in cells infected with recent H3N2, but not H1N1, isolates.

      • The FACS data in Fig 5 are not convincing. The previous figures showed modest reduction in viral protein expression and the fluorescence is indicated here on a logarithmic scale. Quantification and indication of mean fluorescence intensity from the same data would be a better readout to convincingly show that fewer cells are infected.

      We have reanalyzed the existing data to quantify the geometric mean of viral protein expression in the infected cell populations (new Figure 5D, E). This analysis shows no significant difference in geometric mean of HA (Fig 5D) or M2 (Fig 5E) expression between cells treated with NT, 11A or 11B siRNA. This additional analysis strengthens our original conclusion that when Rab11B is knocked down, fewer cells get infected, but those that do produce the same level of viral proteins.

      • During the time of addition experiment in Fig 6, the authors are testing for HA/M2 positive cells after 16h of infection. This is a multicycle scenario, so in a second round they would measure the effect of knockdown in absence of ammonium chloride. Shorter infections up to 8h with higher MOI would overcome this problem.

      By maintaining cells in ammonium chloride throughout the infection we are preventing endosomal acidification at any point in the infection period, so this experiment should be measuring solely the effect of one round of infection. The 16 hr timepoint was chosen to allow for optimized staining and analysis of samples by flow cytometry, within the available hours of the flow cytometry facility.

      • Standard error of mean is not an appropriate way of representing experimental error for the provided results and should be replaced by SD. Correct labeling of axis with units is required.

      We have updated the axes throughout the manuscript as requested. We have obtained additional statistical expertise (reflected in the updated author list) regarding the issue of SD vs SEM. Standard deviation (SD) would show a measure of the spread of the data; however, the full distribution can be clearly seen as we plotted every individual data point. Standard error of the mean (SEM) is a measure of confidence for the mean of the population which takes into account SD and also sample size. SEM is not as easy to estimate by eye as SD, and we feel it is more helpful to the reader in judging how likely it is that two population means differ from each other on a given graph.
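
      The relationship between the two quantities discussed above can be illustrated with a minimal sketch, assuming the usual definitions (sample SD with an n − 1 denominator, and SEM = SD / √n). The values are hypothetical and are not taken from the manuscript.

      ```python
      import math
      import statistics

      def sd_and_sem(values):
          """Return (sample SD, SEM) for a list of measurements."""
          sd = statistics.stdev(values)          # sample standard deviation (n - 1)
          sem = sd / math.sqrt(len(values))      # standard error of the mean
          return sd, sem

      # Hypothetical normalized readouts (percent of control), illustration only.
      few_points = [80.0, 100.0, 120.0]
      many_points = few_points * 4               # same spread, four times as many points

      sd_few, sem_few = sd_and_sem(few_points)
      sd_many, sem_many = sd_and_sem(many_points)

      print(f"n={len(few_points)}:  SD={sd_few:.2f}  SEM={sem_few:.2f}")
      print(f"n={len(many_points)}: SD={sd_many:.2f}  SEM={sem_many:.2f}")
      ```

      The SD stays of similar size as points are added because it describes the spread of the data, whereas the SEM shrinks with sample size because it describes the uncertainty of the mean. That is the distinction underlying the reviewer's request and the authors' reply.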

      Minor comments: • The authors show a rescue of viral replication upon double knockdown of RAB11A and B. Maybe this is just a consequence of inefficient knockdown since only half of the siRNAs were used?

      In order to determine if this was the case we depleted Rab11A or Rab11B individually, with this same 'half dose' of siRNA (see new Figure S3). We observed that Rab11B was still robustly required for H3N2 viral protein production. These results suggest that Rab11A and Rab11B could be playing mutually opposing roles in this case (ie, Rab11B transporting a molecule to the surface, while Rab11A recycles it off), which is consistent with prior Rab11 literature.

      Reviewer #2 (Significance (Required)): The authors claim an H3N2 specific dependency on RAB11B for early steps of infection. While this is per se interesting, the provided data do not fully support the claims and lack a mechanistic explanation. What is the difference between H1 and H3 strains (virion shape, HA load per virion, attachment force of H1 vs H3)? The readouts used are not close enough to the events with regards to timing and could be supported by established entry assays in the field.

      We have provided additional discussion of the differences between H1s and H3s, including sialic acid binding preferences and changes in the HA-sialic acid avidity (lines 76-84). Notably, we have included a new assay (new Fig 7) that provides additional mechanistic insight into the observation that recent H3N2 but not H1N1 isolates depend on Rab11B early in infection. Using reverse genetics we were able to map the Rab11B phenotype to the HA gene of the H3N2 isolate under study. By creating '7+1' reassortant viruses with either the H3 HA or the N2 NA on a PR8 (H1N1) background (see Fig 7E) we are able to demonstrate that Rab11B is acting specifically at one of the HA-mediated entry steps. This excludes several non-HA dependent steps early in the life cycle (uncoating, RNP transport to the nucleus, nuclear import), thus providing additional confirmation that Rab11B acts at one of the earliest steps in the viral life cycle (and by definition, at or prior to fusion). Fundamentally, we believe the novelty and rigor of our observation that recent H3N2 viruses enter through a different route than H1N1 isolates is worthy of observation in this updated form, so that the field can begin follow up studies.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Manuscript Reference: RC-2025-03007 TITLE: Rab11B is required for binding and entry of recent H3N2, but not H1N1, influenza A isolates Allyson Turner, Sara Jaffrani, Hannah Kubinski, Deborah Ajayi, Matthew Owens, Madeline McTigue, Conor Fanuele, Cailey Appenzeller, Hannah Despres, Madaline Schmidt, Jessica Crothers, and Emily Bruce

      Summary Here, Turner et al. build upon existing knowledge of Influenza A virus (IAV) dependence on the Rab11 family of proteins and provide insights into the specific role of Rab11B isoform in H3N2 virus binding and entry. The introduction is clearly written and provides sufficient background on prior research involving Rab11. It effectively identifies the current gap in knowledge and justifies the investigation of more clinically relevant, circulating strains of IAV. The methods section provides sufficient detail to ensure reproducibility. Similarly, the discussion is well structured, aligns with the introduction, and thoughtfully outlines relevant follow-up experiments. The authors present data from a series of experiments which suggest that the reduced H3N2 infection and viral protein production in Rab11B-depleted cells is due to impaired virus binding. While the evidence supports a Rab11B-specific phenotype in the context of H3N2 infection, we recommend additional experiments (outlined below), to further validate and strengthen these findings. These would help solidify the mechanistic link between Rab11B depletion and the observed phenotype for H3N2 strains of IAV.

      Major comments Figure 1. (B) & (C) The authors normalise viral titers to the non-targeting control (NTC) siRNA set at 100. While this approach allows for relative comparisons, we recommend including the corresponding raw PFU/ml values, at least in the supplementary materials. This will better illustrate the biological significance of gene depletion and variability of the results.

      We have included the raw PFU/mL values in new Figure S1, while peak viral production varied by biological replicate (pasted below, with each biological replicate having a differently shaped data point). While the depletion-induced trends are clearly visible across biological replicates, normalization to average titer in the NT condition for each replicate allows for cleaner visualization.
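
      The per-replicate normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function name `percent_of_nt`, the condition labels, and all titer values are hypothetical.

      ```python
      def percent_of_nt(replicate):
          """Express each titer as a percentage of the mean NT titer of the same replicate.

          replicate: dict mapping condition name -> list of titers (PFU/mL)
          measured within one biological replicate.
          """
          nt_mean = sum(replicate["NT"]) / len(replicate["NT"])
          return {condition: [100.0 * titer / nt_mean for titer in titers]
                  for condition, titers in replicate.items()}

      # Hypothetical raw titers: the two replicates differ tenfold in peak titer
      # but show the same relative effect of the knockdown.
      replicates = [
          {"NT": [2.0e6, 4.0e6], "11B": [1.5e6]},
          {"NT": [2.0e5, 4.0e5], "11B": [1.5e5]},
      ]

      normalized = [percent_of_nt(replicate) for replicate in replicates]
      for index, result in enumerate(normalized, start=1):
          print(f"replicate {index}: 11B as % of NT = {result['11B']}")
      ```

      Normalizing within each replicate removes the experiment-to-experiment difference in peak titer, at the cost of hiding absolute values. Reporting the raw PFU/mL alongside, as done in Figure S1, addresses that.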

      In addition, the current protocol uses a high MOI (1), and a relatively short infection period (16 hours) to capture single-cycle replication. However, to better assess the impact of gene knockdown on virus production and spread, we suggest performing a multicycle replication assay using a lower MOI (e.g., 0.001-0.01) over an extended time period, such as 48 hours before titration, provided that cell viability under these conditions is acceptable.

      We appreciate this suggestion and repeatedly attempted to carry out a multicycle growth curve to obtain this data. Unfortunately, out of four independent biological replicates we attempted, we were only able to maintain cell viability and adherence in one biological replicate (shown below). We have not included this data in the revised manuscript due to the limited replicates we were able to obtain, though we can add it in a further revision if the reviewer feels it is warranted.

      Figure 7. (B) & (C) The authors present interesting data showing that siRNA-mediated depletion of Rab11B reduces virion binding of a recently circulating strain of H3N2, but not H1N1, suggesting a subtype-specific role. However, we strongly recommend complementing this assay with a single-cell resolution approach such as immunofluorescence detection of surface-bound viruses through HA staining and image quantification. This would allow the authors to directly assess virion binding per cell and visualise the phenotype, strengthening the mechanistic insight on H3N2 binding in Rab11B-depleted cells. Furthermore, the data, particularly for H1N1 (Figure 7.C), show substantial variance, which suggests suboptimal assay sensitivity and limits the strength of the conclusion that the knockdown does not affect H1N1 binding. This limitation may be overcome by implementing the above experimental suggestion.

      We have made substantial efforts to include this data, but were ultimately unable to include this assay due to technical difficulties in implementation (NA stripping caused cells to lift off coverslips, difficulties in antibody sensitivity and specificity, among other issues). We also piloted single cell-based flow cytometry assays to attempt to measure signal from bound virions, but were unable to achieve sufficient differentiation between mock and bound samples with the antibodies we could obtain. However, we have included a new experimental approach that is able to genetically map the 11B-dependent phenotype to the HA gene, thus providing additional mechanistic insight and confirming that Rab11B acts on one of the earliest steps in the viral life cycle (prior to or at fusion).

      Minor comments General The authors should state which statistical test was used for each dataset in the respective figure legends.

      This information is now included in each figure legend.

      Figure 1. Suggest changing Y axis title to PFU/ml [relative to NTC]

      We have changed the axis titles of normalized data to "PFU as % of NT" throughout.

      The co-depletion of Rab11A and Rab11B appears to be less efficient than individual knockdowns, based on RT-qPCR data (Figure 1.A). It is possible that the partial 'rescue' phenotype observed in Figures 2-4 is due to incomplete knockdown, rather than a true biological interaction. This possibility should be acknowledged.

      In order to distinguish between a partial 'rescue' and inefficient knockdown, we depleted Rab11A or Rab11B individually, with the same 'half dose' of siRNA used in the double knockdown (see new Figure S3). We observed that Rab11B was still robustly required for H3N2 viral protein production. These results suggest that Rab11A and Rab11B could be playing mutually opposing roles in this case, which is consistent with prior Rab11 literature, rather than simply inefficient knockdown.

      Furthermore, knockdown efficiency is assessed only at the mRNA level. To strengthen the conclusions, the authors are encouraged to provide western blot data confirming protein-level depletion of Rab11A and Rab11B, particularly in the double knockdown condition. This would help clarify whether co-transfection of siRNAs affects the efficiency of each individual knockdown at the protein level.

      We have verified knockdown efficacy at the protein level in new Fig 1A/B. However, due to the high degree of protein level conservation between Rab11A and Rab11B it is very difficult to develop isoform specific antibodies, and we were unable to obtain a Rab11B-specific antibody that can detect endogenous protein (despite testing 6 commercially available antibodies for specificity). Using an antibody that detects both 11A and 11B (Fig1A) we were able to observe very slight changes in the molecular weight of the Rab11 band(s) detected upon knockdown of 11A vs 11B (suggestive of the two isoforms running as a dimer, with Rab11A the lower band and Rab11B the upper band). Cells depleted of both isoforms simultaneously showed a near complete loss of signal. Using a Rab11A antibody (that we confirmed as specific) we were able to observe loss of the Rab11A signal in both the 11A and 11A+B knockdowns (Fig 1B).

      Figure 6. (A) & (B) are missing error bars, particularly the Rab11B knockdown data points.

      Error bars are plotted in each graph, but due to very limited experimental variation these error bars are too small to appear on the graph (11B points in Fig 6B, D).

      Figure 7. If including any repeats in the binding assay, authors are encouraged to use appropriate controls in each experiment such as exogenous neuraminidase treatment or sialidase treatment.

      When attempting to establish a microscopy based binding assay we included exogenous neuraminidase in each experiment. Unfortunately, the combination of glass coverslips and treatment with exogenous neuraminidase at incubation times sufficient to strip virus also removed cells from the coverslips.

      Reviewer #3 (Significance (Required)):

      General assessment: Provides a conceptual advancement of subtype specific receptor preferences.

      Advance: The study raises interesting observations regarding influenza virus subtype differences in cell surface receptor binding, in a Rab11B-dependent manner.

      Audience: Influenza virologists, respiratory virologists

      Expertise: Virus entry, Virus cell biology

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Title: Rab11B is required for binding and entry of recent H3N2, but not H1N1, influenza A isolates

      Allyson Turner, Sara Jaffrani, Hannah Kubinski, Deborah Ajayi, Matthew Owens, Madeline McTigue, Conor Fanuele, Cailey Appenzeller, Hannah Despres, Madaline Schmidt, Jessica Crothers, and Emily Bruce

      Summary

      Here, Turner et al. build upon existing knowledge of Influenza A virus (IAV) dependence on the Rab11 family of proteins and provide insights into the specific role of Rab11B isoform in H3N2 virus binding and entry. The introduction is clearly written and provides sufficient background on prior research involving Rab11. It effectively identifies the current gap in knowledge and justifies the investigation of more clinically relevant, circulating strains of IAV. The methods section provides sufficient detail to ensure reproducibility. Similarly, the discussion is well structured, aligns with the introduction, and thoughtfully outlines relevant follow-up experiments. The authors present data from a series of experiments which suggest that the reduced H3N2 infection and viral protein production in Rab11B-depleted cells is due to impaired virus binding. While the evidence supports a Rab11B-specific phenotype in the context of H3N2 infection, we recommend additional experiments (outlined below), to further validate and strengthen these findings. These would help solidify the mechanistic link between Rab11B depletion and the observed phenotype for H3N2 strains of IAV.

      Major comments

      Figure 1. (B) & (C)

      The authors normalise viral titers to the non-targeting control (NTC) siRNA set at 100. While this approach allows for relative comparisons, we recommend including the corresponding raw PFU/ml values, at least in the supplementary materials. This will better illustrate the biological significance of gene depletion and the variability of the results. In addition, the current protocol uses a high MOI (1) and a relatively short infection period (16 hours) to capture single-cycle replication. However, to better assess the impact of gene knockdown on virus production and spread, we suggest performing a multicycle replication assay using a lower MOI (e.g., 0.01-0.001) over an extended time period, such as 48 hours before titration, provided that cell viability under these conditions is acceptable.
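
      As an illustration of the reporting suggested above, the following is a minimal sketch of how raw titers could be tabulated next to the NTC-normalised values. The titer numbers, the condition labels, and the function name `normalise_to_ntc` are hypothetical placeholders and are not taken from the manuscript.

```python
from statistics import mean


def normalise_to_ntc(titers_pfu_per_ml, control="NTC"):
    """Return each condition's mean titer as a percentage of the control mean."""
    control_mean = mean(titers_pfu_per_ml[control])
    return {
        condition: 100.0 * mean(values) / control_mean
        for condition, values in titers_pfu_per_ml.items()
    }


# Hypothetical raw titers (PFU/ml), three replicates per siRNA condition.
raw = {
    "NTC": [2.0e6, 2.4e6, 1.6e6],
    "Rab11A": [1.0e6, 1.2e6, 0.8e6],
    "Rab11B": [2.0e5, 2.4e5, 1.6e5],
}

relative = normalise_to_ntc(raw)

# Report the raw mean alongside the relative value for every condition.
for condition, values in raw.items():
    print(f"{condition}: mean {mean(values):.2e} PFU/ml, "
          f"{relative[condition]:.1f}% of NTC")
```

      Reporting both columns lets a reader see whether a given percentage change corresponds to a large or a small absolute difference in titer.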

      Figure 7. (B) & (C)

      The authors present interesting data showing that siRNA-mediated depletion of Rab11B reduces virion binding of a recently circulating strain of H3N2, but not H1N1, suggesting a subtype-specific role. However, we strongly recommend complementing this assay with a single-cell resolution approach such as immunofluorescence detection of surface-bound viruses through HA staining and image quantification. This would allow the authors to directly assess virion binding per cell and visualise the phenotype, strengthening the mechanistic insight on H3N2 binding in Rab11B-depleted cells. Furthermore, the data, particularly for H1N1 (Figure 7.C), show substantial variance, which suggests suboptimal assay sensitivity and limits the strength of the conclusion that the knockdown does not affect H1N1 binding. This limitation may be overcome by implementing the above experimental suggestion.

      Minor comments

      General

      The authors should state which statistical test was used for each dataset in the respective figure legends.

      Figure 1.

      Suggest changing Y axis title to PFU/ml [relative to NTC]. The co-depletion of Rab11A and Rab11B appears to be less efficient than individual knockdowns, based on RT-qPCR data (Figure 1.A). It is possible that the partial 'rescue' phenotype observed in Figures 2-4 is due to incomplete knockdown, rather than a true biological interaction. This possibility should be acknowledged. Furthermore, knockdown efficiency is assessed only at the mRNA level. To strengthen the conclusions, the authors are encouraged to provide western blot data confirming protein-level depletion of Rab11A and Rab11B, particularly in the double knockdown condition. This would help clarify whether co-transfection of siRNAs affects the efficiency of each individual knockdown at the protein level.

      Figure 6.

      (A) & (B) are missing error bars, particularly the Rab11B knockdown data points.

      Figure 7.

      If including any repeats in the binding assay, authors are encouraged to use appropriate controls in each experiment such as exogenous neuraminidase treatment or sialidase treatment.

      Significance

      General assessment: Provides a conceptual advancement of subtype specific receptor preferences.

      Advance: The study raises interesting observations regarding influenza virus subtype differences in cell surface receptor binding, in a Rab11B-dependent manner.

      Audience: Influenza virologists, respiratory virologists

      Expertise: Virus entry, Virus cell biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors compare the effect of RAB11A and RAB11B knockdown on replication of contemporary H1N1 and H3N2 influenza A virus strains in A549 cells (human lung epithelial cells). They find a reduction in viral protein expression for tested H3N2 but not for H1N1 isolates. Mechanistically they suggest that RAB11A affects virion attachment to the cell surface.

      Major comments:

      The provided data do not conclusively support the suggested mechanism of action, and essential controls are missing to substantiate the authors' claims:

      • Knockdown efficacy has to be confirmed on protein level, showing reduced levels of RAB11A and B by Western blot. This is a standard in the field. Off target effects cannot be avoided by RNAi approaches and are usually ruled out by using multiple siRNAs or by complementing the targeted protein in trans.
      • Viral titers should be presented as absolute titers not as % (here the labelling is actually misleading in all graphs indicating pfu/ml)
      • Reduction of viral protein expression goes hand in hand with a reduction in GAPDH. While this is accounted for in the quantification, a general block of protein expression cannot be ruled out since the stability of housekeeping proteins and viral proteins might be different. Testing multiple housekeeping proteins could overcome this issue.
      • The FACS data in Fig 5 are not convincing. The previous figures showed modest reduction in viral protein expression and the fluorescence is indicated here on a logarithmic scale. Quantification and indication of mean fluorescence intensity from the same data would be a better readout to convincingly show that fewer cells are infected.
      • During the time of addition experiment in Fig 6, the authors are testing for HA/M2 positive cells after 16h of infection. This is a multicycle scenario, so in a second round they would measure the effect of knockdown in the absence of ammonium chloride. Shorter infections up to 8h with higher MOI would overcome this problem.
      • Standard error of mean is not an appropriate way of representing experimental error for the provided results and should be replaced by SD. Correct labeling of axes with units is required.
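
      To make the distinction in the last point concrete, here is a minimal sketch of how SD and SEM are obtained from the same replicates. The replicate values and the helper name `summarise` are hypothetical; the sample SD with an n - 1 denominator is assumed.

```python
from math import sqrt
from statistics import mean, stdev


def summarise(values):
    """Return n, mean, sample SD (n - 1 denominator) and SEM for one condition."""
    n = len(values)
    sd = stdev(values)
    sem = sd / sqrt(n)  # SEM shrinks as n grows; SD does not
    return {"n": n, "mean": mean(values), "sd": sd, "sem": sem}


# Hypothetical replicate measurements for a single condition.
replicates = [80.0, 100.0, 120.0]
summary = summarise(replicates)

print(f"n = {summary['n']}, mean = {summary['mean']:.1f}, "
      f"SD = {summary['sd']:.1f}, SEM = {summary['sem']:.1f}")
```

      Because SEM is the SD divided by the square root of n, error bars drawn as SEM are always narrower than SD bars for n > 1 and describe the precision of the mean rather than the spread of the measurements.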

      Minor comments:

      • The authors show a rescue of viral replication upon double knockdown of RAB11A and B. Maybe this is just a consequence of inefficient knockdown since only half of the siRNAs were used?

      Significance

      The authors claim an H3N2-specific dependency on RAB11B for early steps of infection. While this is per se interesting, the provided data do not fully support the claims and lack a mechanistic explanation. What is the difference between H1 and H3 strains (virion shape, HA load per virion, attachment force of H1 vs H3)? The readouts used are not close enough to the events with regard to timing and could be supported by established entry assays in the field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Turner AH. et al. examined viral replication in cells depleted of the Rab11B small GTPase, which is a paralogue of Rab11A. It has been reported that Rab11A is responsible for the intracellular transport of viral RNP via recycling endosomes. The authors showed that Rab11B knockdown reduced viral protein expression and viral titer. This may be caused by reduced attachment of viral particles to Rab11B knockdown cells.

      Major comments:

      Comment 1 Fig 2-4: The authors should provide Western blot results with equal amounts of loading control (GAPDH). The bands shown in these figures lack quantifiability and are not reliable as data.

      Comment 2 Fig 2-4: Why are the results different between Rab11B knockdown alone and Rab11A/B double knockdown? If the authors' claims are correct, the results of Rab11B knockdown should be reproducible in Rab11A/B double knockdown cells.

      Comment 3 Fig 6: For better understanding, please provide a schematic illustration of the experimental setting.

      Comment 4: It is necessary to test other siRNA sequences or perform a rescue experiment by expressing an siRNA-resistant clone in the knockdown cells. There seems to be an activation of the host defense system, such as IFN pathways.

      Referees cross-commenting

      I agree with other reviewers' comments in part.

      Significance

      The authors propose a novel role for Rab11B in modulating the attachment pathway of H3N2 influenza A virus by an unknown mechanism. Although previous studies focus on the function of Rab11A in endocytic transport, the function and specificity of Rab11B have remained less clear. The findings may be of interest to a broad audience, including researchers in cell biology, immunology, and host-pathogen interactions. However, the study remains at a superficial level of analysis and does not lead to a deeper understanding of the underlying mechanisms.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We will provide the revised manuscript as a PDF with highlighted changes, the Word file with tracked changes linked to reviewer comments, and all updated figures.

      To address the reviewers' suggestions, we have conducted additional experiments that are now incorporated into new figures, or we have added new images to several existing figures where appropriate.

      Please note that all figures have been renumbered to improve clarity and facilitate cross-referencing throughout the text. As recommended by Referee #3, all figure legends have been thoroughly revised to reflect these updates and are now labeled following the standard A-Z panel format, enhancing readability and ensuring easier identification. In addition, all figure legends now include the sample size for each statistical analysis.

      For clarity and ease of reference, we provide below a comprehensive list of all figures included in the revised version. Figures that have undergone modifications are underlined.

      Figure 1. The first spermatogenesis wave in prepuberal mice.

      This figure now includes amplified images of representative spermatocytes and a summary schematic illustrating the timeline of spermatogenesis. In addition, it now presents the statistical analysis of spermatocyte quantification to support the visual data.

      Figure 2. Cilia emerge across all stages of prophase I in spermatocytes during the first spermatogenesis wave.

      The images of this figure remain unchanged from the original submission, but all the graphs now present the statistical analysis of spermatocyte quantification.

      Figure 3. Ultrastructure and markers of prepuberal meiotic cilia.

      This figure remains unchanged from the original submission; however, we have replaced the ARL3-labelled spermatocyte image (A) with one displaying a clearer and more representative signal.

      Figure 4. Testicular tissue presents spermatocyte cysts in prepuberal mice and adult humans.

      This figure remains unchanged from the original submission.

      Figure 5. Cilia and flagella dynamics are correlated during prepuberal meiosis.

      This figure remains unchanged from the original submission.

      Figure 6. Comparative proteomics identifies potential regulators of ciliogenesis and flagellogenesis.

      This figure remains unchanged from the original submission.

      Figure 7. Deciliation induces persistence of DNA damage in meiosis.

      This figure has been substantially revised and now includes additional experiments analyzing chloral hydrate treatment, aimed at more accurately assessing DNA damage under both control and treated conditions. Images F-I and graph J are new.

      Figure 8. Aurora kinase A is a regulator of cilia disassembly in meiosis.

      This figure has been remodelled because the original version contained a mistake in previous panel II; the graph in new Fig. 8I has been corrected accordingly. In addition, it now contains additional data of αTubulin staining in arrested ciliated metaphases I after AURKA inhibition (new panel L1´).

      Figure 9. Schematic representation of the prepuberal versus adult seminiferous epithelium.

      This figure remains unchanged from the original submission.

      Supplementary Figure 1. Meiotic stages during the first meiotic wave.

      This figure remains unchanged from the original submission.

      Supplementary Figure 2 (new).

      This is a new figure that includes additional data requested by the reviewers. It includes additional markers of cilia in spermatocytes (glutamylated Tubulin/GT335), and the control data of cilia markers in non-ciliated spermatocytes. It also now includes the separate quantification of ciliated spermatocytes for each stage, as requested by the reviewers, complementing the graphs included in Figure 2.

      Please note that with the inclusion of this new Supplementary Figure 2, the numbering of subsequent supplementary figures has been updated accordingly.

      Supplementary Figure 3 (previously Suppl. Fig. 2). Ultrastructure of prophase I spermatocytes.

      This figure is identical in content to the original submission, but some annotations have been added.

      Supplementary Figure 4 (previously Suppl. Fig. 3). Meiotic centrosome under the electron microscope.

      This figure remains unchanged from the original submission, but additional annotations have been included.

      Supplementary Figure 5 (previously Suppl. Fig. 4). Human testis contains ciliated spermatocytes.

      This figure has been revised and now includes additional H2AX staining to better determine the stage of ciliated spermatocytes and improve their identification.

      Supplementary Figure 6 (previously Suppl. Fig. 5). GLI1 and GLI3 readouts of Hedgehog signalling are not visibly affected in prepuberal mouse testes.

      This figure has been remodeled and now includes the quantification of GLI1 and GLI3 and its corresponding statistical analysis. It also includes the control data for Tubulin, instead of GAPDH.

      Supplementary Figure 7 (previously Suppl. Fig. 6). CH and MLN8237 optimization protocol.

      This figure has been remodeled to incorporate control experiments using 1-hour organotypic culture treatment.

      Supplementary Figure 8 (previously Suppl. Fig. 7). Tracking first meiosis wave with EdU pulse injection during prepubertal meiosis.

      This figure remains unchanged from the original submission.

      Supplementary Figure 9 (previously Suppl. Fig. 8). PLK1 and AURKA inhibition in cultured spermatocytes.

      This figure has been remodeled and now includes additional data on spindle detection in control and AURKA-inhibited spermatocytes (both ciliated and non ciliated).


      Response to the reviewers

      We will submit both the PDF version of the revised manuscript and the Word file with tracked changes relative to the original submission. Each modification made in response to reviewers' suggestions is annotated in the Word document within the corresponding section of the text.

      A detailed, point-by-point response to each reviewer's comments is provided in the following section.

      Response to the Referee #1


      In this manuscript by Perez-Moreno et al., titled "The dynamics of ciliogenesis in prepubertal mouse meiosis reveal new clues about testicular maturation during puberty", the authors characterize the development of primary cilia during meiosis in juvenile male mice. The authors catalog a variety of testicular changes that occur as juvenile mice age, such as changes in testis weight and germ cell-type composition. They next show that meiotic prophase cells initially lack cilia, and ciliated meiotic prophase cells are detected after 20 days postpartum, coinciding with the time when post-meiotic spermatids within the developing testes acquire flagella. They describe that germ cells in juvenile mice harbor cilia at all substages of meiotic prophase, in contrast to adults where only zygotene stage meiotic cells harbor cilia. The authors also document that cilia in juvenile mice are longer than those in adults. They characterize cilia composition and structure by immunofluorescence and EM, highlighting that cilia polymerization may initially begin inside the cell, followed by extension beyond the cell membrane. Additionally, they demonstrate ciliated cells can be detected in adult human testes. The authors next perform proteomic analyses of whole testes from juvenile mice at multiple ages, which may not provide direct information about the extremely small numbers of ciliated meiotic cells in the testis, and is lacking follow up experiments, but does serve as a valuable resource for the community. Finally, the authors use a seminiferous tubule culturing system to show that chemical inhibition of Aurora kinase A likely inhibits cilia depolymerization upon meiotic prophase I exit and leads to an accumulation of metaphase-like cells harboring cilia. They also assess meiotic recombination progression using their culturing system, but this is less convincing.

      Author response: We sincerely thank Ref #1 for the thorough and thoughtful evaluation of our manuscript. We are particularly grateful for the reviewer's careful reading and constructive feedback, which have helped us refine several sections of the text and strengthen our discussion. All comments and suggestions have been carefully considered and addressed, as detailed below.


      Major comments:

      1. There are a few issues with the experimental set up for assessing the effects of cilia depolymerization on DNA repair (Figure 7-II). First, how were mid pachytene cells identified and differentiated from early pachytene cells (which would have higher levels of gH2AX) in this experiment? I suggest either using H1t staining (to differentiate early/mid vs late pachytene) or the extent of sex chromosome synapsis. This would ensure that the authors are comparing similarly staged cells in control and treated samples. Second, what were the gH2AX levels at the starting point of this experiment? A more convincing set up would be if the authors measure gH2AX immediately after culturing in early and late cells (early would have higher gH2AX, late would have lower gH2AX), and then again after 24hrs in late cells (upon repair disruption the sampled late cells would have high gH2AX). This would allow them to compare the decline in gH2AX (i.e., repair progression) in control vs treated samples. Also, it would be informative to know the starting gH2AX levels in ciliated vs non-ciliated cells as they may vary.

      Response:

      We thank Ref #1 for this valuable comment, which significantly contributed to improving both the design and interpretation of the cilia depolymerization assay.

      Following this suggestion, we repeated the experiment including 1-hour (immediately after culturing) and 24-hour cultures for both control and chloral hydrate (CH)-treated samples (n = 3 biological replicates). To ensure accurate staging, we now employ triple immunolabelling for γH2AX, SYCP3, and H1T, allowing clear distinction of zygotene (H1T−), early pachytene (H1T−), and late pachytene (H1T+) cells. The revised data (Figure 7) now provide a more complete and statistically robust analysis of DNA damage dynamics. These results confirm that CH-induced deciliation leads to persistence of the γH2AX signal at 24 hours, indicating impaired DNA repair progression in pachytene spermatocytes. The new images and graphs are included in the revised Figure 7.

      Regarding the reviewer's final point about the comparison of γH2AX levels between ciliated and non-ciliated cells, we regret that direct comparison of γH2AX levels between ciliated and non-ciliated cells is not technically feasible. To preserve cilia integrity, all cilia-related imaging is performed using the squash technique, which maintains the three-dimensional structure of the cilia but does not allow reliable quantification of DNA damage markers due to nuclear distortion. Conversely, the nuclear spreading technique, used for DNA damage assessment, provides optimal visualization of repair foci but results in the loss of cilia due to cytoplasmic disruption during the hypotonic step. Given that spermatocytes in juvenile testes form developmentally synchronized cytoplasmic cysts, we consider that analyzing a statistically representative number of spermatocytes offers a valid and biologically meaningful measure of tissue-level effects.

      In conclusion, we believe that the additional experiments and clarifications included in revised Figure 7 strengthen our conclusion that cilia depolymerization compromises DNA repair during meiosis. Further functional confirmation will be pursued in future work, since we are currently generating a conditional genetic model for a ciliopathy in our laboratory.

      The authors analyze meiotic progression in cells cultured with/without AURKA inhibition in Figure 8-III and conclude that the distribution of prophase I cells does not change upon treatment. Is Figure 8-III A and B the same data? The legend text is incorrect, so it's hard to follow. Figure 8-III A shows a depletion of EdU-labelled pachytene cells upon treatment. Moreover, the conclusion that a higher proportion of ciliated zygotene cells upon treatment (Figure 8-II C) suggests that AURKA inhibition delays cilia depolymerization (page 13 line 444) does not make sense to me.

      Response:

      We thank Ref#1 for identifying this issue and for the careful examination of Figure 8. We discovered that the submitted version of Figure 8 contained a mismatch between the figure legend and the figure panels. The legend text was correct; however, the figure inadvertently included a non-corresponding graph (previously panel II-A), which actually belonged to Supplementary Figure 7 in the original submission. We apologize for this mistake.

      This error has been corrected in the revised version. The updated Figure 8 now accurately presents the distribution of EdU-labelled spermatocytes across prophase I substages in control and AURKA-inhibited cultures (previously Figure 8-II B, now Figure 8-A). The corrected data show no significant differences in the proportions of EdU-labelled spermatocytes among prophase I substages after 24 hours of AURKA inhibition, confirming that meiotic progression is not delayed and that no accumulation of zygotene cells occurs under this treatment. Therefore, the observed increase in ciliated zygotene spermatocytes upon AURKA inhibition (new Figure 8 H-I) is best explained by a delay in cilia disassembly, rather than by an arrest or slowdown in meiotic progression. The figure legend and main text have been revised accordingly.

      How do the authors know that there is a monopolar spindle in Figure 8-IV treated samples? Perhaps the authors can use a different Tubulin antibody (that does not detect only acetylated Tubulin) to show that there is a monopolar spindle.

      Response:

      We thank Ref #1 for this excellent suggestion. In the original submission (lines 446-447), we described that ciliated metaphase I spermatocytes in AURKA-inhibited samples exhibited monopolar spindle phenotypes. This description was based on previous reports showing that AURKA or PLK1 inhibition produces metaphases with monopolar spindles characterized by aberrant yet characteristic SYCP3 patterns, abnormal chromatin compaction, and circular bivalent alignment around non-migrated centrosomes (1). In our study, we observed SYCP3 staining consistent with these characteristic features of monopolar metaphases I.

      However, we agree with Ref #1 that this could be better supported with data. Following the reviewer's suggestion, we performed additional immunostaining using α-Tubulin, which labels total microtubules rather than only the acetylated fraction. The revised Figure 8 now includes α-Tubulin staining in the same ciliated metaphase I cells shown in the original submission, confirming the presence of defective microtubule polymerization and defective spindle organization. For clarity, we now refer to these ciliated metaphases I as "arrested MI". These new data further support our conclusion that AURKA inhibition disrupts spindle bipolarization and prevents cilia depolymerization, indicating that cilia maintenance and bipolar spindle organization are mechanistically incompatible events during male meiosis. The Abstract, Results, and Discussion sections have been revised accordingly. In particular, the Discussion now emphasizes that the persistence of cilia may interfere with centrosome separation and microtubule polymerization under AURKA inhibition, contrasting with invertebrate systems, e.g. Drosophila (2) and P. brassicae (3), in which meiotic cilia persist through metaphase I without impairing bipolar spindle assembly.

      1. Alfaro et al. EMBO Rep 22 (2021). DOI: 10.15252/embr.202051030 (PMID: 33615693)
      2. Riparbelli et al. Dev Cell (2012). DOI: 10.1016/j.devcel.2012.05.024 (PMID: 22898783)
      3. Gottardo et al. Cytoskeleton (Hoboken) (2023). DOI: 10.1002/cm.21755 (PMID: 37036073)

      The authors state in the abstract that they provide evidence suggesting that centrosome migration and cilia depolymerization are mutually exclusive events during meiosis. This is not convincing with the data present in the current manuscript. I suggest amending this statement in the abstract.

      Response:

      We thank Ref#1 for this valuable observation, with which we fully agree. To avoid overstatement, the original statement has been removed from the Abstract, Results, and Discussion, and replaced with a more accurate formulation indicating that cilia maintenance and bipolar spindle formation are mutually exclusive events during mouse meiosis.

      This revised statement is now directly supported by the new data presented in Figure 8, which demonstrate that AURKA inhibition prevents both spindle bipolarization and cilia depolymerization. We are grateful to the reviewer for highlighting this important clarification.


      Minor comments:

      The presence of cilia in all stages of meiotic prophase I in juvenile mice is intriguing. Why is the cellular distribution and length of cilia different in prepubertal mice compared to adults (where shorter cilia are present only in zygotene cells)? What is the relevance of these developmental differences? Do cilia serve prophase I functions in juvenile mice (in leptotene, pachytene etc.) that are perhaps absent in adults?

      Related to the above point, what is the relevance of the absence of cilia during the first meiotic wave? If cilia serve a critical function during prophase I (for instance, facilitating DSB repair), does the lack of cilia during the first wave imply differing cilia (and repair) requirements during the first vs latter spermatogenesis waves?

      In my opinion, these would be interesting points to discuss in the discussion section.

      Response:

      We thank the reviewer for these thoughtful observations, which we agree are indeed intriguing.

      We believe that our findings likely reflect a developmental role for primary cilia during testicular maturation. We hypothesize that primary cilia at this stage might act as signaling organelles, receiving cues from Sertoli cells or neighboring spermatocytes and transmitting them through the cytoplasmic cysts shared by spermatocytes. Such intercellular communication could be essential for coordinating tissue maturation and meiotic entry during puberty. Although speculative, this hypothesis aligns with the established role of primary cilia as sensory and signaling hubs for GPCR and RTK pathways regulating cell differentiation and developmental patterning in multiple tissues (e.g., 1, 2). The Discussion section has been expanded to include these considerations.

      1. Goetz et al. Nat Rev Genet (2010). DOI: 10.1038/nrg2774 (PMID: 20395968)
      2. Naturky et al. Cell (2019). DOI: 10.1038/s41580-019-0116-4 (PMID: 30948801)

      Our study focuses on the first spermatogenic wave, which represents the transition from the juvenile to the reproductive phase. It is therefore plausible that the transient presence of longer cilia during this period reflects a developmental requirement for external signaling that becomes dispensable in the mature testis. Given that this is only the second study to date examining mammalian meiotic cilia, there remains a vast area of research to explore. We plan to address potential signaling cascades involved in these processes in future studies.

      On the other hand, while we cannot confirm that the cilia observed in zygotene spermatocytes persist until pachytene within the same cell, it is reasonable to speculate that they do, serving as longer-lasting signaling structures that facilitate testicular development during the critical pubertal window. In addition, the observation of ciliated spermatocytes at all prophase I substages at 20 dpp, together with our proteomic data, supports the idea that the emergence of meiotic cilia exerts a significant developmental impact on testicular maturation.

      In summary, although we cannot yet define specific prophase I functions for meiotic cilia in juvenile spermatocytes, our data demonstrate that the first meiotic wave differs from later waves in cilia dynamics, suggesting distinct regulatory requirements between puberty and adulthood. These findings underscore the importance of considering developmental context when using the first meiotic wave as a model for studying spermatogenesis.

      The authors state on page 9 lines 286-288 that the presence of cytoplasmic continuity via intercellular bridges (between developmentally synchronous spermatocytes) hints towards a mechanism that links cilia and flagella formation. Please clarify this statement. While the correlation between the timing of appearance of cilia and flagella in cells that are located within the same segment of the seminiferous tubule may be hinting towards some shared regulation, how would cytoplasmic continuity participate in this regulation? Especially since the cytoplasmic continuity is not between the developmentally distinct cells acquiring the cilia and flagella?

      Response:

      We thank Ref#1 for this excellent question and for the opportunity to clarify our statement.

      The presence of intercellular bridges between spermatocytes is well known and has long been proposed to support germ cell communication and synchronization (1,2) as well as sharing mRNA (3) and organelles (4). A classic example is the Akap gene, located on the X chromosome and essential for the formation of the sperm fibrous sheath; cytoplasmic continuity through intercellular bridges allows Akap-derived products to be shared between X- and Y-bearing spermatids, thereby maintaining phenotypic balance despite transcriptional asymmetry (5). In addition, more recent work has further demonstrated that these bridges are critical for synchronizing meiotic progression and for processes such as synapsis, double-strand break repair, and transposon repression (6).

      In this context, and considering our proteomic data (Figure 6), our statement did not intend to imply direct cytoplasmic exchange between ciliated and flagellated cells. Although our current methods do not allow comprehensive tracing of cytoplasmic continuity from the basal to the luminal compartment of the seminiferous epithelium, we plan to address this limitation using high-resolution 3D and ultrastructural imaging approaches in future studies.

      Based on our current data, we propose that cytoplasmic continuity within developmentally synchronized spermatocyte cysts could facilitate the coordinated regulation of ciliogenesis, and similarly enable the sharing of regulatory factors controlling flagellogenesis within spermatid cysts. This coordination may occur through the diffusion of centrosomal or ciliary proteins, mRNAs, or signaling intermediates involved in the regulation of microtubule dynamics. However, we cannot exclude the possibility that such cytoplasmic continuity extends across all spermatocytes derived from the same spermatogonial clone, potentially providing a larger regulatory network. This mechanism could help explain the temporal correlation we observe between the appearance of meiotic cilia and the onset of flagella formation in adjacent spermatids within the same seminiferous segment.

      We have revised the Discussion to explicitly clarify this interpretation and to note that, although hypothetical, it is consistent with established literature on cytoplasmic continuity and germ cell coordination.

      1. Dym et al. *Biol. Reprod.* (1971). DOI: 10.1093/biolreprod/4.2.195 (PMID: 4107186)
      2. Braun et al. Nature (1989). DOI: 10.1038/337373a0 (PMID: 2911388)
      3. Greenbaum et al. *Proc. Natl. Acad. Sci. USA* (2006). DOI: 10.1073/pnas.0505123103 (PMID: 16549803)
      4. Ventelä et al. Mol Biol Cell (2003). DOI: 10.1091/mbc.e02-10-0647 (PMID: 12857863)
      5. Turner et al. Journal of Biological Chemistry (1998). DOI: 10.1074/jbc.273.48.32135 (PMID: 9822690)
      6. Sorkin et al. Nat Commun (2025). DOI: 10.1038/s41467-025-56742-9 (PMID: 39929837)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      Individual germ cells in H&E-stained testis sections in Figure 1-II are difficult to see. I suggest adding zoomed-in images where spermatocytes/round spermatids/elongated spermatids are clearly distinguishable.

      Response:

      Ref#1 is right to suggest this. We have revised Figure 1 to improve the quality of the H&E-stained testis sections and have added zoomed-in panels where spermatocytes, round spermatids, and elongated spermatids are clearly distinguishable. These additions significantly enhance the clarity and interpretability of the figure.

      In Figure 2-II B, the authors document that most ciliated spermatocytes in juvenile mice are pachytene. Is this because most meiotic cells are pachytene? Please clarify. If the data are available (perhaps could be adapted from Figure 1-III), it would be informative to see a graph representing what proportions of each meiotic prophase substages have cilia.

      Response:

      We thank the reviewer for this valuable observation. Indeed, the predominance of ciliated pachytene spermatocytes reflects the fact that most meiotic cells in juvenile testes are at the pachytene stage (Figure 1). We have clarified this point in the text and have added a new supplementary figure (Supplementary Figure 2, new figure) presenting a graph showing the proportion of spermatocytes at each prophase I substage that possess primary cilia. This visualization provides a clearer quantitative overview of ciliation dynamics across meiotic substages.
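
      The distinction raised here can be illustrated with a minimal sketch. The counts below are hypothetical and chosen only to show the arithmetic; they are not data from the manuscript. The sketch separates the share of ciliated cells contributed by each substage from the fraction of cells within each substage that carry a cilium.

```python
from collections import Counter

# Hypothetical counts, for illustration only (not data from the manuscript).
# Each record is (prophase I substage, has_cilium).
cells = (
    [("leptotene", False)] * 18 + [("leptotene", True)] * 2
    + [("zygotene", False)] * 27 + [("zygotene", True)] * 3
    + [("pachytene", False)] * 135 + [("pachytene", True)] * 15
)

total_by_stage = Counter(stage for stage, _ in cells)
ciliated_by_stage = Counter(stage for stage, ciliated in cells if ciliated)
n_ciliated = sum(ciliated_by_stage.values())

# Share of all ciliated cells that belongs to each substage.
share_of_ciliated = {
    stage: ciliated_by_stage[stage] / n_ciliated for stage in total_by_stage
}

# Fraction of cells within each substage that carry a cilium.
fraction_ciliated = {
    stage: ciliated_by_stage[stage] / total_by_stage[stage]
    for stage in total_by_stage
}

for stage in total_by_stage:
    print(stage, round(share_of_ciliated[stage], 2), round(fraction_ciliated[stage], 2))
```

      In this made-up example pachytene cells account for most ciliated cells simply because they are the most abundant substage, while the per-substage fraction is the same for all three substages.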

      I suggest annotating the EM images in Sup Figure 2 and 3 to make it easier to interpret.

      Response:

      We thank the reviewer for this helpful suggestion. We have now added annotations to the EM images in Supplementary Figures 3 and 4 to facilitate their interpretation. These visual guides help readers more easily identify the relevant ultrastructural features described in the text.

      The authors claim that the ratio between GLI3-FL and GLI3-R is stable across their analyzed developmental window in whole testis immunoblots shown in Sup Figure 5. Quantifying the bands and normalizing to the loading control would help strengthen this claim as it is hard to interpret the immunoblot in its current form.

      Response:

      We thank the reviewer for this valuable suggestion. Following this recommendation, Supplementary Figure 5 has been revised to include quantification of GLI1 and GLI3 protein levels, normalized to the loading control.

      After quantification, we observed statistically significant differences across developmental stages. Specifically, GLI1 expression is slightly higher at 21 dpp compared to 8 dpp. For GLI3, we performed two complementary analyses:

      • Total GLI3 protein (sum of full-length and repressor forms normalized to loading control) shows a progressive decrease during development, with the lowest levels at 60 dpp (Supplementary Figure 5D).
      • GLI3 activation status, assessed as the GLI3-FL/GLI3-R ratio, is highest during the 19-21 dpp window, compared to 8 dpp and 60 dpp. Although these results suggest a possible transient activation of GLI3 during testicular maturation, we caution that this cannot automatically be attributed to increased Hedgehog signaling, as GLI3 processing can also be affected by other processes, such as changes in ciliogenesis. Furthermore, because the analysis was performed on whole-testis protein extracts, these changes cannot be specifically assigned to ciliated spermatocytes.
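
      The two GLI3 readouts described above can be written out as a minimal sketch. The function name and the band intensities are hypothetical and serve only to illustrate the arithmetic; they are not the values or the software used for Supplementary Figure 5.

```python
def gli3_metrics(fl, rep, loading):
    """Return (total GLI3 normalized to loading control, GLI3-FL/GLI3-R ratio).

    fl, rep and loading are densitometry band intensities from the same lane.
    """
    if loading <= 0 or rep <= 0:
        raise ValueError("loading control and GLI3-R intensities must be positive")
    total_normalized = (fl + rep) / loading  # sum of both forms over loading control
    fl_to_r_ratio = fl / rep                 # activation status readout
    return total_normalized, fl_to_r_ratio


# Hypothetical band intensities in arbitrary units (illustration only).
lanes = {
    "lane A": {"fl": 40.0, "rep": 80.0, "loading": 100.0},
    "lane B": {"fl": 30.0, "rep": 30.0, "loading": 100.0},
}

for name, bands in lanes.items():
    total, ratio = gli3_metrics(bands["fl"], bands["rep"], bands["loading"])
    print(f"{name}: total GLI3 = {total:.2f}, FL/R ratio = {ratio:.2f}")
```

      Because both forms are measured in the same lane, the FL/R ratio does not depend on the loading control, whereas the total GLI3 value does.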

      We have expanded the Discussion to address these findings and to highlight the potential involvement of the Desert Hedgehog (DHH) pathway, which plays key roles in testicular development, Sertoli-germ cell communication, and spermatogenesis (1, 2, 3). We plan to investigate these pathways further in future studies.

      1. Bitgood et al. Curr Biol. (1996). DOI: 10.1016/s0960-9822(02)00480-3 (PMID: 8805249)
      2. Clark et al. Biol Reprod. (2000). DOI: 10.1095/biolreprod63.6.1825 (PMID: 11090455)
      3. O'Hara et al. BMC Dev Biol. (2011). DOI: 10.1186/1471-213X-11-72 (PMID: 22132805)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      There are a few typos throughout the manuscript. Some examples: page 5 line 172, Figure 3-I legend text, Sup Figure 5-II callouts, Figure 8-III legend, page 15 line 508, page 17 line 580, page 18 line 611.

      Response:

      We thank the reviewer for detecting this. All typographical errors have been corrected, and figure callouts have been reviewed for consistency.

      Response to the Referee #2

      This study focuses on the dynamic changes of ciliogenesis during meiosis in prepubertal mice. It was found that primary cilia are not an intrinsic feature of the first wave of meiosis (initiating at 8 dpp); instead, they begin to polymerize at 20 dpp (after the completion of the first wave of meiosis) and are present in all stages of prophase I. Moreover, prepubertal cilia (with an average length of 21.96 μm) are significantly longer than adult cilia (10 μm). The emergence of cilia coincides temporally with flagellogenesis, suggesting a regulatory association in the formation of axonemes between the two. Functional experiments showed that disruption of cilia by chloral hydrate (CH) delays DNA repair, while the AURKA inhibitor (MLN8237) delays cilia disassembly, and centrosome migration and cilia depolymerization are mutually exclusive events. These findings represent the first detailed description of the spatiotemporal regulation and potential roles of cilia during early testicular maturation in mice. The discovery of this phenomenon is interesting; however, there are certain limitations in functional research.

      We thank Ref#2 for taking the time to evaluate our manuscript and for summarizing its main findings. We regret that the reviewer did not find the study sufficiently compelling, but we respectfully clarify that the strength of our work lies precisely in addressing a largely unexplored aspect of mammalian meiosis for which virtually no prior data exist. Given the extremely limited number of studies addressing cilia in mammalian meiosis (only five to date, including our own previous publication on adult mouse spermatogenesis) (1-5), we consider that the present work provides the first robust and integrative evidence on the emergence, morphology, and potential roles of primary cilia during prepubertal testicular development. The study combines histology, high-resolution microscopy, proteomics, and pharmacological perturbations, supported by quantitative analyses, thereby establishing a solid and much-needed reference framework for future functional studies.

      We emphasize that this manuscript constitutes the first comprehensive characterization of ciliogenesis during prepubertal mouse meiosis, complemented by functional in vitro assays that begin to address potential roles of these cilia. For this reason, we want to underscore the importance of this study in providing a solid framework that will support and guide future research.

      Major points:

      1. The prepubertal cilia in spermatocytes discovered by the authors lack specific genetic ablation to block their formation, making it impossible to evaluate whether such cilia truly have functions. Because neither in the first wave of spermatogenesis nor in adult spermatogenesis does this type of cilium seem to be essential. In addition, the authors also imply that the formation of such cilia appears to be synchronized with the formation of sperm flagella. This suggests that the production of such cilia may merely be transient protein expression noise rather than a functionally meaningful cellular structure.

      Response:

      We agree that a genetic ablation model would represent the ideal approach to directly test cilia function in spermatogenesis. However, given the complete absence of prior data describing the dynamics of ciliogenesis during testis development, our priority in this study was to establish a rigorous structural and temporal characterization of this process in the main mammalian model organism, the mouse. This systematic and rigorous phenotypic characterization is a necessary first step before any functional genetics could be meaningfully interpreted.

      To our knowledge, this study represents the first comprehensive analysis of ciliogenesis during prepubertal mouse meiosis, extending our previous work on adult spermatogenesis (1). Beyond these two contributions, only four additional studies have addressed meiotic cilia: two in zebrafish (2, 3), with Mytlis et al. also providing preliminary observations relevant to prepubertal male meiosis that we discuss in the present work, one in Drosophila (4), and a recent one in butterfly (5). No additional information exists for mammalian gametogenesis to date.

      1. López-Jiménez et al. Cells (2022) DOI: 10.3390/cells12010142 (PMID: 36611937)
      2. Mytlis et al. Science (2022) DOI: 10.1126/science.abh3104 (PMID: 35549308)
      3. Xie et al. J Mol Cell Biol (2022) DOI: 10.1093/jmcb/mjac049 (PMID: 35981808)
      4. Riparbelli et al. Dev Cell (2012) DOI: 10.1016/j.devcel.2012.05.024 (PMID: 22898783)
      5. Gottardo et al. Cytoskeleton (Hoboken) (2023) DOI: 10.1002/cm.21755 (PMID: 37036073)

      We therefore consider this descriptive and analytical foundation to be essential before the development of functional genetic models. Indeed, we are currently generating a conditional genetic model for a ciliopathy in our laboratory. These studies are ongoing and will directly address the type of mechanistic questions raised here, but they extend well beyond the scope and feasible timeframe of the present manuscript.

      We thus maintain that the present work constitutes a necessary and timely contribution, providing a robust reference dataset that will facilitate and guide future functional studies in the field of cilia and meiosis.

      Taking this into account, we would be very pleased to address any additional, concrete suggestions from Ref#2 that could further strengthen the current version of the manuscript.

      The high expression of axoneme assembly regulators such as TRiC complex and IFT proteins identified by proteomic analysis is not particularly significant. This time point is precisely the critical period for spermatids to assemble flagella, and TRiC, as a newly discovered component of flagellar axonemes, is reasonably highly expressed at this time. No intrinsic connection with the argument of this paper is observed. In fact, this testicular proteomics has little significance.

      Response:

      We appreciate this comment but respectfully disagree with the reviewer's interpretation of our proteomic data. To our knowledge, this is the first proteomic study explicitly focused on identifying ciliary regulators during testicular development at the precise window (19-21 dpp) when both meiotic cilia and spermatid flagella first emerge.

      While Piprek et al. (1) analyzed the expression of primary cilia in developing gonads, proteomic data specifically covering the developmental transition at 19-21 dpp were not previously available. Furthermore, a recent cell-sorting study (2) detected expression of cilia proteins in pachytene spermatocytes compared to round spermatids, but did not explore their functional relevance or integrate these data with developmental timing or histological context.

      In contrast, our dataset integrates histological staging, high-resolution microscopy, and quantitative proteomics, revealing a set of candidate regulators (including DCAF7, DYRK1A, TUBB3, TUBB4B, and TRiC) potentially involved in cilia-flagella coordination. We view this as a hypothesis-generating resource that outlines specific proteins and pathways for future mechanistic studies on both ciliogenesis and flagellogenesis in the testis.

      Although we fully agree that proteomics alone cannot establish causal function, we believe that dismissing these data as having little significance overlooks their value as the first molecular map of the testis at the developmental window when axonemal structures arise. Our dataset provides, for the first time, an integrated view of proteins associated with ciliary and flagellar structures at the developmental stage when both axonemal organelles first appear. We thus believe that our proteomic dataset represents an important and novel contribution to the understanding of testicular development and ciliary biology.

      Considering this, we would again welcome any specific suggestions from Ref#2 on additional analyses or clarifications that could make the relevance of this dataset even clearer to readers.

      1. Piprek et al. Int J Dev Biol. (2019) doi: 10.1387/ijdb.190049rp (PMID: 32149371).
      2. Fang et al. Chromosoma. (1981) doi: 10.1007/BF00285768 (PMID: 7227045).

      Response to the Referee #3

      In "The dynamics of ciliogenesis in prepubertal mouse meiosis reveals new clues about testicular development" Pérez-Moreno, et al. explore primary cilia in prepubertal mouse spermatocytes. Using a combination of microscopy, proteomics, and pharmacological perturbations, the authors carefully characterize prepubertal spermatocyte cilia, providing foundational work regarding meiotic cilia in the developing mammalian testis.

      Response: We sincerely thank Ref#3 for their positive assessment of our work and for the thoughtful suggestions that have helped us strengthen the manuscript. We are pleased that the reviewer recognizes both the novelty and the relevance of our study in providing foundational insights into meiotic ciliogenesis during prepubertal testicular development. All specific comments have been carefully considered and addressed as detailed below.


      Major concerns:

      1. The authors provide evidence consistent with cilia not being present in a larger percentage of spermatocytes or in other cells in the testis. The combination of electron microscopy and acetylated tubulin antibody staining establishes the presence of cilia; however, proving a negative is challenging. While acetylated tubulin is certainly a common marker of cilia, it is not in some cilia such as those in neurons. The authors should use at least one additional cilia marker to better support their claim of cilia being absent.

      Response:

      We thank the reviewer for this helpful suggestion. In the revised version, we have strengthened the evidence for cilia identification by including an additional ciliary marker, glutamylated tubulin (GT335), in combination with acetylated tubulin and ARL13B (which were included in the original submission). These data are now presented in the new Supplementary Figure 2, which also includes an example of a non-ciliated spermatocyte showing absence of both ARL13B and AcTub signals.

      Taken together, these markers provide a more comprehensive validation of cilia detection and confirm the absence of ciliary labelling in non-ciliated spermatocytes.

      The conclusion that IFT88 localizes to centrosomes is premature as key controls for the IFT88 antibody staining are lacking. Centrosomes are notoriously "sticky", often showing non-specific antibody staining. The authors must include controls to demonstrate the specificity of the staining they observe such as staining in a genetic mutant or an antigen competition assay.

      Response:

      We appreciate the reviewer's concern and fully agree that antibody specificity is critical when interpreting centrosomal localization. The IFT88 antibody used in our study is commercially available and has been extensively validated in the literature as both a cilia marker (1, 2) and a centrosome marker in somatic cells (3). Labelling of IFT88 in centrosomes has also been previously described using other antibodies (4, 5). In our material, the IFT88 signal consistently appears at one of the duplicated centrosomes and at both spindle poles, patterns identical to those reported in somatic cells. We therefore consider the reported meiotic IFT88 staining to be specific and biologically reliable.

      That said, we agree that genetic validation would provide the most definitive confirmation. We are currently generating a conditional genetic model for a ciliopathy in our laboratory that will directly assess both antibody specificity and the functional consequences of cilia loss during meiosis. These experiments are in progress and will be reported in a follow-up study.

      1. Wong et al. Science (2015). DOI: 10.1126/science.aaa5111 (PMID: 25931445)
      2. Ocbina et al. Nat Genet (2011). DOI: 10.1038/ng.832 (PMID: 21552265)
      3. Vitre et al. EMBO Rep (2020). DOI: 10.15252/embr.201949234 (PMID: 32270908)
      4. Robert A. et al. J Cell Sci (2007). DOI: 10.1242/jcs.03366 (PMID: 17264151)
      5. Singla et al. Developmental Cell (2010). DOI: 10.1016/j.devcel.2009.12.022 (PMID: 20230748)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      There are many inconsistent statements throughout the paper regarding the timing of the first wave of spermatogenesis. For example, the authors state that round spermatids can be detected at 21dpp on line 161, but on line 180, say round spermatids can be detected at 19dpp. Not only does this lead to confusion, but such discrepancies undermine the validity of the rest of the paper. A summary graphic displaying key events and their timing in the first wave of spermatogenesis would be instrumental for reader comprehension and could be used by the authors to ensure consistent claims throughout the paper.

      Response:

      We thank the reviewer for identifying this inconsistency and apologize for the confusion. We confirm that early round spermatids first appear at 19 dpp, as shown in the quantitative data (Figure 1J). This can be detected in squashed spermatocyte preparations, where individual spermatocytes and spermatids can be accurately quantified. The original text contained an imprecise reference to the histological image of 21 dpp (previous line 161), since certain H&E sections did not clearly show all cell types simultaneously. We have now revised Figure 1, improving the image quality and adding a zoomed-in panel highlighting early round spermatids. The image for 19 dpp mice in Figure 1D shows early spermatids that are still aflagellated. The first ciliated spermatocytes and the earliest flagellated spermatids are observed at 20 dpp. This has been clarified in the text.

      In addition, we also thank the reviewer for the suggestion of adding a summary graphic, which we agree greatly facilitates reader comprehension. We have added a new schematic summary (Figure 1K) illustrating the key stages and timing of the first spermatogenic wave.

      In the proteomics experiments, it is unclear why the authors assume that changes in protein expression are predominantly due to changes within the germ cells in the developing testis. The analysis is on whole testes including both the somatic and germ cells, which makes it possible that protein expression changes in somatic cells drive the results. The authors need to justify why and how the conclusions drawn from this analysis warrant such an assumption.

      Response:

      We agree with the reviewer that our proteomic analysis was performed on whole testis samples, which contain both germ and somatic cells. Although isolation of pure spermatocyte populations by FACS would provide higher resolution, obtaining sufficient prepubertal material for such analysis would require an extremely large number of animals. To remain compliant with the 3Rs principle for animal experimentation, we therefore used whole-testis samples from three biological replicates per age.

      We acknowledge that our assumption (that the main differences arise from germ cells) is a simplification. However, germ cells constitute the vast majority of testicular cells during this developmental window and are the population undergoing major compositional changes between 15 dpp and adulthood. It is therefore reasonable to expect that a substantial fraction of the observed proteomic changes reflects alterations in germ cells. We have clarified this point in the revised text and have added a statement noting that changes in somatic cells could also contribute to the proteomic profiles.

      The authors should provide details on how proteins were categorized as being involved in ciliogenesis or flagellogenesis, specifically in the distinction criteria. It is not clear how the categorizations were determined or whether they are valid. Thus, no one can repeat this analysis or perform this analysis on other datasets they might want to compare.

      Response:

      We thank the reviewer for this opportunity to clarify our approach. The categorization of proteins as being involved in ciliogenesis or flagellogenesis was based on their Gene Ontology (GO) cellular component annotations obtained from the PANTHER database (Version 19.0), using the gene IDs of the Differentially Expressed Proteins (DEPs). Specifically, we used the GO terms cilium (GO:0005929) and motile cilium (GO:0031514). Since motile cilium is a subcategory of cilium, proteins annotated only with the general cilium term, but not included under motile cilium, were considered to be associated with primary cilia or with shared structural components common to different types of cilia. These GO terms are represented in the bottom panel of Figure 6.

      This information has been added to the Methods section and referenced in the Results for transparency and reproducibility.
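
      The categorization rule described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: it assumes the GO cellular component annotations have already been retrieved into a mapping from gene ID to GO terms, and the gene IDs shown are hypothetical placeholders.

```python
CILIUM = "GO:0005929"         # cilium
MOTILE_CILIUM = "GO:0031514"  # motile cilium (subcategory of cilium)


def categorize(go_terms):
    """Assign a protein to a category from its GO cellular component terms."""
    terms = set(go_terms)
    if MOTILE_CILIUM in terms:
        return "motile cilium"
    if CILIUM in terms:
        # Annotated with the general term only: primary cilia or components
        # shared between different types of cilia.
        return "primary cilium or shared ciliary component"
    return "not cilium-annotated"


# Hypothetical annotation table (gene ID -> GO terms), for illustration only.
annotations = {
    "GENE_A": {"GO:0005929", "GO:0031514"},
    "GENE_B": {"GO:0005929"},
    "GENE_C": {"GO:0005737"},
}

categories = {gene: categorize(terms) for gene, terms in annotations.items()}
for gene, category in categories.items():
    print(gene, "->", category)
```

      Checking the motile cilium term first matters because, being a subcategory, it usually co-occurs with the general cilium term.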

      In the pharmacological studies, the authors conclude that the phenotypes they observe (DNA damage and reduced pachytene spermatocytes) are due to loss of or persistence of cilia. This overinterprets the experiment. Chloral hydrate and MLN8237 certainly impact ciliation as claimed, but have additional cellular effects. Thus, it is possible that the observed phenotypes were not a direct result of cilia manipulation. Either additional controls must address this or the conclusions need to be more specific and toned down.

      Response:

      We thank the reviewer for this fair observation and have taken steps to strengthen and refine our interpretation. In the revised version, we now include data from 1-hour and 24-hour cultures for both control and chloral hydrate (CH)-treated samples (n = 3 biological replicates). The triple immunolabelling with γH2AX, SYCP3, and H1T allows accurate staging of zygotene (H1T⁻), early pachytene (H1T⁻), and late pachytene (H1T⁺) spermatocytes.

      The revised Figure 7 now provides a more complete and statistically supported analysis of DNA damage dynamics, confirming that CH-induced deciliation leads to persistent γH2AX signal at 24 hours, indicative of delayed or defective DNA repair progression. We have also toned down our interpretation in the Discussion, acknowledging that CH could affect other cellular pathways.

      As mentioned before, the conditional genetic model that we are currently generating will allow us to evaluate the role of cilia in meiotic DNA repair in a more direct and specific way.

      Assuming the conclusions of the pharmacological studies hold true with the proper controls, the authors still conflate their findings with meiotic defects. Meiosis is not directly assayed, which makes this conclusion an overstatement of the data. The conclusions need to be rephrased to accurately reflect the data.

      Response:

      We agree that this aspect required clarification. As noted above, we have refined both the Results and Discussion sections to make clear that our assays specifically targeted meiotic spermatocytes.

      We now present data for meiotic stages at zygotene, early pachytene, and late pachytene. Staging relies on labelling for SYCP3 and H1T, both specific markers of meiosis that are not detectable in non-meiotic cells. We believe that this is indeed a way to assay meiotic cells; however, we now specify in the text that we are analysing potential defects in meiotic progression. We apologize if this was not properly explained in the original manuscript; it has been rephrased in both the Results and Discussion sections of the new version.

      It is not clear why the authors chose not to use widely accepted assays of Hedgehog signaling. Traditionally, pathway activation is measured by transcriptional output, not GLI protein expression because transcription factor expression does not necessarily reflect transcription levels of target genes.

      Response:

      We agree with the reviewer that measuring mRNA levels of Hedgehog pathway target genes, typically GLI1 and PTCH1, is the most common method for measuring pathway activation, and is widely accepted by researchers in the field. However, the methods we use in this manuscript (GLI1 and GLI3 immunoblots) are also quite common and widely accepted:

      Regarding GLI1 immunoblot, many articles have used this method to monitor Hedgehog signaling, since GLI1 protein levels have repeatedly been shown to also go up upon pathway activation, and down upon pathway inhibition, mirroring the behavior of GLI1 mRNA. Here are a few publications that exemplify this point:

      • Banday et al. 2025 Nat Commun. DOI: 10.1038/s41467-025-56632-0 (PMID: 39894896)
      • Shi et al 2022 JCI Insight DOI: 10.1172/jci.insight.149626 (PMID: 35041619)
      • Deng et al. 2019 eLife, DOI: 10.7554/eLife.50208 (PMID: 31482846)
      • Zhu et al. 2019 Nat Commun, DOI: 10.1038/s41467-019-10739-3 (PMID: 31253779)
      • Caparros-Martin et al 2013 Hum Mol Genet, DOI: 10.1093/hmg/dds409 (PMID: 23026747)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      As for GLI3 immunoblot, Hedgehog pathway activation is well known to inhibit GLI3 proteolytic processing from its full length form (GLI3-FL) to its transcriptional repressor (GLI3-R), and such processing is also commonly used to monitor Hedgehog signal transduction, of which the following are but a few examples:

      • Pedraza et al 2025 eLife, DOI: 10.7554/eLife.100328 (PMID: 40956303)
      • Somatilaka et al 2020 Dev Cell, DOI: 10.1016/j.devcel.2020.06.034 (PMID: 32702291)
      • Infante et al 2018, Nat Commun, DOI: 10.1038/s41467-018-03339-0 (PMID: 29515120)
      • Wang et al 2017 Dev Biol DOI: 10.1016/j.ydbio.2017.08.003 (PMID: 28800946)
      • Singh et al 2015 J Biol Chem DOI: 10.1074/jbc.M115.665810 (PMID: 26451044)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      In summary, we think that we have used two well established markers to look at Hedgehog signaling (three, if we include the immunofluorescence analysis of SMO, which we could not detect in meiotic cilia).

      These Hh pathway analyses did not provide any convincing evidence that the prepubertal cilia we describe here are actively involved in this pathway, even though Hh signaling is cilia-dependent and is known to be active in the male germline (Sahin et al 2014 Andrology PMID: 24574096; Mäkelä et al 2011 Reproduction PMID: 21893610; Bitgood et al 1996 Curr Biol. PMID: 8805249).

      That said, we fully agree that our current analyses do not allow us to draw definitive conclusions regarding Hedgehog pathway activity in meiotic cilia, and we now state this explicitly in the revised Discussion.

      Also in the Hedgehog pathway experiment, it is confusing that the authors report no detection of SMO yet detect little to no expression of GLIR in their western blot. Undetectable SMO indicates Hedgehog signaling is inactive, which results in high levels of GLIR. The impact of this is that it is not clear what is going on with Hh signaling in this system.

      Response:

      It is true that, when Hh signaling is inactive (and hence SMO not ciliary), the GLI3FL/GLI3R ratio tends to be low.

      Although our data in prepubertal mouse testes show a strong reduction in total GLI3 protein levels (GLI3FL+GLI3R) as these mice grow older, this downregulation of total GLI3 occurs without any major changes in the GLI3FL/GLI3R ratio, which is only modestly affected (suppl. Figure 6).

      Hence, since it is the ratio that correlates with Hh signaling rather than total levels, we do not think that the GLI3R reduction we see is incompatible with our non-detection of SMO in cilia: it seems more likely that overall GLI3 expression is being downregulated in developing testes via a Hh-independent mechanism.

      Also potentially relevant here is the fact that some cell types depend more on GLI2 than on GLI3 for Hh signaling. For instance, in mouse embryos, Hh-mediated neural tube patterning relies more heavily on GLI2 processing into a transcriptional activator than on the inhibition of GLI3 processing into a repressor. In contrast, the opposite is true during Hh-mediated limb bud patterning (Nieuwenhuis and Hui 2005 Clin Genet. PMID: 15691355). We have not looked at GLI2, but it is conceivable that it could play a bigger role than GLI3 in our model.

      Moreover, several forms of GLI-independent non-canonical Hh signaling have been described, and they could potentially play a role in our model, too (Robbins et al 2012 Sci Signal. PMID: 23074268).

      We have revised the discussion to clarify some of these points.

      All in all, we agree that our findings regarding Hh signaling are not conclusive, but we still think they add important pieces to the puzzle that will help guide future studies.

      There are multiple instances where it is not clear whether the authors performed statistical analysis on their data, specifically when comparing the percent composition of a population. The authors need to include appropriate statistical tests to make claims regarding this data. While the authors state some impressive sample sizes, once evaluated in individual categories (eg specific cell type and age) the sample sizes of evaluated cilia are as low as 15, which is likely underpowered. The authors need to state the n for each analysis in the figures or legends.

      We thank the reviewer for highlighting this important issue. We have now included the sample size (n) for every analysis directly in the figure legends. Although this adds length, it improves transparency and reproducibility.

      Regarding Referee #3's concern about the differing sample sizes, the number of spermatocytes quantified at each stage reflects how those stages are distributed in meiosis. For example, pachytene lasts for 10 days, so this stage is widely represented in the preparations, whereas metaphase I cells are much more difficult to quantify because they are scarcer: the stage itself lasts less than 24 hours. Taking this into account, we ensured that all analyses remain statistically valid and representative, applying the appropriate statistical tests for each dataset. These details are now clearly indicated in the revised figures and legends.

      Minor concerns:

      1. The phrase "lactating male" is used throughout the paper and is not correct. We assume this term to mean male pups that have yet to be weaned from their lactating mother, but "lactating male" suggests a rare disorder requiring medical intervention. Perhaps "pre-weaning males" is what the authors meant.

      Response:

      We thank the reviewer for noticing this terminology error. The expression has been corrected to "pre-weaning males" throughout the manuscript.

      The convention used to label the figures in this paper is confusing and difficult to read as there are multiple panels with the same letter in the same figure (albeit distinct sections). Labeling panels in the standard A-Z format is preferred. "Panel Z" is easier to identify than "panel III-E".

      Response:

      We thank the reviewer for this suggestion. All figures have been relabelled using the standard A-Z panel format, ensuring consistency and easier readability across the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In "The dynamics of ciliogenesis in prepubertal mouse meiosis reveals new clues about testicular development" Pérez-Moreno, et al. explore primary cilia in prepubertal mouse spermatocytes. Using a combination of microscopy, proteomics, and pharmacological perturbations, the authors carefully characterize prepubertal spermatocyte cilia, providing foundational work regarding meiotic cilia in the developing mammalian testis.

      Major concerns:

      1. The authors provide evidence consistent with cilia not being present in a larger percentage of spermatocytes or in other cells in the testis. The combination of electron microscopy and acetylated tubulin antibody staining establishes the presence of cilia; however, proving a negative is challenging. While acetylated tubulin is certainly a common marker of cilia, it is not present in some cilia, such as those in neurons. The authors should use at least one additional cilia marker to better support their claim of cilia being absent.

      2. The conclusion that IFT88 localizes to centrosomes is premature as key controls for the IFT88 antibody staining are lacking. Centrosomes are notoriously "sticky", often showing non-specific antibody staining. The authors must include controls to demonstrate the specificity of the staining they observe, such as staining in a genetic mutant or an antigen competition assay.

      3. There are many inconsistent statements throughout the paper regarding the timing of the first wave of spermatogenesis. For example, the authors state that round spermatids can be detected at 21dpp on line 161, but on line 180, say round spermatids can be detected at 19dpp. Not only does this lead to confusion, but such discrepancies undermine the validity of the rest of the paper. A summary graphic displaying key events and their timing in the first wave of spermatogenesis would be instrumental for reader comprehension and could be used by the authors to ensure consistent claims throughout the paper.

      4. In the proteomics experiments, it is unclear why the authors assume that changes in protein expression are predominantly due to changes within the germ cells in the developing testis. The analysis is on whole testes including both the somatic and germ cells, which makes it possible that protein expression changes in somatic cells drive the results. The authors need to justify why and how the conclusions drawn from this analysis warrant such an assumption.

      5. The authors should provide details on how proteins were categorized as being involved in ciliogenesis or flagellogenesis, specifically in the distinction criteria. It is not clear how the categorizations were determined or whether they are valid. Thus, no one can repeat this analysis or perform this analysis on other datasets they might want to compare.

      6. In the pharmacological studies, the authors conclude that the phenotypes they observe (DNA damage and reduced pachytene spermatocytes) are due to loss of or persistence of cilia. This overinterprets the experiment. Chloral hydrate and MLN8237 certainly impact ciliation as claimed, but have additional cellular effects. Thus, it is possible that the observed phenotypes were not a direct result of cilia manipulation. Either additional controls must address this or the conclusions need to be more specific and toned down.

      7. Assuming the conclusions of the pharmacological studies hold true with the proper controls, the authors still conflate their findings with meiotic defects. Meiosis is not directly assayed, which makes this conclusion an overstatement of the data. The conclusions need to be rephrased to accurately reflect the data.

      8. It is not clear why the authors chose not to use widely accepted assays of Hedgehog signaling. Traditionally, pathway activation is measured by transcriptional output, not GLI protein expression because transcription factor expression does not necessarily reflect transcription levels of target genes.

      9. Also in the Hedgehog pathway experiment, it is confusing that the authors report no detection of SMO yet detect little to no expression of GLIR in their western blot. Undetectable SMO indicates Hedgehog signaling is inactive, which results in high levels of GLIR. The impact of this is that it is not clear what is going on with Hh signaling in this system.

      10. There are multiple instances where it is not clear whether the authors performed statistical analysis on their data, specifically when comparing the percent composition of a population. The authors need to include appropriate statistical tests to make claims regarding this data. While the authors state some impressive sample sizes, once evaluated in individual categories (eg specific cell type and age) the sample sizes of evaluated cilia are as low as 15, which is likely underpowered. The authors need to state the n for each analysis in the figures or legends.

      Minor concerns:

      1. The phrase "lactating male" is used throughout the paper and is not correct. We assume this term to mean male pups that have yet to be weaned from their lactating mother, but "lactating male" suggests a rare disorder requiring medical intervention. Perhaps "pre-weaning males" is what the authors meant.

      2. The convention used to label the figures in this paper is confusing and difficult to read as there are multiple panels with the same letter in the same figure (albeit distinct sections). Labeling panels in the standard A-Z format is preferred. "Panel Z" is easier to identify than "panel III-E".

      Significance

      Overall, this is a well-done body of work that deserves recognition for the novel and implicative discoveries it presents. Assuming the conclusions hold true following appropriate statistical analysis and rephrasing, this paper would report the first documented evidence of meiotic cilia in the developing mammalian testis with sufficient rigor to become the foundational work on this topic.

      This paper will be of interest to communities focused on germ cell development, cilia, and Hedgehog signaling. It may prompt a new perspective on Desert Hedgehog signaling as it pertains to spermatogenesis. Further, this work will be of interest to those studying male fertility, as it highlights the potential role of cilia in spermatogenesis.

      Further, the proteomic analysis presented has the potential to invoke hypotheses and experimentation investigating the role of several proteins with previously uncharacterized roles in ciliogenesis, flagellogenesis, and/or spermatogenesis. The finding that the onset of ciliogenesis and flagellogenesis appear to be temporally linked has the potential to prompt research regarding shared molecular mechanisms dictating axonemal formation. We believe this paper has the potential to have an impact in its respective field, underscored by the exquisite microscopy and detailed characterization of meiotic cilia.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      This study focuses on the dynamic changes of ciliogenesis during meiosis in prepubertal mice. It was found that primary cilia are not an intrinsic feature of the first wave of meiosis (initiating at 8 dpp); instead, they begin to polymerize at 20 dpp (after the completion of the first wave of meiosis) and are present in all stages of prophase I. Moreover, prepubertal cilia (with an average length of 21.96 μm) are significantly longer than adult cilia (10 μm). The emergence of cilia coincides temporally with flagellogenesis, suggesting a regulatory association in the formation of axonemes between the two. Functional experiments showed that disruption of cilia by chloral hydrate (CH) delays DNA repair, while the AURKA inhibitor (MLN8237) delays cilia disassembly, and centrosome migration and cilia depolymerization are mutually exclusive events. These findings represent the first detailed description of the spatiotemporal regulation and potential roles of cilia during early testicular maturation in mice. The discovery of this phenomenon is interesting; however, there are certain limitations in functional research.

      Major points:

      1. No specific genetic ablation was used to block the formation of the prepubertal spermatocyte cilia discovered by the authors, making it impossible to evaluate whether such cilia truly have functions, particularly because this type of cilium does not seem to be essential in either the first wave of spermatogenesis or adult spermatogenesis. In addition, the authors also imply that the formation of such cilia appears to be synchronized with the formation of sperm flagella. This suggests that the production of such cilia may merely be transient protein expression noise rather than a functionally meaningful cellular structure.

      2. The high expression of axoneme assembly regulators such as TRiC complex and IFT proteins identified by proteomic analysis is not particularly significant. This time point is precisely the critical period for spermatids to assemble flagella, and TRiC, as a newly discovered component of flagellar axonemes, is reasonably highly expressed at this time. No intrinsic connection with the argument of this paper is observed. In fact, this testicular proteomics has little significance.

      Significance

      Strengths: The discovery of a very interesting time window for ciliary growth in spermatocytes.

      Weaknesses: Insufficient analysis of the function of such cilia.

      Readers: Developmental biologists, reproductive biologists

      My expertise: Spermatogenesis, genetics

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript by Perez-Moreno et al., titled "The dynamics of ciliogenesis in prepubertal mouse meiosis reveal new clues about testicular maturation during puberty", the authors characterize the development of primary cilia during meiosis in juvenile male mice. The authors catalog a variety of testicular changes that occur as juvenile mice age, such as changes in testis weight and germ cell-type composition. They next show that meiotic prophase cells initially lack cilia, and ciliated meiotic prophase cells are detected after 20 days postpartum, coinciding with the time when post-meiotic spermatids within the developing testes acquire flagella. They describe that germ cells in juvenile mice harbor cilia at all substages of meiotic prophase, in contrast to adults where only zygotene stage meiotic cells harbor cilia. The authors also document that cilia in juvenile mice are longer than those in adults. They characterize cilia composition and structure by immunofluorescence and EM, highlighting that cilia polymerization may initially begin inside the cell, followed by extension beyond the cell membrane. Additionally, they demonstrate ciliated cells can be detected in adult human testes. The authors next perform proteomic analyses of whole testes from juvenile mice at multiple ages, which may not provide direct information about the extremely small numbers of ciliated meiotic cells in the testis, and is lacking follow up experiments, but does serve as a valuable resource for the community. Finally, the authors use a seminiferous tubule culturing system to show that chemical inhibition of Aurora kinase A likely inhibits cilia depolymerization upon meiotic prophase I exit and leads to an accumulation of metaphase-like cells harboring cilia. They also assess meiotic recombination progression using their culturing system, but this is less convincing.

      Few suggestions/comments are listed below:

      Major comments

      1. There are a few issues with the experimental set up for assessing the effects of cilia depolymerization on DNA repair (Figure 7-II). First, how were mid pachytene cells identified and differentiated from early pachytene cells (which would have higher levels of gH2AX) in this experiment? I suggest either using H1t staining (to differentiate early/mid vs late pachytene) or the extent of sex chromosome synapsis. This would ensure that the authors are comparing similarly staged cells in control and treated samples. Second, what were the gH2AX levels at the starting point of this experiment? A more convincing set up would be if the authors measure gH2AX immediately after culturing in early and late cells (early would have higher gH2AX, late would have lower gH2AX), and then again after 24hrs in late cells (upon repair disruption the sampled late cells would have high gH2AX). This would allow them to compare the decline in gH2AX (i.e., repair progression) in control vs treated samples. Also, it would be informative to know the starting gH2AX levels in ciliated vs non-ciliated cells as they may vary.

      2. The authors analyze meiotic progression in cells cultured with/without AURKA inhibition in Figure 8-III and conclude that the distribution of prophase I cells does not change upon treatment. Is Figure 8-III A and B the same data? The legend text is incorrect, so it's hard to follow. Figure 8-III A shows a depletion of EdU-labelled pachytene cells upon treatment. Moreover, the conclusion that a higher proportion of ciliated zygotene cells upon treatment (Figure 8-II C) suggests that AURKA inhibition delays cilia depolymerization (page 13 line 444) does not make sense to me.

      3. How do the authors know that there is a monopolar spindle in Figure 8-IV treated samples? Perhaps the authors can use a different Tubulin antibody (that does not detect only acetylated Tubulin) to show that there is a monopolar spindle.

      4. The authors state in the abstract that they provide evidence suggesting that centrosome migration and cilia depolymerization are mutually exclusive events during meiosis. This is not convincing with the data present in the current manuscript. I suggest amending this statement in the abstract.

      Minor comments

      1. The presence of cilia in all stages of meiotic prophase I in juvenile mice is intriguing. Why is the cellular distribution and length of cilia different in prepubertal mice compared to adults (where shorter cilia are present only in zygotene cells)? What is the relevance of these developmental differences? Do cilia serve prophase I functions in juvenile mice (in leptotene, pachytene etc.) that are perhaps absent in adults?

      Related to the above point, what is the relevance of the absence of cilia during the first meiotic wave? If cilia serve a critical function during prophase I (for instance, facilitating DSB repair), does the lack of cilia during the first wave imply differing cilia (and repair) requirements during the first vs latter spermatogenesis waves?

      In my opinion, these would be interesting points to discuss in the discussion section.

      2. The authors state on page 9 lines 286-288 that the presence of cytoplasmic continuity via intercellular bridges (between developmentally synchronous spermatocytes) hints towards a mechanism that links cilia and flagella formation. Please clarify this statement. While the correlation between the timing of appearance of cilia and flagella in cells that are located within the same segment of the seminiferous tubule may be hinting towards some shared regulation, how would cytoplasmic continuity participate in this regulation? Especially since the cytoplasmic continuity is not between the developmentally distinct cells acquiring the cilia and flagella?

      3. Individual germ cells in H&E-stained testis sections in Figure 1-II are difficult to see. I suggest adding zoomed-in images where spermatocytes/round spermatids/elongated spermatids are clearly distinguishable.

      4. In Figure 2-II B, the authors document that most ciliated spermatocytes in juvenile mice are pachytene. Is this because most meiotic cells are pachytene? Please clarify. If the data are available (perhaps could be adapted from Figure 1-III), it would be informative to see a graph representing what proportion of each meiotic prophase substage has cilia.

      5. I suggest annotating the EM images in Sup Figure 2 and 3 to make them easier to interpret.

      6. The authors claim that the ratio between GLI3-FL and GLI3-R is stable across their analyzed developmental window in whole testis immunoblots shown in Sup Figure 5. Quantifying the bands and normalizing to the loading control would help strengthen this claim, as it is hard to interpret the immunoblot in its current form.

      7. There are a few typos throughout the manuscript. Some examples: page 5 line 172, Figure 3-I legend text, Sup Figure 5-II callouts, Figure 8-III legend, page 15 line 508, page 17 line 580, page 18 line 611.

      Significance

      This work provides new information about an important but poorly understood cellular structure present in meiotic cells, the primary cilium. More generally, this work expands on our understanding of testis development in juvenile mice. The microscopy images presented here are beautiful. The work is mostly descriptive but lays the groundwork for future investigations. I believe that this study would be of interest to the germ cell, meiosis, and spermatogenesis communities, and with a few modifications, is suitable for publication.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We would like to thank the three reviewers for their careful reading of our manuscript and suggested modifications. We have incorporated their suggestions as described below; these changes have significantly improved the structure and focus of the manuscript.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Summary

      The possibility of observing 3D cellular organisation in tissues at nanometre resolution is a hope for many cell biologists. Here, the authors have combined two volume electron microscopy approaches with scanning electron microscopy: Focused Ion Beam (FIB-SEM) and Array Tomography (AT-SEM) to study the evolution of the shape and organisation of cytoplasmic bridges, the 'ring canals' (RCs) in the Drosophila ovarian follicle that connect nurse cells and oocyte. This type of cytoplasmic link, found in insects and humans, is essential for oocyte development.

      RCs have mainly been studied using light microscopy with various markers that constitute them, but this approach does not fully capture an overall view of their organization. Due to their three-dimensional arrangement within the ovarian follicle, characterizing their organization using transmission electron microscopy (TEM) has been very limited until now. This v-EM study allows the authors to document the evolution of RC size and thickness during the development of germline cysts, from the germarium to stage 4, and potentially beyond. This study confirmed previous findings, namely that RC size correlates with lineage: the largest RC is formed after the first division, while the smallest is formed during the last division.

      Furthermore, this work allowed a better characterisation of the membrane interdigitation surrounding the RCs. In addition, the authors highlight the important potential of v-EM for further structural analysis of the fusome, migrating border cells and the stem cell niche.

      Major comments

      The output of this work can be divided into two parts. First, this work presents a technical challenge, involving image acquisition by volume electron microscopy and manual 3D reconstruction of the contours of the membranes, nuclei, RCs, and fusome in different cysts at different stages.

      Secondly, this work is based on a structural study of the RCs and their associated membranes. This work is descriptive but important, although the results largely confirm previous findings, both for the structure of the RCs and their relationship to the division sequence of the cyst cells, and for the organisation of the membranes around the RCs.

      Very interestingly, the authors report the spatial characterisation of membrane structures associated with and close to RCs that have already been identified (Loyer et al.). However, their characterisation is somewhat incomplete, as it lacks quantified data: how many RCs were analysed? Above all, it lacks the characteristics of these membranes, namely their length and orientation according to their position and their connection in the lineage. These data could be obtained from the vEM data already collected and would be an important addition to the RC structural analysis in this work.

      *Following the suggestions of this reviewer, we have reduced the emphasis on the technical approach to better highlight the ring canal data. We have summarized the ring canal measurements in graphs presented in Fig. 4B, C and included the sample sizes for these measurements in the figure legend.*

      *To gain further insight into the membrane interdigitations, we have developed a detailed model of the oocyte and four ring canals that connect to the posterior nurse cells of the stage 4 egg chamber (Fig. 5). From this model, we see that the interdigitations are longer and more abundant than in the germarium (Fig. S5), but not as extensive as in the stage 8 egg chamber (Fig. 6). The interdigitations were not all oriented in the same direction, and we did not observe an obvious correlation between interdigitation number, orientation, and lineage. We plan to continue to explore these structures in future studies.*

      In line with this, the authors importantly report the presence of an ER-like membrane structure lining the RCs. First, it would be nice to have statistics to support the observation of how many RCs..? Secondly, does this ER membrane structure vary according to the position of the RC in the cyst, are they related to the RC lineage?

      *We appreciate the reviewer's interest in this novel ER-like structure lining the ring canals. We have generated a detailed model of these structures within the stage 4 egg chamber (Fig. 5D,E). However, because we do not have data from a large number of egg chambers, we believe that performing statistics would not be appropriate.*

      The addition of graphs showing the quantitative data with statistics in the figures would improve understanding of the results. This is particularly the case for the characterisation of RCs according to the stage of cyst development, as shown in Figure 3. This also applies to the characterisation of RCs within a cyst and the relationship between RC size and lineage, as shown in Figure 4, and to the characterisation (thickness) of the inner part of the RC.

      *We have included graphs of ring canal diameter based on stage (Fig. 4B) or lineage (Fig. 4C); however, because we only have data from a few germline cysts, we have not performed any statistical analysis.*

      The part on the structural analysis of the fusome is interesting but still secondary to the characterisation of the RCs. This part should be moved to the results and figures after the various parts concerning the RCs.

      *We have deemphasized the fusome structural analysis in the results section; however, we chose to leave these images in the figures, since there could be a connection between the novel ER-like structures and the fusome.*

      Minor comments

      The distribution of the fusome in Figure 2 is difficult to see with Hts labelling and does not really correspond to the schematic, especially in regions 2a and 2b.

      *We have modified the images and the schematic.*

      In panel C of Figure 2, it is a little distracting that the legend is placed directly on the image of the RC. It hides some information in the images and could be placed at the bottom of the panel. This is also the case for panel G.

      We understand the possible confusion and have changed the layout in the figure.

      In Figure 3B, it would be good to highlight the position of the cyst.

      We have pseudocolored the portion that corresponds to the relevant cyst in the same color used for the reconstruction (which is now Fig. 3A).

      Reviewer #1 (Significance (Required)): As mentioned above, this work can be divided into two parts. The part corresponding to the acquisition of images by volume electron microscopy and manual 3D reconstruction is new and a great source of valuable information. The part related to the spatial characterisation of the RC is important, but corresponds more to an extension and reinforcement of previously available information than to the contribution of significant new insights. I think it will be of great interest to an audience interested in Drosophila oogenesis.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This study presents a high-resolution volumetric analysis of germline ring canals (RCs) during Drosophila oogenesis. By combining two complementary electron microscopy techniques-Focused Ion Beam Scanning Electron Microscopy (FIB-SEM) and Array Tomography Scanning Electron Microscopy (AT-SEM)-the authors compare RC structural features at different developmental stages, ranging from the relatively small germarium to the significantly larger, later-stage egg chambers.

      At early stages of oogenesis, FIB-SEM analysis confirms that the average RC size increases progressively with cyst development, in agreement with previous studies. The authors further show that lineage reliably predicts RC size (an observation previously reported, but here identified at an earlier stage in region 2a) and, importantly, that the thickness of the actin rim can also be predicted by lineage (reported here for the first time, at stage 1). FIB-SEM analysis also enables a clear delineation of the fusome, allowing for detailed characterization of its assembly and disassembly. Notably, the authors report, for the first time, structural evidence of ER-like membranes capping the inner rim of actin RCs.

      At later developmental stages, AT-SEM analysis reveals that the microvilli observed by FIB-SEM evolve into extensive interdigitations extending beyond the outer rim in mid-stage egg chambers, a structural feature detected earlier than previously reported. Moreover, by analyzing a sample in which tissue organization was disrupted during preparation, the authors demonstrate that these interdigitations preferentially occur in proximity to the RC. In addition to RC analysis at later stages, the authors use AT-SEM to readily identify small cell populations, such as the germline stem cell niche and border cells, and provide high-resolution volumetric EM data for these structures.

      MAJOR COMMENT

      My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy.

      *Although TEM data has been used to perform foundational studies in the field, there are limitations to this approach. Due to the size of the ring canals, it is challenging to locate them within the large volume of the egg chamber (especially at later stages). Even if ring canals can be located, they are typically not oriented the same way, so a single section is not sufficient. Although some of the results shown by our complementary vEM approaches do confirm results that have been previously reported by TEM or fluorescence microscopy, our approach provides important additional insight into structures that have been studied for many decades that would not be possible using other approaches. Further, this approach has identified a novel membrane structure lining the ring canals, and it has provided structural details of the membrane interdigitations that would not be possible with conventional electron microscopy. Finally, this complementary set of vEM approaches would be applicable to the study of many other structures within other tissue types.*


      One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).

      *We appreciate the reviewer's interest in the fusome, and we agree that our approach has provided significant insight into its three-dimensional structure. The rendering of the fusome was performed using a large number of small isosurface volumes, and it is therefore difficult to accurately determine the fusome volume, since additional (non-fusome) material could be included in the model. Further, the fusomes that were rendered were within the germline clusters from region 2b, where the fusome has already started to break down, so these would not provide an accurate quantification of the full fusome volume. Because the focus of the manuscript is on the germline ring canals and associated structures such as the interdigitations (which we have tried to further streamline in this revised version), we believe that additional analysis of the fusome is outside of the scope of this work. *
      

      MINOR COMMENT • The fluorescent markers used in the fly stocks are neither described in the Materials and Methods section nor depicted in the figures.

      *We apologize if this was not clear in the original manuscript. Based on the comment from Reviewer #3 (see below), we have repeated the Hts staining using flies that do not have CheerioYFP in the background. We have also clarified the materials and methods section to indicate the panels that correspond with each strain used. *

      • The authors should quote (Nashchekin et al., Science, 2021) when mentioning unequal partitioning of the fusome (p4) and oocyte determination (p12). *We have added the reference to these parts of the manuscript. *

      • P11-12, when mentioning electron dense regions reflecting strong cell-cell adhesion, the authors could refer to (Fichelson et al. Development, 2010), where AJ have been described around ring canals. *We have added the reference to this part of the manuscript. *

      • Figure 2A: The schematic diagram (4th line) is not explained in the figure legend. *We have updated the figure legend to describe this schematic. *

      • Figure 2D: Please clarify whether the RC stage shown corresponds to stage 1 or stage 10, as indicated in panel 2E. Alternatively, are these examples representing the minimum and maximum RC sizes observed across the entire dataset? *These were not meant to be examples of the minimum and maximum ring canal sizes observed across the dataset. Instead, they were used to demonstrate the significant expansion that occurs during oogenesis. In the updated version of this figure, this panel has been removed. *

      • Figure 5D: Please specify which panel in 5B this corresponds to. • Figure 5E: Please specify which panels in 5B this corresponds to. The two green boxes are not defined. Why is there a grey background under the ovariole assembly? • Figures 5G, 5H: Does panel 5G correspond to the left green box in 5E, and 5H to the right green box in 5E? Please clarify. *We have modified Figure 5 and merged it with Figure 6. In this updated format, panels 5B and 5E have been removed. *

      • Figure 6: The figure title is not on the same page as the figure itself.

      *We have made this change. *

      • Figure 6A: The black box marking the germarium is not defined. *In this revised version, we have modified Fig. 6, and this panel has been removed. *

      • Figure 6B-E: The arrows point to long interdigitations. However, arrowheads (which are not mentioned in the legend) appear to indicate the RC outer rim. Please specify this clearly in the figure legend. In the updated version of Fig. 6, these arrowheads have been removed.

      Reviewer #2 (Significance (Required)):

      I am not an expert in electron microscopy, so I cannot comment in detail on these techniques, but they appear to bridge the gap between conventional EM and optical microscopy in terms of resolution, user-friendliness, and other aspects. This is technically interesting, although these EM approaches have been previously described and applied. The images and movies are beautiful and clearly presented. My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy.

      One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Kolotuev et al. used two volume-based electron microscopy approaches to identify, segment, and document the changes in intercellular bridges, or ring canals, in early egg chambers of the fruit fly, Drosophila melanogaster. Using array tomography and focused ion beam scanning electron microscopy, Kolotuev et al. provide a high-resolution and content-rich lineage analysis of ring canal size, shape and orientation among early and late egg chambers. Their analysis included parameters such as the presence and shape of the fusome, the recruitment of actin to the inner ring, and development of membrane fingers that presumably spatially stabilize such structures. Last, Kolotuev and co-authors highlight additional aspects of their dataset including a reconstruction of the border cell cluster in stage 9 egg chambers. The data presented are a treasure trove of the ultrastructural features of the developing dipteran germline and subsequent ovarian follicle development. The data presented represent the highest resolution 3D dataset available and thus are a valuable contribution to the field. My overall impression is that this paper sits intellectually between a valuable method and a loose experimental manuscript. This critique is not requesting additional experimental evidence because the data are unique and are the foundation for a new experimental paradigm. But there is not sufficient detail presented to be a full method, nor any hypothesis testing to be considered experimental. I suggest the authors consider amplifying their methods in detail and then note that these methods provide a foundation for additional future investigations (as mentioned in the discussion). Problems with data interpretation and presentation should be addressed before publication. Below are the major and minor concerns that I believe need to be considered.

      Major comments: In general, images in figures are thought-provoking; however, changes to figure layout and design should be considered to better highlight the results. For instance, I don't know how to follow figure 1a. The arrow leads from a whole ovary to an ovulated egg with an ovariole strand connecting the two. What is the purpose of the arrow? Is it to represent time? And why is the mature egg in the figure when no data regarding this stage is presented? The authors should consider removing the mature egg and helping the reader understand that the ovariole is a subset of the whole ovary. They might do this by putting a box around a single ovariole in the whole ovary to indicate their ovariole illustration. Several other figures have similar problems. Throughout, the authors used black and white arrows on black and white EM data and these arrows were lost. Color should be considered to effectively point out what they want the reader to see.

      We have modified the layout of Fig. 1 and added additional explanation to the introduction and figure legend to guide readers through the introduction to the system. We have also added color to some of the arrows throughout the manuscript.

      Can the authors provide additional information for the genotypes used? For instance, the Cheerio-YFP (which might affect actin). When was this used, and can the authors provide information on how this affected the data between when it was used and when it was not used? Additionally, why was analysis done in transgenic flies over fully wild-type?

      *We have repeated the Hts staining in Fig. 2A in flies that do not express Cheerio-YFP and have made the appropriate changes to the methods section. For the AT-SEM experiment, we chose to use this genetic background since it would align with that of the negative controls that we often use in RNAi or over-expression experiments. FIB-SEM datasets were collected while imaging other tissues of the fly, so the choice of that genotype was not intentional. However, these datasets provided us with the opportunity to do this proof-of-concept work without such a large financial investment in the acquisition of new image stacks. In the future, we hope to expand this work to generate additional datasets from flies of different genotypes. *

      Figure 1 seeks to lay out the ovary system and narrow the reader into the stages that will be analyzed in subsequent figures. Figure 1B is meant to show the types and kinds of electron microscopy; however, it lacks a full detailed description and legend for each of the colored arrows. And to that fact, so does figure S1. The authors need to provide additional information so the reader can glean the point the authors are trying to convey. In addition, the authors might add pros and cons to each. I know this was attempted in S1, but did not fully come across.

      We appreciate this feedback, and we have modified the layout of Figure 1 and updated Figures S1 to better highlight the technical challenge of EM in general and benefits of vEM in particular.

      Figure 1 and 2 seek to set up both the biological and technical system to be understood. The authors might consider combining the two figures and eliminate elements that don't represent a result of any kind (Figure 1B, 2B, 3D and 3F). Or more fully explain the result and point they are trying to make with these illustrations. I fully understand and appreciate what they are trying to get across, but it does not come across clearly. For example, I don't know how figure 2B effectively gets across the point that rotation of the image has an effect on how it is sliced and segmented in EM data. Not sure it is necessary. Furthermore, what is the bottom panel with a green ring canal supposed to allow us to interpret or conclude? The same for 3D and F. The result in 3E is far more interesting and should be two panels that emphasize the growth characteristics between young and old rings or those of M1 and M4.

      *We greatly appreciate these suggestions, and we have modified and reorganized several figures to make the flow of scientific ideas easier to follow. We have moved panel 1B to the supplementary figure and gave additional indications in the text as to the differences between the EM methods. We have moved panel 2B to the supplementary material. We have moved Fig. 3D to Fig. S5A,B. Fig. 5 now provides more extensive rendering of membrane interdigitations from the stage 4 egg chamber. We have chosen to leave Fig. 3F to allow readers to compare the novel ER-like structures within the ring canals to the fusome that is present within younger germline clusters. *
      

      The HTS and actin stain in figure 2A overlap significantly and obscure the fusome staining. Can the authors confirm that there is no bleed through in their staining and imaging procedure?

      *We have repeated this staining and can confirm that there was no bleed through between the two channels. *

      The data in Figure 2C are critical to showing the z-resolution enhancement of sectioned EM. However, the use of green pseudocolor only in one panel is confusing. Can the authors duplicate the whole panel and provide one without and one with pseudocolor? This would be ideal for fully orienting the reader to the sectioning and setting them up to understand the rest of the figures.

      *In the revised version of Figure 2, we have split the sections into two rows of panels; we have added the pseudocolor to every other section (in the bottom row of panels). *


      The results section for figure 2 does not outline the results presented. For example, the germarium contains syncytia of differing stages and ring canals with intervening fusomes... It does more to talk about the pros and cons of different technical aspects and their difficulty. This should be saved for the rationale or the discussion. Rather, the section should outline the results presented.

      *We have modified the layout of figure 2 in order to describe the system in a more straightforward manner with a smoother transition from Figure 1 while further explaining technical points. *

      I appreciate the color coding of the differentially segmented cysts in Figure 3. The color coding helped orient me to which cysts were being evaluated. However, I found the lack of detail bothersome. For instance, which ring canals are in the two panels of D? Are they M1 or M4?

      *With the additional analysis of the interdigitations in the stage 4 cluster, we have moved panel D to Fig. S5. We did not have enough coverage of the region 2a cluster (red) to determine lineage, but we have added a statement to the legend to indicate that the ring canal shown in Fig. S5B is an M1 ring canal. *

      Also, ring canal size and distribution should be presented in a graph. Statistics are not necessary, but a dot-plot would go a long way to presenting the result. Two plots can add value, one in which the ring canals for each phase are shown, and the other is the distribution of sizes for each cyst.

      *We have added these graphs in Fig. 4B, C. *

      Lastly, the results section for figure 3 interprets the membrane bound vesicles in the ring canal as "ER-like". This should be removed since they neither look ER-like to me, nor have been shown to be ER in the data.

      *We appreciate this suggestion, and although we cannot be absolutely certain of the identity of these structures without further study, with our additional analysis of the stage 4 egg chamber, we are further convinced of the similar appearance of these novel structures and the ER in other regions of the nurse cell (Fig. 5). We have clarified this point in the text. *

      Figure 4A is not called out specifically in the results and thus should be interpreted or removed from the figure.

      In this revised version, we have removed panel 4A.

      Figure 5 was confusing. I understand the authors wanted to show the wafer and the ribbons, however, this is not a result and does not offer any interpretation of a result and is thus confusing on why it is in the figure. If this were a method paper, I would understand its presence.

      *We have removed this panel from the figure. *

      Can the authors comment on the shape of the nuclei in older egg chambers? They are not round at all. I am interested in whether this is a fixation artifact or the real ultrastructure of the nuclei. Of the border cell nuclei for instance. If it is an artifact, this should be added to the discussion.

      *Some of the nuclei appear to have a peculiar shape in the cross-section. We cannot entirely exclude the role of the fixation in the shape irregularities. However, since not all the nuclei are subject to this phenomenon, we are inclined to attribute it to the intrinsic qualities of the late-stage nuclei. In numerous cases, different tissue and cell stages determine the shape of the nucleus, which frequently deviates from a spherical shape. *

      Although data from "imperfect" samples is interesting, consider relegating Figure 6 to the supplement section, as it takes away from the pre-existing narrative flow established in the paper.

      *In this draft, we have combined parts of figures 5 and 6, and much of the data from the imperfect sample has been removed. *

      Interpretation of the data throughout the results should be left to the discussion section. For instance, interpretation of Figure 4 results on page 14 beginning with "these data demonstrate the importance...". The importance is not related to the result, but rather discussion of past and future studies.

      We have removed this sentence from the results.

      In another example, Figure 5I is introduced and discussed in the results section on page 15, second whole paragraph with an overall introduction/discussion on junctions, which convolutes the actual result. Discussion of future studies or how structures like the novel membrane fingers should be viewed in a larger biological context, should not be in the results.

      We have made this change.

      Minor comments: Remove words such as "pseudo-timelapse", they invoke precision on a point that is imprecise.

      *This has been removed. *

      Re-consider the acronyms for ring canal and egg chamber.

      *We have removed these acronyms. *

      Consider finding another way to call out each supplemental movie other than with another acronym.

      *We have added small icons to indicate that a supplemental movie is associated with a given figure or panel. *

      Reviewer #3 (Significance (Required)): The present manuscript is a technical advance in the field. The use of serial EM imaging with two separate modalities, on what is considered to be a challenging problem in the field, represents a useful technical advance. Light microscopy has thus far limited the resolution to which we can understand the spatial organization and the cellular features therein that regulate germline development. This manuscript brings to bear two serial EM methods to begin approaching this problem. The audience for this work is those working at the forefront of understanding germline architecture and development. I make these statements as an expert in live and super-resolution imaging of fruit fly egg chamber development, in addition to having performed 3D SEM in past works.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Kolotuev et al. used two volume-based electron microscopy approaches to identify, segment, and document the changes in intercellular bridges, or ring canals, in early egg chambers of the fruit fly, Drosophila melanogaster. Using array tomography and focused ion beam scanning electron microscopy, Kolotuev et al. provide a high-resolution and content-rich lineage analysis of ring canal size, shape and orientation among early and late egg chambers. Their analysis included parameters such as the presence and shape of the fusome, the recruitment of actin to the inner ring, and development of membrane fingers that presumably spatially stabilize such structures. Last, Kolotuev and co-authors highlight additional aspects of their dataset including a reconstruction of the border cell cluster in stage 9 egg chambers. The data presented are a treasure trove of the ultrastructural features of the developing dipteran germline and subsequent ovarian follicle development. The data presented represent the highest resolution 3D dataset available and thus are a valuable contribution to the field. My overall impression is that this paper sits intellectually between a valuable method and a loose experimental manuscript. This critique is not requesting additional experimental evidence because the data are unique and are the foundation for a new experimental paradigm. But there is not sufficient detail presented to be a full method, nor any hypothesis testing to be considered experimental. I suggest the authors consider amplifying their methods in detail and then note that these methods provide a foundation for additional future investigations (as mentioned in the discussion). Problems with data interpretation and presentation should be addressed before publication. Below are the major and minor concerns that I believe need to be considered.

      Major comments:

      • In general, images in figures are thought-provoking; however, changes to figure layout and design should be considered to better highlight the results. For instance, I don't know how to follow figure 1a. The arrow leads from a whole ovary to an ovulated egg with an ovariole strand connecting the two. What is the purpose of the arrow? Is it to represent time? And why is the mature egg in the figure when no data regarding this stage is presented? The authors should consider removing the mature egg and helping the reader understand that the ovariole is a subset of the whole ovary. They might do this by putting a box around a single ovariole in the whole ovary to indicate their ovariole illustration. Several other figures have similar problems. Throughout, the authors used black and white arrows on black and white EM data and these arrows were lost. Color should be considered to effectively point out what they want the reader to see.

      • Can the authors provide additional information for the genotypes used? For instance, the Cheerio-YFP (which might affect actin). When was this used, and can the authors provide information on how this affected the data between when it was used and when it was not used? Additionally, why was analysis done in transgenic flies over fully wild-type? Figure 1 seeks to lay out the ovary system and narrow the reader into the stages that will be analyzed in subsequent figures. Figure 1B is meant to show the types and kinds of electron microscopy; however, it lacks a full detailed description and legend for each of the colored arrows. And to that fact, so does figure S1. The authors need to provide additional information so the reader can glean the point the authors are trying to convey. In addition, the authors might add pros and cons to each. I know this was attempted in S1, but did not fully come across. Figure 1 and 2 seek to set up both the biological and technical system to be understood. The authors might consider combining the two figures and eliminate elements that don't represent a result of any kind (Figure 1B, 2B, 3D and 3F). Or more fully explain the result and point they are trying to make with these illustrations. I fully understand and appreciate what they are trying to get across, but it does not come across clearly. For example, I don't know how figure 2B effectively gets across the point that rotation of the image has an effect on how it is sliced and segmented in EM data. Not sure it is necessary. Furthermore, what is the bottom panel with a green ring canal supposed to allow us to interpret or conclude? The same for 3D and F. The result in 3E is far more interesting and should be two panels that emphasize the growth characteristics between young and old rings or those of M1 and M4.

      • The HTS and actin stain in figure 2A overlap significantly and obscure the fusome staining. Can the authors confirm that there is no bleed through in their staining and imaging procedure?

      • The data in Figure 2C are critical to showing the z-resolution enhancement of sectioned EM. However, the use of green pseudocolor only in one panel is confusing. Can the authors duplicate the whole panel and provide one without and one with pseudocolor? This would be ideal for fully orienting the reader to the sectioning and setting them up to understand the rest of the figures.

      • The results section for figure 2 does not outline the results presented. For example, the germarium contains syncytia of differing stages and ring canals with intervening fusomes... It does more to talk about the pros and cons of different technical aspects and their difficulty. This should be saved for the rationale or the discussion. Rather, the section should outline the results presented.

      • I appreciate the color coding of the differentially segmented cysts in Figure 3. The color coding helped orient me to which cysts were being evaluated. However, I found the lack of detail bothersome. For instance, which ring canals are in the two panels of D? Are they M1 or M4? Also, ring canal size and distribution should be presented in a graph. Statistics are not necessary, but a dot-plot would go a long way to presenting the result. Two plots can add value, one in which the ring canals for each phase are shown, and the other is the distribution of sizes for each cyst. Lastly, the results section for figure 3 interprets the membrane bound vesicles in the ring canal as "ER-like". This should be removed since they neither look ER-like to me, nor have been shown to be ER in the data.

      • Figure 4A is not called out specifically in the results and thus should be interpreted or removed from the figure.

      • Figure 5 was confusing. I understand the authors wanted to show the wafer and the ribbons, however, this is not a result and does not offer any interpretation of a result and is thus confusing on why it is in the figure. If this were a method paper, I would understand its presence.

      • Can the authors comment on the shape of the nuclei in older egg chambers? They are not round at all. I am interested in whether this is a fixation artifact or the real ultrastructure of the nuclei. Of the border cell nuclei for instance. If it is an artifact, this should be added to the discussion.

      • Although data from "imperfect" samples is interesting, consider relegating Figure 6 to the supplement section, as it takes away from the pre-existing narrative flow established in the paper. Interpretation of the data throughout the results should be left to the discussion section. For instance, interpretation of Figure 4 results on page 14 beginning with "these data demonstrate the importance...". The importance is not related to the result, but rather discussion of past and future studies. In another example, Figure 5I is introduced and discussed in the results section on page 15, second whole paragraph with an overall introduction/discussion on junctions, which convolutes the actual result. Discussion of future studies or how structures like the novel membrane fingers should be viewed in a larger biological context, should not be in the results.

      Minor comments:

      • Remove words such as "pseudo-timelapse", they invoke precision on a point that is imprecise.

      • Re-consider the acronyms for ring canal and egg chamber.

      • Consider finding another way to call out each supplemental movie other than with another acronym.

      Significance

      The present manuscript is a technical advance in the field. The use of serial EM imaging with two separate modalities, on what is considered to be a challenging problem in the field, represents a useful technical advance. Light microscopy has thus far limited the resolution to which we can understand the spatial organization and the cellular features therein that regulate germline development. This manuscript brings to bear two serial EM methods to begin approaching this problem. The audience for this work is those working at the forefront of understanding germline architecture and development. I make these statements as an expert in live and super-resolution imaging of fruit fly egg chamber development, in addition to having performed 3D SEM in past works.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study presents a high-resolution volumetric analysis of germline ring canals (RCs) during Drosophila oogenesis. By combining two complementary electron microscopy techniques-Focused Ion Beam Scanning Electron Microscopy (FIB-SEM) and Array Tomography Scanning Electron Microscopy (AT-SEM)-the authors compare RC structural features at different developmental stages, ranging from the relatively small germarium to the significantly larger, later-stage egg chambers. At early stages of oogenesis, FIB-SEM analysis confirms that the average RC size increases progressively with cyst development, in agreement with previous studies. The authors further show that lineage reliably predicts RC size (an observation previously reported, but here identified at an earlier stage in region 2a) and, importantly, that the thickness of the actin rim can also be predicted by lineage (reported here for the first time, at stage 1). FIB-SEM analysis also enables a clear delineation of the fusome, allowing for detailed characterization of its assembly and disassembly. Notably, the authors report, for the first time, structural evidence of ER-like membranes capping the inner rim of actin RCs. At later developmental stages, AT-SEM analysis reveals that the microvilli observed by FIB-SEM evolve into extensive interdigitations extending beyond the outer rim in mid-stage egg chambers, a structural feature detected earlier than previously reported. Moreover, by analyzing a sample in which tissue organization was disrupted during preparation, the authors demonstrate that these interdigitations preferentially occur in proximity to the RC. In addition to RC analysis at later stages, the authors use AT-SEM to readily identify small cell populations, such as the germline stem cell niche and border cells, and provide high-resolution volumetric EM data for these structures.

      MAJOR COMMENT

      My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy. One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).

      MINOR COMMENTS

      • The fluorescent markers used in the fly stocks are neither described in the Materials and Methods section nor depicted in the figures.

      • The authors should quote (Nashchekin et al., Science, 2021) when mentioning unequal partitioning of the fusome (p4) and oocyte determination (p12).

      • P11-12, when mentioning electron dense regions reflecting strong cell-cell adhesion, the authors could refer to (Fichelson et al. Development, 2010), where AJ have been described around ring canals.

      • Figure 2A: The schematic diagram (4th line) is not explained in the figure legend.

      • Figure 2D: Please clarify whether the RC stage shown corresponds to stage 1 or stage 10, as indicated in panel 2E. Alternatively, are these examples representing the minimum and maximum RC sizes observed across the entire dataset?

      • Figure 5D: Please specify which panel in 5B this corresponds to.

      • Figure 5E: Please specify which panels in 5B this corresponds to. The two green boxes are not defined. Why is there a grey background under the ovariole assembly?

      • Figures 5G, 5H: Does panel 5G correspond to the left green box in 5E, and 5H to the right green box in 5E? Please clarify.

      • Figure 6: The figure title is not on the same page as the figure itself.

      • Figure 6A: The black box marking the germarium is not defined.

      • Figure 6B-E: The arrows point to long interdigitations. However, arrowheads (which are not mentioned in the legend) appear to indicate the RC outer rim. Please specify this clearly in the figure legend.

      Significance

      I am not an expert in electron microscopy, so I cannot comment in detail on these techniques, but they appear to bridge the gap between conventional EM and optical microscopy in terms of resolution, user-friendliness, and other aspects. This is technically interesting, although these EM approaches have been previously described and applied. The images and movies are beautiful and clearly presented.

      My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy. One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The possibility of observing 3D cellular organisation in tissues at nanometre resolution is a hope for many cell biologists. Here, the authors have combined two volume electron microscopy approaches with scanning electron microscopy: Focused Ion Beam (FIB-SEM) and Array Tomography (AT-SEM) to study the evolution of the shape and organisation of cytoplasmic bridges, the 'ring canals' (RCs) in the Drosophila ovarian follicle that connect nurse cells and oocyte. This type of cytoplasmic link, found in insects and humans, is essential for oocyte development. RCs have mainly been studied using light microscopy with various markers that constitute them, but this approach does not fully capture an overall view of their organization. Due to their three-dimensional arrangement within the ovarian follicle, characterizing their organization using transmission electron microscopy (TEM) has been very limited until now. This v-EM study allows the authors to document the evolution of RC size and thickness during the development of germline cysts, from the germarium to stage 4, and potentially beyond. This study confirmed previous findings, namely that RC size correlates with lineage: the largest RC is formed after the first division, while the smallest is formed during the last division. Furthermore, this work allowed a better characterisation of the membrane interdigitation surrounding the RCs. In addition, the authors highlight the important potential of v-EM for further structural analysis of the fusome, migrating border cells and the stem cell niche.

      Major comments

      • The output of this work can be divided into two parts. First, this work presents a technical challenge, involving image acquisition by volume electron microscopy and manual 3D reconstruction of the contours of the membranes, nuclei, RCs, and fusome in different cysts at different stages. Secondly, this work is based on a structural study of the RCs and their associated membranes. This work is descriptive but important, although the results largely confirm previous findings, both for the structure of the RCs and their relationship to the division sequence of the cyst cells, and for the organisation of the membranes around the RCs.

      • Very interestingly, the authors report the spatial characterisation of membrane structures associated with and close to RCs that have already been identified (Loyer et al.). However, their characterisation is somewhat incomplete, as it lacks quantified data: how many RCs were analysed and, above all, what are the characteristics of these membranes (their length and orientation according to their position and their connection in the lineage)? These data could be obtained from the v-EM data already collected and would be an important addition to the RC structural analysis in this work. In line with this, the authors importantly report the presence of an ER-like membrane structure lining the RCs. First, it would be good to have statistics to support this observation: in how many RCs was it seen? Secondly, does this ER membrane structure vary according to the position of the RC in the cyst, and is it related to the RC lineage? The addition of graphs showing the quantitative data with statistics in the figures would improve understanding of the results. This is particularly the case for the characterisation of RCs according to the stage of cyst development, as shown in Figure 3. This also applies to the characterisation of RCs within a cyst and the relationship between RC size and lineage, as shown in Figure 4, and to the characterisation (thickness) of the inner part of the RC.

      • The part on the structural analysis of the fusome is interesting but still secondary to the characterisation of the RCs. This part should be moved to the results and figures after the various parts concerning the RCs.

      Minor comments

      • The distribution of the fusome in Figure 2 is difficult to see with Hts labelling and does not really correspond to the schematic, especially in regions 2a and 2b.

      • In panel C of Figure 2, it is a little disturbing that the legend is placed directly on the image of the RC. It hides some information in the images and could be placed at the bottom of the panel. This is also the case for panel G.

      • In Figure 3B, it would be good to highlight the position of the cyst.

      Significance

      As mentioned above, this work can be divided into two parts.

      The part corresponding to the acquisition of images by volume electron microscopy and manual 3D reconstruction is new and a great source of valuable information. The part related to the spatial characterisation of the RC is important, but corresponds more to an extension and reinforcement of previously available information than to the contribution of significant new insights.

      I think it will be of great interest to an audience interested in Drosophila oogenesis.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Dufour et al. is a follow-up on the group's previous publication that introduced the photo-inducible Cre recombinase, LiCre. In the present work, the authors further characterize the properties and kinetics of their optogenetic switch. Initially, the authors show that light affects only LiCre-mediated recombination itself and not DNA binding. Following these observations, they measure and mathematically model LiCre kinetics, demonstrating high efficiency in vivo and a surprising temperature sensitivity. Finally, Dufour et al. evaluate several mutations that affect the LOV photo-cycle and provide recommendations for LiCre applications. The study thoroughly investigates various aspects of the function of LiCre, confirming some previously known characteristics (i.e. temperature-dependence of Cre activity and functionality of LOV-based optogenetic tools in yeast without co-factor supplementation), while providing new LiCre-specific insights (kinetics, light-independent DNA binding). Please note that the reviewer is no expert in mathematical modeling and cannot fully judge the methodological details of the models. While I have some concerns as listed below, I believe the study should be well-suited for publication after a revision.

      Major comments:

      1. After completing the initial experiment, the authors discovered that their plasmids carry different numbers of V5 epitopes. I am wondering whether this was due to a recombination event happening during the experiment or whether the constructs were not sequence verified prior to use? In any case, an additional ChIP experiment using Cre and LiCre constructs with the identical number of tag-repeats will be necessary. The result, i.e. the strong reduction of DNA-binding of LiCre (which is close to the negative control), is quite remarkable given that LiCre is still considerably active and high DNA affinities were observed in SPR experiments. In light of these counterindications, identical experiment conditions for test and reference group become even more important.
      2. The conclusion that DNA-binding of LiCre is completely light-independent is not entirely convincing to me. The differences between the light and dark conditions in Fig. 2d are indeed small, but the values for LiCre are almost on par with the vector control and therefore hard to interpret. Based on this experiment alone, one could even be inclined to argue that LiCre does not bind DNA at all (which is of course falsified by the later experiments), showing that the resolution of the corresponding dataset is too low to draw final conclusions. Light-independent DNA binding should either be confirmed by a more sensitive method or the conclusion statements on this matter should be revised accordingly.
      3. If I understand the explanations correctly, replicates and plotted data points refer to multiple samples (different colonies), that were handled in a single experiment, i.e. by one researcher at the same time/same day. As already mentioned by the authors in the main text, this workflow explains the considerable differences between some of the results in the present manuscript and an identical experiment in a previous publication by the same authors. Providing truly independent experiments (performed on different days) that are therefore independent towards variables such as the fluctuation in incubation temperature (which was the issue in the described experiments) will be crucial, at least for the key datasets.

      Minor comments:

      1. At the end of the Introduction, the authors mention that the interaction of the Cre heptamers was weakened via point mutations in LiCre. A short sentence about the engineering rationale behind this weakened interaction would help readers who are not familiar with the authors' prior work.
      2. Fig. 2a-b depicts images relating to the purification procedure. These could be moved to the supplements as they don't provide any insight apart from the fact that the proteins were successfully purified.
      3. The kinetic characterization was only performed for LiCre. Especially for scientists, who have worked with wildtype Cre before, a side-by-side comparison with wt Cre would be valuable to judge the loss in reaction speed that has to be expected when switching from Cre to LiCre.
      4. The difference between the ChIP results and the SPR results is striking but not mentioned in the discussion section. Also, the statement: "Finally, our results have practical implications on experimental protocols employing LiCre. First, given its high affinity for loxP (Fig. 5b), over-expressing LiCre at high levels will probably not increase its efficiency." (line 502) refers only to the affinity but seems to ignore the low DNA-occupancy of LiCre observed in Fig. 2d. Adapting the discussion section accordingly would improve the manuscript.

      Significance

      General assessment and advance:

      The present study provides a large set of experiments and analyses characterizing the optogenetic LiCre recombinase. In general, the study is well conceived and executed. Although some of my concerns listed above affect key aspects of the study, they should be straightforward to address. The manuscript is a follow-up study providing a more detailed characterization of an optogenetic tool previously developed by the same authors. Its novelty is therefore somewhat limited. While the study provides a rich body of additional data, many of the findings merely confirmed aspects that were to be expected based on the two proteins LiCre is built of (temperature-dependent activity of Cre, optogenetics in yeast w/o the need of co-factor supplementation, weaker DNA-affinity of the Cre fusion protein as compared to wildtype Cre). New insights are provided by the facts that (i) light only controls recombination but not DNA binding and (ii) light activation of only some protomers within the LiCre heptamer is likely to be sufficient to activate recombination. The former aspect is, however, not entirely evident from the results as described above.

      Audience:

      The study will be of interest for researchers focusing on inducible DNA recombination and especially relevant to those who plan to work with LiCre and can now rely on a more detailed and extended characterization compared to the original LiCre publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary: This manuscript presents a detailed kinetic and mechanistic characterization of the optogenetic recombinase LiCre, which enables site-specific DNA recombination upon blue-light stimulation. The authors combine in vitro surface plasmon resonance assays, yeast-based recombination assays, and mathematical modeling to dissect the DNA-binding properties, activation dynamics, and recombination efficiency of LiCre. They demonstrate that LiCre binds DNA even in the absence of light, albeit with reduced cooperativity compared to Cre recombinase. Through kinetic modeling, they propose that activation of only two LiCre units may suffice for recombination. The study also evaluates the impact of point mutations in the LOV domain on LiCre's photocycle. The experimental methods are described in detail. Statistical analyses are appropriate and clearly reported.

      Major Comments:

      1. In Figure 1, control experiments with no loxP sequences (i.e. original strain) should be performed to demonstrate specific binding of Cre/LiCre to loxP sequence.
      2. In Figure 2, the SPR experiments are robust and informative. However, the lack of measurement of DNA binding of light-activated LiCre is a notable gap; such a measurement would help to understand whether the cooperativity of LiCre can be modulated by light. If this is difficult due to experimental conditions, there is a lit-mimetic mutant of LOV2 (https://www.nature.com/articles/nmeth.3926).

      Significance

      General Assessment: This is a rigorous study that combines experimental and computational approaches to advance our understanding of LiCre-based optogenetic genome engineering. The strongest aspects are the integration of SPR data with kinetic modeling and the practical insights into LiCre's performance under various conditions. However, one limitation is the lack of direct validation of some model predictions.

      Advance: To the best of my knowledge, this is the first study to quantitatively model the activation dynamics of LiCre. The work extends previous findings on LiCre and provides new mechanistic and practical insights.

      Audience: This study will be of interest to specialized audiences, particularly those developing or applying the LiCre system.

      Reviewers' Field of Expertise: Protein engineering, Genome editing, Optogenetics, Cell Biology.

      Limitations of Expertise: I do not have deep expertise in mathematical modeling.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Dufour et al describe characterization of the light-activated recombinase LiCre. This work combines the yeast reporter assay, surface plasmon resonance (SPR) and kinetic modeling to provide a comprehensive study of how LiCre functions both in vivo in yeast and in vitro. The authors show that LiCre binds to loxP sites in the dark with high affinity, but reduced cooperativity compared to wild-type Cre, and that recombination efficiency is affected by temperature and illumination regime. Importantly, the authors establish a kinetic model that not only explains these observations but also predicts the altered behavior of a mutant (T418S), which was experimentally validated. It would be valuable to highlight what other predictions the model could make, even if for future work. Overall, this work combines quantitative experiments and modeling to provide new insights into the biochemical and kinetic properties of LiCre.

      Specific comments:

      Line 110-115: Although described in the Methods section, a brief statement of dark and light treatment conditions would help readers better follow the experiments. Likewise, listing the three unrelated positions would improve the clarity.

      Line 185: Is there a typo?

      Line 216: Have the authors considered performing surface plasmon resonance (SPR) to confirm the binding affinity of LiCre-V5 for DNA?

      Line 233-234: To determine whether the observed difference in recombination efficiency is due to the genomic context of the reporter loci or due to the measurement accuracy of GFP and RFP signals, have the authors considered swapping the positions of GFP and RFP?

      Line 236: The sentence "Importantly, we never observed recombination in the entire cell population" is ambiguous. I believe it means recombination was never observed in 100% of the cells. Please rephrase it.

      Line 245-249: The hypothesis of plasmid loss based on plating samples on selective and non-selective media without illumination assumes that loss of growth on selective media is only due to plasmid loss, without considering other factors like burden or toxicity. Moreover, the broad range of 10-30% makes it difficult to justify that the ~15% recombination-negative fraction falls within expected variation. The conclusion that LiCre-mediated recombination efficiency is close to 100% after prolonged photoactivation (Line 249, 301-303) is not fully convincing unless more evidence is provided.

      Line 275-276: The authors suspect that the decrease in recombination efficiency at very high light intensity is possibly attributed to phototoxicity. Could photobleaching also contribute to this effect? A viability assay would help to validate the phototoxicity explanation.

      Line 345-346: While the model with x=2 provides a slightly better fit compared to the others, the possibility of x=4 cannot be excluded. The inference that "photo-activation of at least two LiCre protomers enables recombination" is not sufficiently proven.

      Figure 1e: Please clarify whether the Western blots shown represent biological replicates.

      Figure 4: Please include the error bars. Panel a - The authors integrated GFP and mCherry reporters at two different loci to avoid positional bias. Why then is only mCherry used as the ON readout in most experiments, rather than analyzing both reporters in parallel? Please clarify. For panel 4h and line 272, the statement that maximal activation was reached at 12 mW/cm² should be rephrased more cautiously, as no intermediate intensities between 12 and 35.6 mW/cm² were tested.

      Significance

      This study provides a quantitative experimental and predictive analysis of the light-activated recombinase LiCre, offering new insights into its binding, activation and recombination properties. The predictive validation of the mutant is a strength of this work. While the modeling part is an innovative aspect, more clarification is needed, especially regarding the conclusion that photo-activation of at least two LiCre protomers enables recombination. More mechanistic investigations are needed to support the conclusions. The work will be of interest to researchers in optogenetics, genome engineering, and DNA-protein interactions. My expertise is in yeast genome engineering and applications of Cre-mediated recombination systems. Modeling is outside my primary area of expertise.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      1. General Statements

      Thank you for providing an assessment of our manuscript. We suggest here a revision plan to address the points raised by the reviewers regarding code documentation, benchmarking, and biological applications.

      As part of the revisions implemented we have:

      • Clarified the management of dependencies of our package

      • Fixed the data download run times of test data

      • Clarified the parameters of the normalization and optimization functions

      We plan to:

      • Extend our manuscript to include a section on cross-condition analysis that builds on our tutorials, where we will illustrate how ParTIpy can quantify shifts in the distribution of fibroblasts across the functional space defined by archetypal analysis between healthy and failing hearts.

      • Extend our benchmarks of scalability of coresets, by reporting wall-clock time and peak memory usage across distinct data sizes.

      • Extend our benchmarks of stability of coresets, by reporting the similarity of the estimated archetypes based on the original versus the sampled data.

      • Include the original enrichment analysis of ParTI to provide users with distinct options to work with the archetypes, and provide a larger discussion on the distinct strategies.

      We believe these revisions will strengthen our software manuscript and will help us to provide a robust and practical tool to analyze functional trade-offs from biological data.

      2. Description of the planned revisions

      Reviewer #1

      Summary

      The paper "ParTIpy: A Scalable Framework for Archetypal Analysis and Pareto Task Inference" presents ParTIpy, an open-source Python package that modernizes and scales the Pareto Task Inference (ParTI) framework for analyzing biological trade-offs and functional specialization. Unlike the earlier MATLAB implementation, which required a commercial license and was limited in scalability, ParTIpy leverages Python's open ecosystem and integration with tools such as scverse to make archetypal analysis more accessible, flexible, and compatible with modern biological data workflows. Through advanced optimization and coreset algorithms, it efficiently handles large scale single cell and spatial transcriptomics datasets. ParTIpy identifies "archetypes", or optimal phenotypic extremes, to reveal how cells balance competing functional programs. The paper demonstrates its application in modeling hepatocyte specialization across the liver lobule, highlighting spatial patterns of metabolic division of labor.

      Overall, ParTIpy represents a modern, accessible, and scalable Python-based solution for exploring biological trade-offs and resource allocation in high-dimensional data. The paper is clearly written and addresses an important methodological gap. However, the enrichment analysis differs from the original ParTI framework and should be discussed more explicitly, and the documentation and tutorials, while helpful, could be refined to improve usability and reproducibility.

      Major Comments

      1. The archetype enrichment analysis used in this paper differs from the original enrichment analysis implemented in ParTI. This is acceptable, but: a) The authors should explicitly state and discuss the differences between the two approaches. b) The enrichment analysis should be made more systematic. For each tested feature (e.g. gene or pathway), the analysis should report a p-value for the hypothesis that the feature is enriched near an archetype - that is, its expression (or value) is high close to the archetype and decreases with distance. Appropriate multiple-hypothesis correction should also be applied.

      We thank the reviewer for this valuable comment and agree that the differences between our enrichment analysis and the original ParTI implementation should be stated more explicitly. We will incorporate the original enrichment algorithm into ParTIpy, enabling users to select their preferred method. In the revised manuscript, we will note that two enrichment algorithms are available and describe both in greater detail in the supplementary methods section. We also note that the current enrichment analysis already reports p-values adjusted for multiple hypothesis testing.
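      The kind of test the reviewer describes (a per-feature p-value for "high near the archetype, decreasing with distance", followed by multiple-hypothesis correction) can be sketched as follows. This is only an illustrative sketch and is neither the original ParTI algorithm nor ParTIpy's implementation: it assumes a one-sided permutation test on the Spearman correlation between a feature and the distance to one archetype, followed by Benjamini-Hochberg adjustment. All function names are hypothetical.

```python
import random


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def enrichment_pvalue(distances, feature, n_perm=999, seed=0):
    """One-sided permutation p-value for 'feature decreases with distance'."""
    rng = random.Random(seed)
    rank_d, rank_f = ranks(distances), ranks(feature)
    observed = pearson(rank_d, rank_f)  # Spearman correlation
    shuffled = rank_f[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(rank_d, shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

      In such a scheme, `enrichment_pvalue` would be called once per feature and archetype, and the resulting p-values passed together to `benjamini_hochberg`.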

      Reviewer #2

      Summary

      This paper introduces the software ParTIpy, a scalable Python implementation of Pareto Task Inference (ParTI), designed to infer functional trade-offs in biological systems through archetypal analysis. The framework modernizes the previous toolbox with efficient optimization, memory-saving coreset construction, and integration with the scverse ecosystem for single-cell transcriptomic data.

      Using hepatocytes scRNA-seq data as a test case, the authors identify archetypes corresponding to distinct gene expression patterns. These archetypes align with known liver domains in spatial transcriptomics data, validating both the method's interpretability and its biological relevance.

      Major comments

      (1) Conclusions

      The core computational and biological claims are well supported. ParTIpy clearly scales better than earlier implementations and reproduces known biological structure. However, claims about "scalability to large datasets" should be further qualified (see below).

      We will implement further performance benchmarks as discussed below.
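      A benchmark of this kind could record wall-clock time and peak memory with the Python standard library alone, as in the minimal sketch below. The `toy_workload` function is a hypothetical stand-in for an archetypal analysis run and is not part of ParTIpy. Note also that `tracemalloc` only sees memory allocated through Python's allocator, so memory used inside native libraries may be under-reported.

```python
import time
import tracemalloc


def benchmark(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_workload(n_cells):
    """Hypothetical stand-in for fitting archetypes on n_cells cells."""
    data = [float(i % 97) for i in range(n_cells)]
    return sum(data) / n_cells


for n_cells in (10_000, 100_000):
    _, seconds, peak = benchmark(toy_workload, n_cells)
    print(f"{n_cells:>8} cells: {seconds:.4f} s, peak {peak / 1e6:.2f} MB")
```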

      (2) Claims

      Archetypal analysis based on the current matrix computation formulation is non-parametric, and new data require recomputation of archetypes. Therefore, the method cannot generalize to unseen data in the way deep learning approaches can, which should be further acknowledged and clarified.

      We thank the reviewer for this insightful comment. We agree that deep learning frameworks are typically amortized, allowing them to generalize to unseen data without retraining, and we will clarify this distinction in the discussion of the revised manuscript. However, we note that mapping new cells into an existing archetypal space is computationally inexpensive, as it only requires solving a single convex optimization problem.
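      To illustrate why this mapping is cheap: with the archetypes held fixed, placing a new cell amounts to finding non-negative weights that sum to one and best reconstruct the cell from the archetypes. The sketch below solves this small convex problem by projected gradient descent in pure Python. It is an assumption-laden illustration, not ParTIpy's solver; the function names are hypothetical and the archetypes are assumed not to be all zero.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keeps the threshold of the last valid j
    return [max(x - theta, 0.0) for x in v]


def map_to_archetypes(x, archetypes, n_iter=2000):
    """Simplex weights a minimising ||x - sum_k a[k] * archetypes[k]||^2."""
    k, d = len(archetypes), len(x)
    # Step size 1/L, where L is bounded by twice the squared Frobenius norm.
    lipschitz = 2.0 * sum(value * value for z in archetypes for value in z)
    step = 1.0 / lipschitz
    a = [1.0 / k] * k
    for _ in range(n_iter):
        reconstruction = [
            sum(a[j] * archetypes[j][i] for j in range(k)) for i in range(d)
        ]
        residual = [reconstruction[i] - x[i] for i in range(d)]
        gradient = [
            2.0 * sum(residual[i] * archetypes[j][i] for i in range(d))
            for j in range(k)
        ]
        a = project_to_simplex([a[j] - step * gradient[j] for j in range(k)])
    return a


# Toy example: three archetypes at the corners of a triangle.
archetypes = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = map_to_archetypes([0.2, 0.3], archetypes)
```

      A cell lying outside the polytope is mapped to the closest point on its boundary, since the weights are constrained to the simplex.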

      (3) Additional suggested analyses or experiments

      1) Absolute performance benchmarks: it is suggested to report wall-clock time and memory for a few dataset sizes (10k, 100k, 1M cells).

      We thank the reviewer for this helpful suggestion. We will extend the coreset benchmark to quantify how coreset size affects both archetype positions and biological interpretation. Specifically, we will match archetypes across coreset sizes by solving the linear sum assignment problem, as we currently do when comparing bootstrap samples. We will then compare the distances between archetypes inferred from the full dataset and those obtained from different coreset sizes. In addition to measuring displacement, we will assess biological stability by comparing the gene expression vectors of corresponding archetypes as well as their enriched pathways (using metrics such as cosine similarity and Jaccard index).
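      The comparison described above can be sketched in a few lines. As a simplification, the sketch solves the assignment problem by brute force over permutations, which is only practical for a handful of archetypes; in practice a dedicated solver such as `scipy.optimize.linear_sum_assignment` would be the natural choice. The function names and toy coordinates are hypothetical.

```python
from itertools import permutations
from math import dist, sqrt


def match_archetypes(reference, estimate):
    """Return perm so that estimate[perm[i]] is matched to reference[i].

    Minimises the summed Euclidean distance over all one-to-one matchings.
    """
    k = len(reference)
    cost = [[dist(r, e) for e in estimate] for r in reference]
    return min(
        permutations(range(k)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(k)),
    )


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard_index(items_a, items_b):
    """Jaccard index of two collections, e.g. enriched pathway names."""
    a, b = set(items_a), set(items_b)
    return len(a & b) / len(a | b) if a | b else 1.0


# Toy example: archetypes from the full data versus from a coreset.
full = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
coreset = [(0.1, 0.9), (0.05, 0.0), (0.9, 0.1)]
perm = match_archetypes(full, coreset)
displacements = [dist(full[i], coreset[perm[i]]) for i in range(len(full))]
```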

      **Referee cross-commenting**

      I agree with the other reviewer's suggestion to check consistency and reproducibility with previous implementation, and enhance the tutorial of the software for users from a biological background. Combined with my comments to further improve the biological application showcase, the revised manuscript could be an impactful contribution to the field, if these comments could be properly addressed.

      (1) Advance

      This paper is primarily a technical contribution. It modernizes the Pareto Task Inference framework into a scalable and user-friendly Python implementation, which is valuable. However, to further improve its significance especially for the broader biological audience, more detailed analysis could be performed (see below)

      (2) Biological scope and applications [optional]

      The current biological validation in hepatocytes is technically fine but limited in breadth and impact. It demonstrates that ParTIpy works but falls short of showing what new insights it can reveal. Several promising applications could be further explored:

      1) Cross-condition comparisons: could ParTIpy quantify how the Pareto front shifts between conditions (e.g., normal vs. tumor, treated vs. control)?

      We thank the reviewer for this valuable suggestion. We have shown ParTIpy's applicability to cross-condition settings in our online tutorials (https://partipy.readthedocs.io/en/latest/notebooks/cross_condition_lupus.html). However, we agree that a more explicit mention in the manuscript is needed. Thus, we will include a cross-condition analysis as a second application in the revised manuscript, focusing on fibroblasts from heart failure patients from Amrute et al. (2023). This will illustrate how ParTIpy can quantify shifts in the distribution of cells across the functional space defined by archetypal analysis.

      Because the manuscript does not explore these scenarios, the biological impact remains narrow, and the framework's broader interpretive power is somewhat underrepresented.

      We hope that the additional application included in the revised manuscript helps better illustrate the framework's strength. We would also like to note that the online tutorials provide a comprehensive overview of ParTIpy's functionality, as we expect these will serve as a primary entry point for many researchers interested in archetypal analysis and Pareto Task Inference.

      (3) Audience and impact

      The paper will interest computational biologists, systems biologists, and bioinformaticians focused on single-cell analysis, and its impact will grow substantially if the authors demonstrate more biological applications.

      (4) Reviewer expertise

      Computational biology, single-cell transcriptomics, machine learning, computational math

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      2. The package documentation on GitHub and ReadTheDocs is a major strength, but the tutorials can be improved for clarity and accessibility:

      We thank the reviewer for this positive feedback. Indeed, providing comprehensive documentation to facilitate ease of adoption was a major motivation behind this project. In response to the reviewer's suggestions, we have revised the tutorials to further improve their clarity, structure, and accessibility, as detailed below.

      a) The documentation should list external dependencies that need to be installed separately, e.g. pybiomart.

      We thank the reviewer for pointing this out. We had added all dependencies under the optional-dependencies.extra header, which allows users to run pip install partipy[extra] and thereby run all tutorial notebooks. However, we forgot to explain this in the tutorials or on the Readme page, which we have now corrected. The Readme now reads:

      Install the latest stable full release from PyPI with the extra dependencies (e.g., pybiomart, squidpy, liana) that are required to run every tutorial:

      ```
      pip install partipy[extra]
      ```

      Additionally, we include a clarification in every tutorial notebook that uses additional dependencies: "To run this notebook, install ParTIpy with the tutorial extras: pip install partipy[extra]".

      b) The dataset used in the Quickstart demo appears to be inaccessible or extremely slow to download (the function load_hepatocyte_data_2() did not complete even after 30 minutes, at least in my experience). The authors should verify data availability on Zenodo and consider providing a smaller or cached version to make the demo more reliable and reproducible.

      We thank the reviewer for this helpful comment. We agree that the previous implementation of load_hepatocyte_data_2() was not reliable due to slow download speeds from Zenodo. To address this, we now host the required AnnData object on figshare (https://figshare.com/articles/dataset/scRNA-seq_hepatocyte_data_from_Ben-Moshe_et_al_2022_/30588713?file=59459459), ensuring faster and more stable access for the Quickstart tutorial via scanpy.read:

      ```
      import scanpy as sc

      # Read the local file if present; otherwise download it from backup_url first.
      adata = sc.read("data/hepatocyte_processed.h5ad", backup_url="https://figshare.com/ndownloader/files/59459459")
      adata
      ```

      c) The tutorial order could be more intuitive - for instance, "archetype crosstalk network" appears before "archetypal analysis". Consider starting with the simulated dataset and presenting the full pipeline before moving to more complex real-world examples.

      We thank the reviewer for this helpful suggestion and agree that the previous ordering was not intuitive. We have reordered the tutorials such that the notebook introducing archetypal analysis now appears first, followed by the Quickstart tutorial and the subsequent applied examples.

      Minor comments

      1. In the Python function, the parameter "optim" could use more descriptive option names - for example, renaming "projected_gradients" to "PCHA" would make it clearer and more consistent with terminology used in the paper.

      We thank the reviewer for this helpful suggestion. We agree that the previous naming could be misleading. While PCHA does not precisely describe the underlying algorithm, it is the term most users are familiar with from the literature. We have therefore updated the function to accept both "PCHA" and "projected_gradients", which now map to the same underlying optimization routine.
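One way to let two option names resolve to the same routine is a small alias table, sketched below. This is only an illustration under stated assumptions: `_OPTIM_ALIASES` and `resolve_optim` are hypothetical names and are not taken from the ParTIpy source; only the two option strings come from the response above.

```python
# Hypothetical alias table: both user-facing names resolve to one canonical key.
_OPTIM_ALIASES = {
    "projected_gradients": "projected_gradients",
    "PCHA": "projected_gradients",
}


def resolve_optim(optim: str) -> str:
    """Return the canonical optimizer name for a user-supplied option."""
    try:
        return _OPTIM_ALIASES[optim]
    except KeyError:
        valid = ", ".join(sorted(_OPTIM_ALIASES))
        raise ValueError(f"Unknown optim {optim!r}; expected one of: {valid}") from None
```

Resolving the name once at the entry point keeps the rest of the code unaware of aliases, and an unknown option fails early with a message listing the accepted names.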

      2. In the Quickstart preprocessing, the authors use the following code:

      sc.pp.normalize_total(adata)

      sc.pp.log1p(adata)

      However, they do not specify the target sum in the normalize_total function. The authors should ensure that the data values before the logarithmic transformation span several orders of magnitude (e.g., 0-10,000); if normalization is performed to a sum of 1, the log transformation becomes ineffective.

      We thank the reviewer for this helpful comment. By default, sc.pp.normalize_total scales the counts in each cell to the median total counts across all cells, which preserves the typical range of expression values prior to logarithmic transformation. We therefore consider this default behavior appropriate for the Quickstart example. Nonetheless, we will clarify this explicitly in the tutorial to avoid confusion.
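The effect of the target sum on the subsequent log transformation can be illustrated with a small pure-Python sketch. It does not call scanpy: `normalize_total` and `log1p_rows` below are simplified stand-ins written for this example, and the count matrix is made up. The sketch assumes the default behaviour described above, namely scaling each cell to the median of the per-cell totals.

```python
import math
from statistics import median


def normalize_total(counts, target_sum=None):
    """Scale each cell (row) so that its counts sum to target_sum.

    With target_sum=None, the median of the per-cell totals is used
    (simplified stand-in for the default described above).
    """
    totals = [sum(row) for row in counts]
    if target_sum is None:
        target_sum = median(t for t in totals if t > 0)
    return [
        [v * target_sum / t if t > 0 else 0.0 for v in row]
        for row, t in zip(counts, totals)
    ]


def log1p_rows(rows):
    """Apply log(1 + x) to every entry."""
    return [[math.log1p(v) for v in row] for row in rows]


# Toy count matrix: 3 cells x 3 genes with very different sequencing depths.
counts = [[100, 300, 600], [10, 30, 60], [1000, 3000, 6000]]

by_median = normalize_total(counts)               # default: median of totals
by_unit = normalize_total(counts, target_sum=1)   # the case the reviewer warns about

# Ratio between the highest and lowest gene of the first cell after log1p.
log_median = log1p_rows(by_median)[0]
log_unit = log1p_rows(by_unit)[0]
ratio_median = log_median[2] / log_median[0]
ratio_unit = log_unit[2] / log_unit[0]
print(f"median target: {ratio_median:.2f}, unit target: {ratio_unit:.2f}")
```

In this toy example the raw sixfold difference between the two genes is strongly compressed when the median total is used as target, whereas with a target sum of 1 the values are so small that log1p is almost linear and the ratio stays close to the raw one.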

      Referee cross-commenting

      I agree with Reviewer #2's observation that the paper's contribution is primarily technical; however, I consider this technical advance to be an important and timely one that will enable many biologists to apply archetypal analysis more effectively in their own work.

      We thank the reviewer for this positive and encouraging assessment.

      Reviewer #1 (Significance (Required)):

      This study presents ParTIpy, a Python-based implementation of Pareto Task Inference (ParTI) that makes archetypal analysis more accessible, scalable, and compatible with modern single-cell and spatial transcriptomics workflows. Its main strength lies in translating a conceptually powerful but technically limited MATLAB framework into an open-source, efficient Python package, enabling wider use in computational biology. The package is well-documented, which further enhances its accessibility and adoption potential, though documentation could be improved to enhance reproducibility and ease of use. It will be of interest to computational systems biologists, particularly those working with omics data, and those interested in studying functional trade-offs and resource allocation.

      We appreciate the reviewer's positive evaluation and are encouraged by their recognition of ParTIpy's relevance and potential impact in computational biology.

      4. Description of analyses that authors prefer not to carry out

      Reviewer #2

      The current biological validation in hepatocytes is technically fine but limited in breadth and impact. It demonstrates that ParTIpy works but falls short of showing what new insights it can reveal. Several promising applications could be further explored:

      2) Transient or plastic states: Cells with mixed archetype weights or high mixture entropy can be interpreted as transient, functionally flexible states. ParTIpy can quantify such transience geometrically, even in static data, providing a competitive counterpart to models like CellRank or CellSimplex (https://doi.org/10.1093/bioinformatics/btaf119).

      We thank the reviewer for this interesting suggestion. While we agree that quantifying transient or plastic states based on archetype mixtures is an intriguing idea, validating whether cells with mixed archetype weights ("generalists") truly represent transient states would require additional data modalities such as temporal or lineage-tracing measurements. Although we find this direction highly interesting, given that the manuscript is intended as a software paper, we prefer to focus on more directly supported applications of cross-condition data, where labeled data is available.

      However, we will expand our discussion to relate ParTIpy with CellSimplex since we believe this is an interesting angle that future users could explore.
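The "mixture entropy" mentioned in the reviewer's comment can be made concrete with a short sketch. It assumes that each cell is described by non-negative archetype weights that sum to 1; the function name `mixture_entropy` and the example weights are hypothetical and are not part of ParTIpy.

```python
import math


def mixture_entropy(weights, normalized=True):
    """Shannon entropy of one cell's archetype weights.

    Weights must be non-negative and sum to 1. With normalized=True the
    result is divided by log(k), so it ranges from 0 (pure specialist)
    to 1 (uniform mixture over k archetypes).
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("weights must be non-negative and sum to 1")
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    if normalized and len(weights) > 1:
        entropy /= math.log(len(weights))
    return entropy


specialist = [0.96, 0.02, 0.02]   # close to a single archetype
generalist = [1 / 3, 1 / 3, 1 / 3]  # evenly mixed between three archetypes

print(f"specialist: {mixture_entropy(specialist):.3f}")
print(f"generalist: {mixture_entropy(generalist):.3f}")
```

A high value only indicates that a cell lies far from every archetype; as noted in the response, whether such cells are truly transient would still need temporal or lineage-tracing data.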

      5. References

      1. Amrute, J. M. et al. Defining cardiac functional recovery in end-stage heart failure at single-cell resolution. Nat. Cardiovasc. Res. 2, 399-416 (2023).
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This paper introduces the software ParTIpy, a scalable Python implementation of Pareto Task Inference (ParTI), designed to infer functional trade-offs in biological systems through archetypal analysis. The framework modernizes the previous toolbox with efficient optimization, memory-saving coreset construction, and integration with the scverse ecosystem for single-cell transcriptomic data.

      Using hepatocytes scRNA-seq data as a test case, the authors identify archetypes corresponding to distinct gene expression patterns. These archetypes align with known liver domains in spatial transcriptomics data, validating both the method's interpretability and its biological relevance.

      Major comments

      (1) Conclusions

      The core computational and biological claims are well supported. ParTIpy clearly scales better than earlier implementations and reproduces known biological structure. However, claims about "scalability to large datasets" should be further qualified (see below).

      (2) Claims

      Archetypal analysis based on the current matrix computation formulation is non-parametric, and new data require recomputation of archetypes. Therefore, the method cannot generalize to unseen data in the way deep learning approaches can, which could be further acknowledged and clarified.

      (3) Additional suggested analyses or experiments

      1. Absolute performance benchmarks: it is suggested to report wall-clock time and memory for a few dataset sizes (10k, 100k, 1M cells).
      2. Coreset sensitivity analysis: Could the authors show how coreset size affects archetype positions and biological interpretation?
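A minimal sketch of how wall-clock time and peak memory could be recorded per dataset size is given below, using only the Python standard library. `placeholder_workload` is a hypothetical stand-in for the actual archetype fit and would need to be replaced by the real call. Note that `tracemalloc` only sees allocations made through Python's memory allocator and adds some overhead to the timing, so the numbers are indicative rather than exact.

```python
import time
import tracemalloc


def benchmark(fn, *args, **kwargs):
    """Return (result, wall_seconds, peak_bytes) for a single call of fn."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def placeholder_workload(n_cells):
    # Stand-in for fitting archetypes on n_cells cells; replace with the real call.
    return sum(i * i for i in range(n_cells))


for n_cells in (10_000, 100_000):
    _, seconds, peak = benchmark(placeholder_workload, n_cells)
    print(f"{n_cells:>9,d} cells: {seconds:.4f} s, peak {peak / 1e6:.2f} MB")
```

Running the same loop over several coreset sizes, and recording the resulting archetype positions alongside time and memory, would cover the sensitivity analysis in the second point as well.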

      Referee cross-commenting

      I agree with the other reviewer's suggestion to check consistency and reproducibility with the previous implementation, and to enhance the software tutorials for users from a biological background. Combined with my comments on further improving the biological application showcase, the revised manuscript could be an impactful contribution to the field if these comments are properly addressed.

      Significance

      (1) Advance

      This paper is primarily a technical contribution. It modernizes the Pareto Task Inference framework into a scalable and user-friendly Python implementation, which is valuable. However, to further improve its significance, especially for the broader biological audience, more detailed analysis could be performed (see below).

      (2) Biological scope and applications [optional]

      The current biological validation in hepatocytes is technically fine but limited in breadth and impact. It demonstrates that ParTIpy works but falls short of showing what new insights it can reveal. Several promising applications could be further explored:

      1) Cross-condition comparisons: could ParTIpy quantify how the Pareto front shifts between conditions (e.g., normal vs. tumor, treated vs. control)?

      2) Transient or plastic states: Cells with mixed archetype weights or high mixture entropy can be interpreted as transient, functionally flexible states. ParTIpy can quantify such transience geometrically, even in static data, providing a competitive counterpart to models like CellRank or CellSimplex (https://doi.org/10.1093/bioinformatics/btaf119).

      Because the manuscript does not explore these scenarios, the biological impact remains narrow, and the framework's broader interpretive power is somewhat underrepresented.

      (3) Audience and impact

      The paper will interest computational biologists, systems biologists, and bioinformaticians focused on single-cell analysis, and its impact will grow substantially if the authors demonstrate more biological applications.

      (4) Reviewer expertise

      Computational biology, single-cell transcriptomics, machine learning, computational math

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The paper "ParTIpy: A Scalable Framework for Archetypal Analysis and Pareto Task Inference" presents ParTIpy, an open-source Python package that modernizes and scales the Pareto Task Inference (ParTI) framework for analyzing biological trade-offs and functional specialization. Unlike the earlier MATLAB implementation, which required a commercial license and was limited in scalability, ParTIpy leverages Python's open ecosystem and integration with tools such as scverse to make archetypal analysis more accessible, flexible, and compatible with modern biological data workflows. Through advanced optimization and coreset algorithms, it efficiently handles large-scale single-cell and spatial transcriptomics datasets. ParTIpy identifies "archetypes", or optimal phenotypic extremes, to reveal how cells balance competing functional programs. The paper demonstrates its application in modeling hepatocyte specialization across the liver lobule, highlighting spatial patterns of metabolic division of labor. Overall, ParTIpy represents a modern, accessible, and scalable Python-based solution for exploring biological trade-offs and resource allocation in high-dimensional data. The paper is clearly written and addresses an important methodological gap. However, the enrichment analysis differs from the original ParTI framework and should be discussed more explicitly, and the documentation and tutorials, while helpful, could be refined to improve usability and reproducibility.

      Major Comments

      1. The archetype enrichment analysis used in this paper differs from the original enrichment analysis implemented in ParTI. This is acceptable, but:

      a. The authors should explicitly state and discuss the differences between the two approaches.

      b. The enrichment analysis should be made more systematic. For each tested feature (e.g. gene or pathway), the analysis should report a p-value for the hypothesis that the feature is enriched near an archetype - that is, its expression (or value) is high close to the archetype and decreases with distance. Appropriate multiple-hypothesis correction should also be applied.

      2. The package documentation on GitHub and ReadTheDocs is a major strength, but the tutorials can be improved for clarity and accessibility:

      a. The documentation should list external dependencies that need to be installed separately, e.g. pybiomart.

      b. The dataset used in the Quickstart demo appears to be inaccessible or extremely slow to download (the function load_hepatocyte_data_2() did not complete even after 30 minutes, at least in my experience). The authors should verify data availability on Zenodo and consider providing a smaller or cached version to make the demo more reliable and reproducible.

      c. The tutorial order could be more intuitive - for instance, "archetype crosstalk network" appears before "archetypal analysis". Consider starting with the simulated dataset and presenting the full pipeline before moving to more complex real-world examples.

      Minor comments

      1. In the Python function, the parameter "optim" could use more descriptive option names - for example, renaming "projected_gradients" to "PCHA" would make it clearer and more consistent with terminology used in the paper.
      2. In the Quickstart preprocessing, the authors use the following code:

      sc.pp.normalize_total(adata)

      sc.pp.log1p(adata)

      However, they do not specify the target sum in the normalize_total function. The authors should ensure that the data values before the logarithmic transformation span several orders of magnitude (e.g., 0-10,000); if normalization is performed to a sum of 1, the log transformation becomes ineffective.

      Referee cross-commenting

      I agree with Reviewer #2's observation that the paper's contribution is primarily technical; however, I consider this technical advance to be an important and timely one that will enable many biologists to apply archetypal analysis more effectively in their own work.

      Significance

      This study presents ParTIpy, a Python-based implementation of Pareto Task Inference (ParTI) that makes archetypal analysis more accessible, scalable, and compatible with modern single-cell and spatial transcriptomics workflows. Its main strength lies in translating a conceptually powerful but technically limited MATLAB framework into an open-source, efficient Python package, enabling wider use in computational biology. The package is well-documented, which further enhances its accessibility and adoption potential, though documentation could be improved to enhance reproducibility and ease of use. It will be of interest to computational systems biologists, particularly those working with omics data, and those interested in studying functional trade-offs and resource allocation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Response to referee comments: RC-2025-03008


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this article, the authors used synthetic TALE DNA-binding proteins, tagged with YFP, which were designed to target five specific repeat elements in the Trypanosoma brucei genome, including centromere- and telomere-associated repeats and those of a transposon element. The aim was to detect and identify, using YFP pulldown, specific proteins that bind to these repetitive sequences in T. brucei chromatin. Validation of the approach was done using a TALE protein designed to target the telomere repeat (TelR-TALE), which detected many of the proteins that were previously implicated in telomeric functions. A TALE protein designed to target the 70 bp repeats that reside adjacent to the VSG genes (70R-TALE) detected proteins that function in DNA repair, and the protein designed to target the 177 bp repeat arrays (177R-TALE) identified kinetochore proteins associated with T. brucei megabase chromosomes, as well as with intermediate and mini-chromosomes, which implies that kinetochore assembly and segregation mechanisms are similar in all T. brucei chromosomes.

      Major comments: Are the key conclusions convincing? The authors reported that they have successfully used TALE-based affinity selection of proteins associated with repetitive sequences in the T. brucei genome. They claimed that this study has provided new information regarding the relevance of the repetitive regions in the genome to chromosome integrity, telomere biology, chromosomal segregation and immune evasion strategies. These conclusions are based on high-quality research, and the work basically merits publication, provided that some major concerns, raised below, are addressed before acceptance for publication.

      1. The authors used the TALE-YFP approach to examine the proteome associated with five different repetitive regions of the T. brucei genome and confirmed the binding of TALE-YFP with ChIP-seq analyses. Ultimately, they obtained the list of proteins bound to the synthetic proteins by affinity purification and LC-MS analysis, and concluded that these proteins bind to different repetitive regions of the genome. Two control proteins, TRF-YFP and KKT2-YFP, were used to confirm the interactions. However, there is no experiment that confirms that the analysis gives insight into the role of any putative or new protein in telomere biology, VSG gene regulation or chromosomal segregation. The proteins that have already been reported by other studies are mentioned. Although the authors discovered many proteins in these repetitive regions, their role is as yet unknown. It is recommended to take one or more of the new putative proteins from the repetitive elements and (1) show whether or not they bind directly to the specific repetitive sequence (e.g., by EMSA); and (2) knock down one or a small sample of the newly discovered proteins, which may shed light on their function at the repetitive region, as a proof of concept.

      Response

      The main request from Referee 1 is for individual evaluation of protein-DNA interaction for a few candidates identified in our TALE-YFP affinity purifications, particularly using EMSA to identify binding to the DNA repeats used for the TALE selection. In our opinion, such an approach would not actually provide the validation anticipated by the reviewer. The power of TALE-YFP affinity selection is that it enriches for protein complexes that associate with the chromatin that coats the target DNA repetitive elements rather than only identifying individual proteins or components of a complex that directly bind to DNA assembled in chromatin.

      The referee suggests we express recombinant proteins and perform EMSA for selected candidates, but many of the identified proteins are unlikely to directly bind to DNA - they are more likely to associate with a combination of features present in DNA and/or chromatin (e.g. specific histone variants or histone post-translational modifications). Of course, a positive result would provide some validation but only IF the tested protein can bind DNA in isolation - thus, a negative result would be uninformative.

      In fact, our finding that KKT proteins are enriched using the 177R-TALE (minichromosome repeat sequence) identifies components of the trypanosome kinetochore known (KKT2) or predicted (KKT3) to directly bind DNA (Marciano et al., 2021; PMID: 34081090), and likewise the TelR-TALE identifies the TRF component that is known to directly associate with telomeric (TTAGGG)n repeats (Reis et al 2018; PMID: 29385523). This provides reassurance on the specificity of the selection, as does the lack of cross selectivity between different TALEs used (see later point 3 below). The enrichment of the respective DNA repeats quantitated in Figure 2B (originally Figure S1) also provides strong evidence for TALE selectivity.

      It is very likely that most of the components enriched on the repetitive elements targeted by our TALE-YFP proteins do not bind repetitive DNA directly. The TRF telomere binding protein is an exception - but it is the only obvious DNA binding protein amongst the many proteins identified as being enriched in our TelR-TALE-YFP and TRF-YFP affinity selections.

      The referee also suggests that follow up experiments using knockdown of the identified proteins found to be enriched on repetitive DNA elements would be informative. In our opinion, this manuscript presents the development of a new methodology previously not applied to trypanosomes, and referee 2 highlights the value of this methodological development which will be relevant for a large community of kinetoplastid researchers. In-depth follow-up analyses would be beyond the scope of this current study but of course will be pursued in future. To be meaningful such knockdown analyses would need to be comprehensive in terms of their phenotypic characterisation (e.g. quantitative effects on chromosome biology and cell cycle progression, rates and mechanism of recombination underlying antigenic variation, etc) - simple RNAi knockdowns would provide information on fitness but little more. This information is already publicly available from genome-wide RNAi screens (www.tritrypDB.org), with further information on protein location available from the genome-wide protein localisation resource (Tryptag.org). Hence basic information is available on all targets selected by the TALEs after RNAi knock down but in-depth follow-up functional analysis of several proteins would require specific targeted assays beyond the scope of this study.

      2. NonR-TALE-YFP does not have a binding site in the genome, but YFP protein should still be expressed by T. brucei clones with NLS. The authors have to explain why there is no signal detected in the nucleus, while a prominent signal was detected near kDNA (see Fig. 2). Why is YFP expression in the NonR-TALE clone barely detectable compared to other TALE clones?

      Response

      The NonR-TALE-YFP immunolocalisation signal does indeed appear to be located close to the kDNA and away from the nucleus. We are not sure why this is so, but the construct is sequence validated and correct. However, we note that artefactual localisation of proteins fused to a globular eGFP tag, compared to a short linear epitope V5 tag, near to the kinetoplast has been previously reported (Pyrih et al, 2023; PMID: 37669165).

      The expression of NonR-TALE-YFP is shown in Supplementary Fig. S2 in comparison to other TALE proteins. Although it is evident that NonR-TALE-YFP is expressed at lower levels than other TALEs (the different TALEs have different expression levels), it is likely that in each case the TALE proteins would be in relative excess.

      It is possible that the absence of a target sequence for the NonR-TALE-YFP in the nucleus affects its stability and cellular location. Understanding these differences is tangential to the aim of this study.

      However, importantly, NonR-TALE-YFP is not the only control used for specificity in our affinity purifications. Instead, the lack of cross-selection of the same proteins by different TALEs (e.g. TelR-TALE-YFP, 177R-TALE-YFP) and the lack of enrichment of any proteins of interest by the well expressed ingiR-TALE-YFP or 147R-TALE-YFP proteins each provide strong evidence for the specificity of the selection using TALEs, as does the enrichment of similar protein sets following affinity purification of the TelR-TALE-YFP and TRF-YFP proteins, which both bind telomeric (TTAGGG)n repeats. Moreover, control affinity purifications to assess background were performed using cells that completely lack an expressed YFP protein, which further supports specificity (Figure 6).

      We have added text to highlight these important points in the revised manuscript:

      Page 8:

      "However, the expression level of NonR-TALE-YFP was lower than other TALE-YFP proteins; this may relate to the lack of DNA binding sites for NonR-TALE-YFP in the nucleus."

      Page 8:

      "NonR-TALE-YFP displayed a diffuse nuclear and cytoplasmic signal; unexpectedly the cytoplasmic signal appeared to be in the vicinity of the kDNA of the kinetoplast (mitochondrion). We note that artefactual localisation of some proteins fused to an eGFP tag has previously been observed in T. brucei (Pyrih et al, 2023)."

      Page 10:

      "Moreover, a similar set of enriched proteins was identified in TelR-TALE-YFP affinity purifications whether compared with cells expressing no YFP fusion protein (No-YFP), the NonR-TALE-YFP or the ingiR-TALE-YFP as controls (Fig. S7B, S8A; Tables S3, S4). Thus, the most enriched proteins are specific to TelR-TALE-YFP-associated chromatin rather than to the TALE-YFP synthetic protein module or other chromatin."

      3. As a proof of concept, the authors showed that the TALE method identified the same enriched interacting partners for TelR-TALE as for TRF-YFP. They also show the same interacting partners for other TALE proteins, whether compared with WT cells or with the NonR-TALE parasites. This may be because NonR-TALE parasites have almost no (or very little) YFP expression (see Fig. S3) compared to other TALE clones and the TRF-YFP clone. To address this concern, a control with proper YFP expression should be included.

      Response

      See response to point 2, but we reiterate that the ingiR-TALE-YFP and 147R-TALE-YFP proteins are well expressed (western blot; originally Fig. S3, now Fig. S2) but few proteins are detected as being enriched, and few correspond to those enriched in TelR-TALE-YFP or TRF-YFP affinity purifications (see Fig. S9). Therefore, the ingiR-TALE-YFP and 147R-TALE-YFP proteins provide good additional negative controls for specificity as requested. To further reassure the referee we have also included additional volcano plots which compare TelR-TALE-YFP, 70R-TALE-YFP or 177R-TALE-YFP to the ingiR-TALE-YFP affinity selection (new Figure S8). As with the No-YFP or NonR-TALE-YFP controls, the use of ingiR-TALE-YFP as a negative control demonstrates that known telomere-associated proteins are enriched in TelR-TALE-YFP affinity purification, RPA subunits are enriched with 70R-TALE-YFP and kinetochore KKT proteins are enriched with 177R-TALE-YFP. These analyses demonstrate specificity in the proteins enriched following affinity purification of our different TALE-YFPs and provide support to strengthen our original findings.

      We now refer to use of No-YFP, NonR-TALE-YFP, and ingiR-TALE-YFP as controls for comparison to TelR-TALE-YFP, 70R-TALE-YFP or 177R-TALE-YFP in several places:

      Page 10:

      "Moreover, a similar set of enriched proteins was identified in TelR-TALE-YFP affinity purifications whether compared with cells expressing no YFP fusion protein (No-YFP), the NonR-TALE-YFP or the ingiR-TALE-YFP as controls (Fig. S7B, S8A; Tables S3, S4)."

      Page 11:

      "Thus, the nuclear ingiR-TALE-YFP provides an additional chromatin-associated negative control for affinity purifications with the TelR-TALE-YFP, 70R-TALE-YFP and 177R-TALE-YFP proteins (Fig. S8)."

      "Proteins identified as being enriched with 70R-TALE-YFP (Figure 6D) were similar in comparisons with either the No-YFP, NonR-TALE-YFP or ingiR-TALE-YFP as negative controls."

      Top Page 12:

      "The same kinetochore proteins were enriched regardless of whether the 177R-TALE proteomics data was compared with No-YFP, NonR-TALE or ingiR-TALE-YFP controls."

      Discussion Page 13:

      "Regardless, the 147R-TALE and ingiR-TALE proteins were well expressed in T. brucei cells, but their affinity selection did not significantly enrich for any relevant proteins. Thus, 147R-TALE and ingiR-TALE provide reassurance for the overall specificity for proteins enriched in TelR-TALE, 70R-TALE and 177R-TALE affinity purifications."

      4. After the artificial expression of the five repetitive-sequence-binding TALE proteins, the question is whether there is any competition between the TALE proteins and the corresponding endogenous proteins. Is there any effect on parasite survival or health, compared to the control, after the expression of these five TALE-YFP proteins? It is recommended to add parasite growth curves for all the TALE-protein-expressing cultures.

      Response

      Growth curves for cells expressing TelR-TALE-YFP, 177R-TALE-YFP and ingiR-TALE-YFP are now included (New Fig S3A). No deficit in growth was evident while passaging 70R-TALE-YFP, 147R-TALE-YFP, NonR-TALE-YFP cell lines (indeed they grew slightly better than controls).

      The following text has been added on page 8:

      "Cell lines expressing representative TALE-YFP proteins displayed no fitness deficit (Fig. S3A)."

      5. Since the experiments were performed using whole-cell extracts without prior nuclear fractionation, the authors should consider the possibility that some identified proteins may have originated from compartments other than the nucleus. Specifically, the detection of certain binding proteins might reflect sequence homology (or partial homology) between mitochondrial DNA (maxicircles and minicircles) and repetitive regions in the nuclear genome. Additionally, the lack of subcellular separation raises the concern that cytoplasmic proteins could have been co-purified due to whole cell lysis, making it challenging to discern whether the observed proteome truly represents the nuclear interactome.

      Response

      In our experimental design, we confirmed bioinformatically that the repeat sequences targeted were not represented elsewhere in the nuclear or mitochondrial genome (kDNA). The absence of subcellular fractionation could result in some cytoplasmic protein selection, but this is unlikely since each TALE targets a specific DNA sequence but is otherwise identical, such that cross-selection of the same contaminating protein set would be anticipated if there was significant non-specific binding. We have previously successfully affinity selected 15 chromatin modifiers and identified associated proteins without major issues concerning cytoplasmic protein contamination (Staneva et al 2021 and 2022; PMID: 34407985 and 36169304). Of course, the possibility that some proteins are contaminants will need to be borne in mind in any future follow-up analysis of proteins of interest that we identified as being enriched on specific types of repetitive element in T. brucei. Proteins that are also detected in negative control affinity selections such as No-YFP, NonR-TALE-YFP, ingiR-TALE or 147R-TALE must be disregarded.

      6. Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? As mentioned earlier, the authors claimed that this study has provided new information concerning telomere biology, chromosomal segregation mechanisms, and immune evasion strategies. But there are no experiments that establish a role for any unknown or known protein in these processes. Thus, it is suggested to select one or two proteins of choice from the list and validate their direct binding to the repetitive region(s), and their role in that region of interaction.

      Response

      As highlighted in response to point 1 the suggested validation and follow up experiments may well not be informative and are beyond the scope of the methodological development presented in this manuscript. Referee 2 describes the study in its current form as "a significant conceptual and technical advancement" and "This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology."

      The Referee's phrase 'validate their direct binding to repetitive region(s)' here may also mean to test if any of the additional proteins that we identified as being enriched with a specific TALE protein actually display enrichment over the repeat regions when examined by an orthogonal method. A key unexpected finding was that kinetochore proteins including KKT2 are enriched in our affinity purifications of the 177R-TALE-YFP that targets 177bp repeats (Figure 6F). By conducting ChIP-seq for the kinetochore specific protein KKT2 using YFP-KKT2 we confirmed that KKT2 is indeed enriched on 177bp repeat DNA but not flanking DNA (Figure 7). Moreover, several known telomere-associated proteins are detected in our affinity selections of TelR-TALE-YFP (Figure 6B, Fig. S6; see also Reis et al, 2018 Nuc. Acids Res. PMID: 29385523; Weisert et al, 2024 Sci. Reports PMID: 39681615).

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. The answer for this question depends on what the authors want to present as the achievements of the present study. If the achievement of the paper was is the creation of a new tool for discovering new proteins, associated with the repeat regions, I recommend that they add a proof for direct interactions between a sample the newly discovered proteins and the relevant repeats, as a proof of concept discussed above, However, if the authors like to claim that the study achieved new functional insights for these interactions they will have to expand the study, as mentioned above, to support the proof of concept.

      Response

      See our response to point 1 and the point we labelled '6' above.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. I think that they are realistic. If the authors decided to check the capacity of a small sample of proteins (which was unknown before as a repetitive region binding proteins) to interacts directly with the repeated sequence, it will substantially add of the study (e.g., by EMSA; estimated time: 1 months). If the authors will decide to check the also the function of one of at least one such a newly detected proteins (e.g., by KD), I estimate the will take 3-6 months.

      Response

      As highlighted previously, the proposed EMSA experiment may well be uninformative for protein complex components identified in our study, or for isolated proteins that directly bind DNA in the context of a complex and chromatin. RNAi knockdown data and cell location data (as well as developmental expression and orthology data) are already available through tritrypDB.org and tryptag.org.

      Are the data and the methods presented in such a way that they can be reproduced? Yes

      Are the experiments adequately replicated, and statistical analysis adequate? The authors did not mention replicates. There is no statistical analysis mentioned.

      Response

      The figure legends indicate that all volcano plots of TALE affinity selections were derived from three biological replicates. Cutoffs used for significance: P… For ChIP-seq, two biological replicates were analysed for each cell line expressing the specific YFP-tagged protein of interest (TALE or KKT2). This is now stated in the relevant figure legends - apologies for this oversight. The resulting data are available for scrutiny at GEO: GSE295698.

      Minor comments: -Specific experimental issues that are easily addressable. The following suggestions can be incorporated: 1. Page 18, in the material method section author mentioned four drugs: Blasticidine, Phleomycin and G418, and hygromycin. It is recommended to mention the purpose of using these selective drugs for the parasite. If clonal selection has been done, then it should also be mentioned.

      Response

      We erroneously added information on several drugs used for selection in our laboratory. In fact, all TALE-YFP constructs carry the Bleomycin resistance gene, which we select for using Phleomycin. Also, clones were derived by limiting dilution immediately after transfection.

      We have amended the text accordingly:

      Page 17/18:

      "Cell cultures were maintained below 3 × 10⁶ cells/ml. Phleomycin 2.5 mg/ml was used to select transformants containing the TALE construct BleoR gene."

      "Electroporated bloodstream cells were added to 30 ml HMI-9 medium and two 10-fold serial dilutions were performed in order to isolate clonal Phleomycin resistant populations from the transfection. 1 ml of transfected cells were plated per well on 24-well plates (1 plate per serial dilution) and incubated at 37 °C and 5% CO₂ for a minimum of 6 h before adding 1 ml media containing 2X concentration Phleomycin (5 mg/ml) per well."

      In the method section the authors mentioned that there is only one site for binding of NonR-TALE in the parasite genome. But in Fig. 1C, the authors showed zero binding site. So, there is one binding site for NonR-TALE-YFP in the genome or zero?

      Response

      We thank the reviewer for pointing out this discrepancy. We have checked the latest Tb427v12 genome assembly for predicted NonR-TALE binding sites and there are no exact matches. We have corrected the text accordingly.

      Page 7:

      "A control NonR-TALE protein was also designed which was predicted to have no target sequence in the T. brucei genome."

      Page 17:

      "A control NonR-TALE predicted to have no recognised target in the T. brucei genome was designed as follows: BLAST searches were used to identify exact matches in the TREU927 reference genome. Candidate sequences with one or more matches were discarded."
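The filtering rule quoted above (discard any candidate with one or more exact genomic matches) can be illustrated with a minimal sketch. This is not the authors' pipeline: a plain string search on both strands stands in for the BLAST step, and the function names, the toy genome and the toy candidate sequences are all hypothetical, chosen only for illustration.

```python
# Sketch only: exact string matching is used here as a stand-in for BLAST.
# All names and sequences below are hypothetical illustrations.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase/lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_exact_matches(candidate: str, genome: dict) -> int:
    """Count exact occurrences of `candidate` on both strands of every contig."""
    forward = candidate.upper()
    # A set avoids counting palindromic sites twice.
    patterns = {forward, reverse_complement(forward)}
    total = 0
    for contig in genome.values():
        contig = contig.upper()
        for pattern in patterns:
            start = contig.find(pattern)
            while start != -1:
                total += 1
                # Step by one base so overlapping occurrences are counted.
                start = contig.find(pattern, start + 1)
    return total


def select_nontarget_candidates(candidates, genome):
    """Keep only candidates with no exact match anywhere in the genome."""
    return [c for c in candidates if count_exact_matches(c, genome) == 0]


# Toy data: a 26 bp "genome" and three candidate target sequences.
toy_genome = {"chr_toy": "TTAGGGTTAGGGACGTACGTCCCTAA"}
toy_candidates = ["TTAGGG", "GATTACA", "CCCTAA"]
kept = select_nontarget_candidates(toy_candidates, toy_genome)
print(kept)
```

In this toy case only the candidate absent from both strands survives the filter. A real search would also need to decide how to treat near-matches, which the quoted text does not specify.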

      The authors used two different anti-GFP antibodies, one from Roche and the other from Thermo Fisher. Why were two different antibodies used for the same protein?

      Response

      We have found that only some anti-GFP antibodies are effective for affinity selection of associated proteins, whereas others are better suited for immunolocalisation. The respective suppliers' antibodies were optimised for each application.

      Page 6: in the introduction, the authors give the number of total VSG genes as 2,634. Is it known how many of them are pseudogenes?

      Response

      This value corresponds to the number reported by Cosentino et al. 2021 (PMID: 34541528) for subtelomeric VSGs, which is similar to the value reported by Muller et al 2018 (PMID: 30333624) (2486), both in the same strain of trypanosomes as used by us. Based on the earlier analysis by Cross et al (PMID: 24992042), 80% of the identified VSGs in their study (2584) are pseudogenes. This approximates to the estimation by Cosentino of 346/2634 (13%) being fully functional VSG genes at subtelomeres, or 17% when considering VSGs at all genomic locations (433/2872).

      I found several typos throughout the manuscript.

      Response

      Thank you for raising this. We have read through the manuscript several times and hopefully corrected all outstanding typos.

      Fig. 1C: Table: below TOTAL 2nd line: the number should be 1838 (rather than 1828)

      Corrected- thank you.

      • Are prior studies referenced appropriately? Yes

      • Are the text and figures clear and accurate? Yes

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Suggested above

      Reviewer #1 (Significance (Required)):

      Describe the nature and significance of the advance (e.g., conceptual, technical, clinical) for the field: This study represents a significant conceptual and technical advancement by employing a synthetic TALE DNA-binding protein tagged with YFP to selectively identify proteins associated with five distinct repetitive regions of T. brucei chromatin. To the best of my knowledge, it is the first report to utilize TALE-YFP for affinity-based isolation of protein complexes bound to repetitive genomic sequences in T. brucei. This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology. Importantly, any essential or unique interacting partners identified could serve as potential targets for therapeutic intervention.

      • Place the work in the context of the existing literature (provide references, where appropriate). I agree with the information that has already described in the submitted manuscript, regarding its potential addition of the data resulted and the technology established to the study of VSGs expression, kinetochore mechanism and telomere biology.

      • State what audience might be interested in and influenced by the reported findings. These findings will be of particular interest to researchers studying the molecular biology of kinetoplastid parasites and other unicellular organisms, as well as scientists investigating chromatin structure and the functional roles of repetitive genomic elements in higher eukaryotes.

      • (1) Define your field of expertise with a few keywords to help the authors contextualize your point of view. (2) Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. (1) Protein-DNA interactions/ chromatin/ DNA replication/ Trypanosomes (2) None

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      Carloni et al. comprehensively analyze which proteins bind repetitive genomic elements in Trypanosoma brucei. For this, they perform mass spectrometry on custom-designed, tagged programmable DNA-binding proteins. After extensively verifying their programmable DNA-binding proteins (using bioinformatic analysis to infer target sites, microscopy to measure localization, ChIP-seq to identify binding sites), they present, among others, two major findings: 1) 14 of the 25 known T. brucei kinetochore proteins are enriched at 177bp repeats. As T. brucei's 177bp repeat-containing intermediate-sized and mini-chromosomes lack centromere repeats but are stable over mitosis, Carloni et al. use their data to hypothesize that a 'rudimentary' kinetochore assembles at the 177bp repeats of these chromosomes to segregate them. 2) 70bp repeats are enriched with the Replication Protein A complex, which, notably, is required for homologous recombination. Homologous recombination is the pathway used for recombination-based antigenic variation of the 70bp-repeat-adjacent variant surface glycoproteins.

      Major Comments

      None. The experiments are well-controlled, claims well-supported, and methods clearly described. Conclusions are convincing.

      Response Thank you for these positive comments.

      Minor Comments

      1) Fig. 2 - I couldn't find an uncropped version showing multiple cells. If it exists, it should be linked in the legend or main text; Otherwise, this should be added to the supplement.

      Response

      The images presented represent reproducible analyses and were independently verified by two of the authors. Although wider field of view images do not provide the resolution to be informative on cell location, as requested we have provided uncropped images in new Fig. S4 for all the cell lines shown in Figure 2A.

      In addition, we have included supplementary images (Fig. S3B) of TelR-TALE-YFP, 177R-TALE-YFP and ingiR-TALE-YFP localisation to provide additional support for their observed locations presented in Figure 1. The sets of cells and images presented in Figure 2A and in Fig. S3B were prepared and obtained by different authors, independently and reproducibly validating the location of the tagged proteins.

      2) I think Suppl. Fig. 1 is very valuable, as it is a quantification and summary of the ChIP-seq data. I think the authors could consider making this a panel of a main figure. For the main figure, I think the plot could be trimmed down to only show the background and the relevant repeat for each TALE protein, leaving out the non-target repeats. (This relates to minor comment 6.) Also, I believe, it was not explained how background enrichment was calculated.

      Response

      We are grateful for the reviewer's positive view of original Fig. S1 and appreciate the suggestion. We have now moved this analysis to part B of main Figure 2 in the revised manuscript - now Figure 2B. We have also provided additional details in the Methods section on the approaches used to assess background enrichment.

      Page 19:

      Background enrichment calculation

      The genome was divided into 50 bp sliding windows, and each window was annotated based on overlapping genomic features, including CIR147, 177 bp repeats, 70 bp repeats, and telomeric (TTAGGG)n repeats. Windows that did not overlap with any of these annotated repeat elements were defined as "background" regions and used to establish the baseline ChIP-seq signal. Enrichment for each window was calculated using bamCompare, as log₂(IP/Input). To adjust for background signal amongst all samples, enrichment values for each sample were further normalized against the corresponding No-YFP ChIP-seq dataset.
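The calculation described above can be sketched in a few lines. This is a simplified illustration, not the authors' script: it assumes per-window read counts are already available, uses non-overlapping 50 bp bins instead of sliding windows, adds a pseudocount to avoid taking the log of zero, and interprets "normalized against the No-YFP dataset" as a subtraction in log₂ space. All function names and the toy numbers are hypothetical.

```python
import math
from statistics import mean

WINDOW = 50        # window size in bp, as stated in the Methods text
PSEUDOCOUNT = 1.0  # assumption: keeps the ratio defined for empty windows


def make_windows(chrom_length, size=WINDOW):
    """Split a chromosome into consecutive half-open windows (start, end)."""
    return [(s, min(s + size, chrom_length)) for s in range(0, chrom_length, size)]


def annotate(window, repeats):
    """Label a window with the first overlapping repeat, else 'background'.

    `repeats` is a list of half-open (start, end, label) intervals.
    """
    w_start, w_end = window
    for r_start, r_end, label in repeats:
        if w_start < r_end and r_start < w_end:
            return label
    return "background"


def log2_ratio(ip, inp, pseudo=PSEUDOCOUNT):
    """log2(IP/Input) for one window, with a pseudocount on both counts."""
    return math.log2((ip + pseudo) / (inp + pseudo))


def mean_enrichment_by_class(ip, inp, ctrl_ip, ctrl_inp, windows, repeats):
    """Average No-YFP-corrected log2 enrichment per annotation class."""
    per_class = {}
    for i, window in enumerate(windows):
        # Assumed normalization: sample log2 ratio minus No-YFP log2 ratio.
        value = log2_ratio(ip[i], inp[i]) - log2_ratio(ctrl_ip[i], ctrl_inp[i])
        per_class.setdefault(annotate(window, repeats), []).append(value)
    return {label: mean(values) for label, values in per_class.items()}


# Toy example: a 200 bp chromosome with one annotated repeat block.
windows = make_windows(200)
repeats = [(60, 140, "177bp")]
result = mean_enrichment_by_class(
    ip=[9, 79, 79, 9], inp=[9, 9, 9, 9],
    ctrl_ip=[9, 9, 9, 9], ctrl_inp=[9, 9, 9, 9],
    windows=windows, repeats=repeats,
)
print(result)
```

In the toy data the two windows overlapping the repeat carry an eightfold IP excess, so their mean corrected value is 3 on the log₂ scale, while the background windows sit at 0. Sequencing-depth scaling, which bamCompare handles, is deliberately left out of this sketch.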

      Note: While revising the manuscript we also noticed that the script had a normalization error. We have therefore included a corrected version of these analyses as Figure 2B (old Fig. S1).

      3) Generally, I would plot enrichment on a log2 axis. This concerns several figures with ChIP-seq data.

      Response

      Our ChIP-seq enrichment is calculated by bamCompare. The resulting enrichment values are indeed log2 (IP/Input). We have made this clear in the updated figures/legends.

      4) Fig. 4C - The violin plots are very hard to interpret, as the plots are very narrow compared to the line thickness, making it hard to judge the actual volume. For example, in Centromere 5, YFP-KKT2 is less enriched than 147R-TALE over most of the centromere with some peaks of much higher enrichment (as visible in panel B), however, in panel C, it is very hard to see this same information. I'm sure there is some way to present this better, either using a different type of plot or by improving the spacing of the existing plot.

      Response

      We thank the reviewer for this suggestion; we have elected to provide a Split-Violin plot instead. This improves the presentation of the data for each centromere. The original violin plot in Figure 4C has been replaced with this Split-Violin plot (still Figure 4C).

      5) Fig. 6 - The panels are missing an x-axis label (although it is obvious from the plot what is displayed). Maybe the "WT NO-YFP vs" part that is repeated in all the plot titles could be removed from the title and only be part of the x-axis label?

      Response

      In fact, to save space the X axis was labelled inside each volcano plot but we neglected to indicate that values are a log2 scale indicating enrichment. This has been rectified - see Figure 6, and Fig. S7, S8 and S9.

      6) Fig. 7 - I would like to have a quantification for the examples shown here. In fact, such a quantification already exists in Suppl. Figure 1. I think the relevant plots of that quantification (YFP-KKT2 over 177bp-repeats and centromere-repeats) with some control could be included in Fig. 7 as panel C. This opportunity could be used to show enrichment separated out for intermediate-sized, mini-, and megabase-chromosomes. (relates to minor comment 2 & 8)

      Response

      The CIR147 sequence is found exclusively on megabase-sized chromosomes, while the 177 bp repeats are located on intermediate- and mini-sized chromosomes. Due to limitations in the current genome assembly, it is not possible to reliably classify all chromosomes into intermediate- or mini- sized categories based on their length. Therefore, original Supplementary Fig. S1 presented the YFP-KKT2 enrichment over CIR147 and 177 bp repeats as a representative comparison between megabase chromosomes and the remaining chromosomes (corrected version now presented as main Figure 2B). Additionally, to allow direct comparison of YFP-KKT2 enrichment on CIR147 and 177 bp repeats we have included a new plot in Figure 7C which shows the relative enrichment of YFP-KKT2 on these two repeat types.

      We have added the following text, page 12:

      "Taking into account the relative number of CIR147 and 177 bp repeats in the current T. brucei genome (Cosentino et al., 2021; Rabuffo et al., 2024), comparative analyses demonstrated that YFP-KKT2 is enriched on both CIR147 and 177 bp repeats (Figure 7C)."

      7) Suppl. Fig. 8 A - I believe there is a mistake here: KKT5 occurs twice in the plot, the one in the overlap region should be KKT1-4 instead, correct?

      Response

      Thanks for spotting this. It has been corrected.

      8) The way that the authors mapped ChIP-seq data is potentially problematic when analyzing the same repeat type in different regions of the genome. The authors assigned reads that had multiple equally good mapping positions to one of these mapping positions, randomly. This is perfectly fine when analysing repeats by their type, independent of their position on the genome, which is what the authors did for the main conclusions of the work. However, several figures show the same type of repeat at different positions in the genome. Here, the authors risk that enrichment in one region of the genome 'spills' over to all other regions with the same sequence. Particularly, where they show YFP-KKT2 enrichment over intermediate- and mini-chromosomes (Fig. 7) due to the spillover, one cannot be sure to have found KKT2 in both regions. Instead, the authors could analyze only uniquely mapping reads / read-pairs where at least one mate is uniquely mapping. I realize that with this strict filtering, data will be much more sparse. Hence, I would suggest keeping the original plots and adding one more quantification where the enrichment over the whole region (e.g., all 177bp repeats on intermediate-/mini-chromosomes) is plotted using the unique reads (this could even be supplementary). This also applies to Fig. 4 B & C.

      Response

      We thank the reviewer for their thoughtful comments. Repetitive sequences are indeed challenging to analyze accurately, particularly in the context of short read ChIP-seq data. In our study, we aimed to address YFP-KKT2 enrichment not only over CIR147 repeats but also on 177 bp repeats, using both ChIP-seq and proteomics using synthetic TALE proteins targeted to the different repeat types. We appreciate the referee's suggestion to consider uniquely mapped reads. However, in the updated genome assembly, the 177 bp repeats are frequently followed immediately by long stretches of 70 bp repeats, which can span several kilobases. The size and repetitive nature of these regions exceed the resolution limits of ChIP-seq. It is therefore difficult to precisely quantify enrichment across all chromosomes.

      Additionally, the repeat sequences are highly similar, and relying solely on uniquely mapped reads would result in the exclusion of most reads originating from these regions, significantly underestimating the relative signals. To address this, we used Bowtie2 with settings that allow multi-mapping, assigning reads randomly among equivalent mapping positions, but ensuring each read is counted only once. This approach is designed to evenly distribute signal across all repetitive regions and preserve a meaningful average.
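The trade-off between the two read-handling strategies discussed here can be made concrete with a toy simulation. This sketch does not reproduce Bowtie2's internal behaviour; it only models the stated policy (each multi-mapping read is assigned once, at random, to one of its equally good positions) and contrasts it with a unique-reads-only filter. The locus names, read numbers and function names are hypothetical.

```python
import random
from collections import Counter


def assign_randomly(read_candidates, rng):
    """Assign each read to one of its equally good loci; count every read once."""
    counts = Counter()
    for loci in read_candidates:
        counts[rng.choice(loci)] += 1
    return counts


def count_unique_only(read_candidates):
    """Count only reads that have a single possible mapping position."""
    counts = Counter()
    for loci in read_candidates:
        if len(loci) == 1:
            counts[loci[0]] += 1
    return counts


# Toy scenario: 1000 reads that fit two identical repeat arrays equally well,
# plus 20 reads from a unique flanking sequence next to the first array.
reads = [["repeat_A", "repeat_B"]] * 1000 + [["flank_A"]] * 20

rng = random.Random(0)  # fixed seed so the toy run is repeatable
random_counts = assign_randomly(reads, rng)
unique_counts = count_unique_only(reads)

print("random assignment:", dict(random_counts))
print("unique reads only:", dict(unique_counts))
```

Random assignment keeps all 1020 reads and spreads the repeat-derived signal roughly evenly over both arrays, which preserves the average for a repeat type but cannot say which array the reads came from. The unique-only filter retains just the 20 flanking reads, which shows how much signal strict filtering discards in highly similar repeats.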

      Single molecule methods such as DiMeLo (Altemose et al. 2022; PMID: 35396487) will need to be developed for T. brucei to allow more accurate and chromosome specific mapping of kinetochore or telomere protein occupancy at repeat-unique sequence boundaries on individual chromosomes.

      Reviewer #2 (Significance (Required)):

      This work is of high significance for chromosome/centromere biology, parasitology, and the study of antigenic variation. For chromosome/centromere biology, the conceptual advancement of different types of kinetochores for different chromosomes is a novelty, as far as I know. It would certainly be interesting to apply this study as a technical blueprint for other organisms with mini-chromosomes or chromosomes without known centromeric repeats. I can imagine a broad range of labs studying other organisms with comparable chromosomes to take note of and build on this study. For parasitology and the study of antigenic variation, it is crucial to know how intermediate- and mini-chromosomes are stable through cell division, as these chromosomes harbor a large portion of the antigenic repertoire. Moreover, this study also found a novel link between the homologous repair pathway and variant surface glycoproteins, via the 70bp repeats. How and at which stages during the process, 70bp repeats are involved in antigenic variation is an unresolved, and very actively studied, question in the field. Of course, apart from the basic biological research audience, insights into antigenic variation always have the potential for clinical implications, as T. brucei causes sleeping sickness in humans and nagana in cattle. Due to antigenic variation, T. brucei infections can be chronic.

      Response

      Thank you for supporting the novelty and broad interest of our manuscript.

      My field of expertise / Point of view:

      I'm a computer scientist by training and am now a postdoctoral bioinformatician in a molecular parasitology laboratory. The laboratory is working on antigenic variation in T. brucei. The focus of my work is on analyzing sequencing data (such as ChIP-seq data) and algorithmically improving bioinformatic tools.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Carloni et al. comprehensively analyze which proteins bind repetitive genomic elements in Trypanosoma brucei. For this, they perform mass spectrometry on custom-designed, tagged programmable DNA-binding proteins. After extensively verifying their programmable DNA-binding proteins (using bioinformatic analysis to infer target sites, microscopy to measure localization, ChIP-seq to identify binding sites), they present, among others, two major findings: 1) 14 of the 25 known T. brucei kinetochore proteins are enriched at 177bp repeats. As T. brucei's 177bp repeat-containing intermediate-sized and mini-chromosomes lack centromere repeats but are stable over mitosis, Carloni et al. use their data to hypothesize that a 'rudimentary' kinetochore assembles at the 177bp repeats of these chromosomes to segregate them. 2) 70bp repeats are enriched with the Replication Protein A complex, which, notably, is required for homologous recombination. Homologous recombination is the pathway used for recombination-based antigenic variation of the 70bp-repeat-adjacent variant surface glycoproteins.

      Major Comments

      None. The experiments are well-controlled, claims well-supported, and methods clearly described. Conclusions are convincing.

      Minor Comments

      1. Fig. 2 - I couldn't find an uncropped version showing multiple cells. If it exists, it should be linked in the legend or main text; Otherwise, this should be added to the supplement.
      2. I think Suppl. Fig. 1 is very valuable, as it is a quantification and summary of the ChIP-seq data. I think the authors could consider making this a panel of a main figure. For the main figure, I think the plot could be trimmed down to only show the background and the relevant repeat for each TALE protein, leaving out the non-target repeats. (This relates to minor comment 6.) Also, I believe, it was not explained how background enrichment was calculated.
      3. Generally, I would plot enrichment on a log2 axis. This concerns several figures with ChIP-seq data.
      4. Fig. 4C - The violin plots are very hard to interpret, as the plots are very narrow compared to the line thickness, making it hard to judge the actual volume. For example, in Centromere 5, YFP-KKT2 is less enriched than 147R-TALE over most of the centromere with some peaks of much higher enrichment (as visible in panel B), however, in panel C, it is very hard to see this same information. I'm sure there is some way to present this better, either using a different type of plot or by improving the spacing of the existing plot.
      5. Fig. 6 - The panels are missing an x-axis label (although it is obvious from the plot what is displayed). Maybe the "WT NO-YFP vs" part that is repeated in all the plot titles could be removed from the title and only be part of the x-axis label?
      6. Fig. 7 - I would like to have a quantification for the examples shown here. In fact, such a quantification already exists in Suppl. Figure 1. I think the relevant plots of that quantification (YFP-KKT2 over 177bp-repeats and centromere-repeats) with some control could be included in Fig. 7 as panel C. This opportunity could be used to show enrichment separated out for intermediate-sized, mini-, and megabase-chromosomes. (relates to minor comment 2 & 8)
      7. Suppl. Fig. 8 A - I believe there is a mistake here: KKT5 occurs twice in the plot, the one in the overlap region should be KKT1-4 instead, correct?
      8. The way that the authors mapped ChIP-seq data is potentially problematic when analyzing the same repeat type in different regions of the genome. The authors assigned reads that had multiple equally good mapping positions to one of these mapping positions, randomly. This is perfectly fine when analyzing repeats by their type, independent of their position on the genome, which is what the authors did for the main conclusions of the work. However, several figures show the same type of repeat at different positions in the genome. Here, the authors risk that enrichment in one region of the genome 'spills' over to all other regions with the same sequence. Particularly, where they show YFP-KKT2 enrichment over intermediate- and mini-chromosomes (Fig. 7) due to the spillover, one cannot be sure to have found KKT2 in both regions. Instead, the authors could analyze only uniquely mapping reads / read-pairs where at least one mate is uniquely mapping. I realize that with this strict filtering, data will be much more sparse. Hence, I would suggest keeping the original plots and adding one more quantification where the enrichment over the whole region (e.g., all 177bp repeats on intermediate-/mini-chromosomes) is plotted using the unique reads (this could even be supplementary). This also applies to Fig. 4 B & C.

      Significance

      This work is of high significance for chromosome/centromere biology, parasitology, and the study of antigenic variation. For chromosome/centromere biology, the conceptual advancement of different types of kinetochores for different chromosomes is a novelty, as far as I know. It would certainly be interesting to apply this study as a technical blueprint for other organisms with mini-chromosomes or chromosomes without known centromeric repeats. I can imagine a broad range of labs studying other organisms with comparable chromosomes to take note of and build on this study. For parasitology and the study of antigenic variation, it is crucial to know how intermediate- and mini-chromosomes are stable through cell division, as these chromosomes harbor a large portion of the antigenic repertoire. Moreover, this study also found a novel link between the homologous repair pathway and variant surface glycoproteins, via the 70bp repeats. How and at which stages during the process, 70bp repeats are involved in antigenic variation is an unresolved, and very actively studied, question in the field. Of course, apart from the basic biological research audience, insights into antigenic variation always have the potential for clinical implications, as T. brucei causes sleeping sickness in humans and nagana in cattle. Due to antigenic variation, T. brucei infections can be chronic.

      My field of expertise / Point of view:

      I'm a computer scientist by training and am now a postdoctoral bioinformatician in a molecular parasitology laboratory. The laboratory is working on antigenic variation in T. brucei. The focus of my work is on analyzing sequencing data (such as ChIP-seq data) and algorithmically improving bioinformatic tools.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary

      In this article, the authors used the synthetic TALE DNA binding proteins, tagged with YFP, which were designed to target five specific repeat elements in Trypanosoma brucei genome, including centromere and telomeres-associated repeats and those of a transposon element. This is in order to detect and identified, using YFP-pulldown, specific proteins that bind to these repetitive sequences in T. brucei chromatin. Validation of the approach was done using a TALE protein designed to target the telomere repeat (TelR-TALE) that detected many of the proteins that were previously implicated with telomeric functions. A TALE protein designed to target the 70 bp repeats that reside adjacent to the VSG genes (70R-TALE) detected proteins that function in DNA repair and the protein designed to target the 177 bp repeat arrays (177R-TALE) identified kinetochore proteins associated T. brucei mega base chromosomes, as well as in intermediate and mini-chromosomes, which imply that kinetochore assembly and segregation mechanisms are similar in all T. brucei chromosome.

      Major comments:

      Are the key conclusions convincing?

      The authors reported that they have successfully used TALE-based affinity selection of protein-associated with repetitive sequences in the T. brucei genome. They claimed that this study has provided new information regarding the relevance of the repetitive region in the genome to chromosome integrity, telomere biology, chromosomal segregation and immune evasion strategies. These conclusions are based on high-quality research and it is, basically, merits publication, provided that some major concerns, raised below, will be addressed before acceptance for publication.

      1. The authors used TALE-YFP approach to examine the proteome associated with five different repetitive regions of the T. brucei genome and confirmed the binding of TALE-YFP with Chip-seq analyses. Ultimately, they got the list of proteins that bound to synthetic proteins, by affinity purification and LS-MS analysis and concluded that these proteins bind to different repetitive regions of the genome. There are two control proteins, one is TRF-YFP and the other KKT2-YFP, used to confirm the interactions. However, there are no experiment that confirms that the analysis gives some insight into the role of any putative or new protein in telomere biology, VSG gene regulation or chromosomal segregation. The proteins, which have already been reported by other studies, are mentioned. Although the author discovered many proteins in these repetitive regions, their role is yet unknown. It is recommended to take one or more of the new putative proteins from the repetitive elements and show whether or not they (1) bind directly to the specific repetitive sequence (e.g., by EMSA); (2) it is recommended that the authors will knockdown of one or a small sample of the new discovered proteins, which may shed light on their function at the repetitive region, as a proof of concept.
      2. NonR-TALE-YFP does not have a binding site in the genome, but YFP protein should still be expressed by T. brucei clones with NLS. The authors have to explain why there is no signal detected in the nucleus, while a prominent signal was detected near kDNA (see Fig.2). Why is the expression of YFP in NonR-TALE almost not shown compared to other TALE clones?
      3. As a proof of concept, the author showed that the TALE method determined the same interacting partners enrichment in TelR-TALE as compared to TRF-YFP. And they show the same interacting partners for other TALE proteins, whether compared with WT cells or with the NonR-TALE parasites. It may be because NonR-TALE parasites have almost no (or very little) YFP expression (see Fig. S3) as compared to other TALE clones and the TRF-YFP clone. To address this concern, there should be a control included, with proper YFP expression.
      4. After the artificial expression of repetitive sequence binding five-TALE proteins, the question is if there is any competition for the TALE proteins with the corresponding endogenous proteins? Is there any effect on parasite survival or health, compared to the control after the expression of these five TALEs YFP protein? It is recommended to add parasite growth curves, for all the TALE-proteins expressing cultures.
      5. Since the experiments were performed using whole-cell extracts without prior nuclear fractionation, the authors should consider the possibility that some identified proteins may have originated from compartments other than the nucleus. Specifically, the detection of certain binding proteins might reflect sequence homology (or partial homology) between mitochondrial DNA (maxicircles and minicircles) and repetitive regions in the nuclear genome. Additionally, the lack of subcellular separation raises the concern that cytoplasmic proteins could have been co-purified due to whole cell lysis, making it challenging to discern whether the observed proteome truly represents the nuclear interactome.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      As mentioned earlier, the authors claim that this study has provided new information concerning telomere biology, chromosomal segregation mechanisms, and immune evasion strategies. But there are no experiments that establish a role for any unknown or known protein in these processes. Thus, it is suggested to select one or two proteins of choice from the list and validate their direct binding to the repetitive region(s), and their role in that region of interaction.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      The answer to this question depends on what the authors want to present as the achievements of the present study. If the achievement of the paper is the creation of a new tool for discovering new proteins associated with the repeat regions, I recommend that they add proof of direct interactions between a sample of the newly discovered proteins and the relevant repeats, as the proof of concept discussed above. However, if the authors wish to claim that the study achieved new functional insights into these interactions, they will have to expand the study, as mentioned above, to support the proof of concept.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      I think that they are realistic. If the authors decide to check the capacity of a small sample of proteins (not previously known as repetitive-region-binding proteins) to interact directly with the repeated sequence, it will add substantially to the study (e.g., by EMSA; estimated time: 1 month). If the authors also decide to check the function of at least one such newly detected protein (e.g., by KD), I estimate this will take 3-6 months.

      Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      Are the experiments adequately replicated, and statistical analysis adequate?

      The authors did not mention replicates. There is no statistical analysis mentioned.

      Minor comments:

      Specific experimental issues that are easily addressable.

      The following suggestions can be incorporated:

      1. Page 18: in the Materials and Methods section, the authors mention four drugs: blasticidin, phleomycin, G418 and hygromycin. It is recommended to state the purpose of using these selective drugs for the parasite. If clonal selection was performed, this should also be mentioned.
      2. In the Methods section, the authors state that there is only one binding site for NonR-TALE in the parasite genome, but Fig. 1C shows zero binding sites. So, is there one binding site for NonR-TALE-YFP in the genome, or zero?
      3. The authors used two different anti-GFP antibodies, one from Roche and the other from Thermo Fisher. Why were two different antibodies used for the same protein?
      4. Page 6: in the introduction, the authors give the number of total VSG genes as 2,634. Is it known how many of them are pseudogenes?
      5. I found several typos throughout the manuscript.
      6. Fig. 1C: Table: below TOTAL 2nd line: the number should be 1838 (rather than 1828)

      Are prior studies referenced appropriately?

      Yes

      Are the text and figures clear and accurate?

      Yes

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Suggested above

      Significance

      Describe the nature and significance of the advance (e.g., conceptual, technical, clinical) for the field:

      This study represents a significant conceptual and technical advancement by employing a synthetic TALE DNA-binding protein tagged with YFP to selectively identify proteins associated with five distinct repetitive regions of T. brucei chromatin. To the best of my knowledge, it is the first report to utilize TALE-YFP for affinity-based isolation of protein complexes bound to repetitive genomic sequences in T. brucei. This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology. Importantly, any essential or unique interacting partners identified could serve as potential targets for therapeutic intervention.

      Place the work in the context of the existing literature (provide references, where appropriate).

      I agree with the information already described in the submitted manuscript regarding the potential contribution of the resulting data and the established technology to the study of VSG expression, kinetochore mechanisms and telomere biology.

      State what audience might be interested in and influenced by the reported findings.

      These findings will be of particular interest to researchers studying the molecular biology of kinetoplastid parasites and other unicellular organisms, as well as scientists investigating chromatin structure and the functional roles of repetitive genomic elements in higher eukaryotes.

      1. Define your field of expertise with a few keywords to help the authors contextualize your point of view. 2. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      1. Protein-DNA interactions/ chromatin/ DNA replication/ Trypanosomes
      2. None
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      This is a reply/revision plan, not definitive. Planned and already implemented revisions are underlined.

      First of all, we wish to express our gratitude to the reviewers: they helped to improve the paper.

      Reviewer #1:

      Reviewer #1 wrote: Major Comments: 1. Differential gene/pathway analysis across epithelial clusters: What are the differential genes or pathways among the epithelial clusters? Without CCA/Harmony integration, do the tumor subgroups show distinct differences? In addition, I suggest applying NMF or hdWGCNA to identify shared modules and test whether ATC and PTC harbor overlapping regulatory modules.

      Reply plan: Both reviewers suggested some regulatory network analysis. We propose to run SCENIC+ (Nature Methods, 2023, https://doi.org/10.1038/s41592-023-01938-4) on our data.

      Reviewer #1 wrote: 2. Validation of TSHR/TPO-based subgrouping: While the TSHR/TPO grouping appears appropriate for stratification at the single-cell level, it is necessary to exclude sequencing depth as a confounding factor. The authors should validate the existence of these subpopulations using mIHC/IF on corresponding samples.

      Reply plan: We made claims about RNA expression, not protein expression. Thus, validation should be at the RNA level:

      • We already replicated part of our analysis on the dataset published by Lu et al. (JCI 2023, https://doi.org/10.1172/JCI169653), see Figs. 3 and 4. This effort will be extended to all single cell analysis results from our study in the revised paper.
      • We will also present plots demonstrating that the sequencing depth is similar in the different cancer cell subgroups, further excluding it as a confounding factor.

      Reviewer #1 wrote: 3. Impact of mutational differences on conclusions: According to Supplementary Table 1, almost all PTC cases carried BRAF mutations, whereas four ATC patients harbored no BRAF mutation. Could this difference influence the conclusions of the study? Although the authors briefly mention this in the Discussion, a more thorough clarification is warranted.

      Reply plan: The dataset of Lu et al. includes BRAF-mutated ATCs along with BRAF-mutated PTCs. Therefore, the replication mentioned earlier will also address these concerns. In fact, Fig. 4E-I already confirms the ordered loss of markers in the Lu et al. data. Replication will be extended to other results of the study and emphasized more strongly in the paper.

      Reviewer #1 wrote: 4. The statement "Myeloid and T cells also grouped in specific clusters" seems descriptive. Is this clustering biologically meaningful? Please elaborate.

      Reply plan: This is an important point, and accordingly, a cell mixing experiment was specifically designed to separate technical effects from biological effects. We therefore know with certainty that the patient-specific myeloid and T cell clusters are the result of biological variation (Fig. 1). We further demonstrate that part of this variation is associated with hypoxia (Supp. Fig. 4). So yes, the clustering is biologically meaningful.

      Reviewer #1 wrote: Minor Comments: In Figure 2C, the "Epith TSHR-" population resembles myeloid cells. Could the authors clarify why this is the case? For the correlation analysis in Figure 2C, were highly variable genes or all genes used?

      Reply plan: There is a simple explanation: the Epith TSHR- population the reviewer is referring to consists of cells from anaplastic thyroid cancers (ATC), which are tumors notoriously infiltrated by macrophages (Supp. Fig. 4). A high correlation between the proportions of Epith TSHR- cells and macrophages across our panel of ATC and papillary cancer (PTC) is therefore expected. Among other things, Fig. 2C shows that high correlation, but it is not meant to show, and does not show, that Epith TSHR- cells and macrophages "resemble" one another. It shows that their proportions are highly positively correlated. This correlation analysis does not rely on gene expression but on cell type proportions. It measures co-occurrence rather than resemblance. The text has been clarified in order to prevent any confusion.
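
      To illustrate the distinction between co-occurrence and resemblance, here is a minimal sketch of a correlation computed on cell type proportions rather than on gene expression. The per-tumor fractions are hypothetical placeholders chosen for illustration, not values from the study, and the helper `pearson` is an assumed name rather than the code actually used for Fig. 2C.

      ```python
      from math import sqrt

      def pearson(xs, ys):
          """Pearson correlation between two equal-length sequences."""
          n = len(xs)
          mean_x = sum(xs) / n
          mean_y = sum(ys) / n
          cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          var_x = sum((x - mean_x) ** 2 for x in xs)
          var_y = sum((y - mean_y) ** 2 for y in ys)
          return cov / sqrt(var_x * var_y)

      # Hypothetical per-tumor cell type fractions (NOT values from the study).
      # Each position is one tumor; the last two mimic macrophage-rich ATCs.
      epith_tshr_neg = [0.02, 0.05, 0.03, 0.40, 0.55]
      macrophages = [0.04, 0.06, 0.05, 0.30, 0.35]

      # The inputs are proportions per sample, so a high value means the two
      # cell types tend to be abundant in the same tumors. No gene expression
      # profile enters the calculation.
      r = pearson(epith_tshr_neg, macrophages)
      print(f"correlation of proportions across samples: {r:.2f}")
      ```

      With inputs of this kind, a high coefficient only says that the two populations are abundant in the same tumors; it carries no information about how similar their transcriptomes are.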

      Reviewer #2:

      Reviewer #2 wrote: 1. This study largely confirms established facts that 1) PTC due to BRAF driver mutation is a heterogeneous tumour entity and 2) ATC is the most dedifferentiated of all thyroid cancers. Although interesting, observations of a highly variable tissue cell composition including immune cells and the gradual loss of thyroid differentiation markers, in part linked to tumor subclone development featured by altered chromosomal copy numbers, are thus not surprising.

      Reply plan: We wish to respectfully express our take on this perception of the work:

      • There is a difference between conjecturing a high heterogeneity in the cell composition of thyroid cancers and establishing it with the level of accuracy and quantitative rigor our analysis provides. The extreme amplitude of that variation was surprising to us: the microenvironment makes up from 8.4% to 80% of the cells in PTCs driven by the same BRAF mutation.
      • We don't simply show that a subclone characterized by a large number of copy number events is less differentiated. We go all the way to proving that those copy number alterations are associated with specific cell states that produce specific histology (Fig. 5). It required a combination of single cell transcriptomics, spatial transcriptomics and sophisticated computational analysis to establish that connection between genomic changes and histology. The fragmentation of epithelial sheets uncovered by CNV analysis had at first escaped our attention and that of our pathologist colleagues; to our knowledge, this is not a parameter typically assessed in diagnostics.
      • We don't simply show that there is a gradual loss of differentiation markers: this loss is ordered in a very specific way that mirrors the gain of markers during thyroid organoid differentiation.

      Reviewer #2 wrote: 2. Considering tumor progression, comparison of PTC and ATC should preferably include specimens with the same driver mutation (BRAF or RAS), which is not the case here. This notion should be more clearly explained to readers. An optional improvement would be to conduct similar analyses on an ATC specimen that contains more differentiated PTC tumor portions arguably suggesting that PTC progresses to ATC (by mechanisms that are still largely unexplored).

      Reply plan: This is clearly a limitation of our study. As already proposed in our reply to Reviewer #1, we will extend the replication of our analysis in the dataset of Lu et al., which includes ATCs and PTCs harboring the BRAF mutation, to all our single cell results.

      Reviewer #2 wrote: 3. Comments on findings of lymphocytic infiltration need to be balanced. Although autoimmune thyroid disease is inferred to be a risk factor for developing malignancy, it is unlikely that the majority of TCGA samples of PTC is associated with thyroiditis as indicated in Fig. 3 and Suppl Fig. 3. Immune cell abundance may rather reflect the tumor immune microenvironment (TIME).

      Reply plan: The figure the reviewer is referring to demonstrates that PTC occurring in a background of thyroiditis also has a higher proportion of B cells. We did not claim, and the figure did not show, that "the majority of TCGA samples of PTC is associated with thyroiditis", because they are not. This point has been clarified.

      Reviewer #2 wrote: 4. Some tissue sections seem of quite poor quality, either shape-wise or containing rifts, e.g. PTC7 in Fig. 3 and PTC2 in Fig. 5. The authors should explain whether and how this might influence analysis.

      Reply plan: Spatial transcriptomics is typically performed on frozen sections, which are obviously of lower visual quality than slices from FFPE-preserved samples. Since no computational analyses were performed on the image, this lower quality has no impact on our results. Regarding RNA quality, the RINs were >7 for all tumors. RINs are now presented in Supp. Table S1.

      Reviewer #2 wrote: 5. The experiment on mouse ESC/organoids (Fig. 6H-J) does not show much of an expected enhanced thyroid progenitor cell proliferation after induction of the mutant Braf allele by tamoxifen, which raises doubt whether the subsequent promoted growth by fibronectin is at all oncogene-related. This differs from the impact of BrafCA activation along with mouse thyroid development in vivo (Schoultz et al iScience 2023 PMID: 37534159). In the same experimental setup, it appears that mutant Braf prevents follicle formation (Fig. 6I). A control experiment investigating the influence of fibronectin in the absence of oncogene activation should be conclusive. The effect of Braf and fibronectin on thyroid organoid structure and function should be better explained, if necessary based on complementary experiments, and discussed in relation to the claimed association of fibronectin expression to "...low amounts of thyroid differentiation markers..." and "...loss of epithelial structure (PTC7, Figure 6E)." in the previous section of Results.

      Reply plan: The induction of the mutant Braf allele for 7 days increases the percentage of BrdU+ cells 1.43-fold (p-value for Wilcoxon test = 0.035). The effects observed by Schoultz et al. are certainly more dramatic, but they result from an oncogenic activity spanning 1 to 6 months (4 to 26 times longer) in an in vivo model. Most importantly, oncogenic activity is initiated in Nkx2.1+ cells and not Tg+ cells, thus much earlier during development. These two models are thus not comparable. As for the effects of fibronectin on thyroid structure, we do not claim that our organoid model recapitulates the complex interactions between cancer cells and their microenvironment that shape tissue morphology in vivo. This is now clarified in the text.
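
      The "4 to 26 times longer" figure quoted above can be checked with a short calculation. This is only a sketch of the arithmetic, assuming a month is approximated as 30 days; the variable names are illustrative.

      ```python
      # Durations taken from the reply: 7 days of induction in organoids versus
      # 1 to 6 months of oncogenic activity in the in vivo model.
      DAYS_PER_MONTH = 30  # assumption: a month is approximated as 30 days

      organoid_days = 7
      in_vivo_days = [1 * DAYS_PER_MONTH, 6 * DAYS_PER_MONTH]

      # Ratio of in vivo exposure time to organoid exposure time.
      ratios = [days / organoid_days for days in in_vivo_days]
      rounded = [round(r) for r in ratios]
      print(rounded)
      ```

      Under this approximation the ratios round to 4 and 26, which matches the range stated in the reply.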

      We presented controls with no oncogene expression and no Fn1, controls with oncogene induction and no Fn1, and organoids with oncogene induction and Fn1 treatment. This alone establishes the effect of Fn1 on induced organoids, which was our goal. We regard it as a novel and interesting but non-essential development in our paper.

      As the reviewer points out, while our results show an increased proliferation in Braf-mutated organoids treated with Fn1, they do not allow us to conclude on any potential interaction between Fn1 and the oncogenic process. The suggested experiment with Fn1 in absence of oncogene activation would add information, but we cannot follow up for practical lab management reasons detailed in Section 4 below.

      Reviewer #2 wrote: 6. Concerning EMT profiling (Supplementary Fig. 7B), there is a great similarity of ATC tumor cells and fibroblasts, and as stated in the text the malignant status of the former is indicated by chromosomal aberrations (referring to Suppl Fig. 6). However, looking at Suppl. Fig. 7B it is evident that fractions of cells identified as fibroblasts express TG and TSHR, suggesting mismatch. How was this comparison done in order to exclude mismatch? Are there no other profiled markers that distinguish cancer cells from stromal cells that can support conclusions?

      Reply plan: TG, a thyrocyte marker, seems to be expressed by fibroblasts in Supplementary Figure 7B. The reviewer suggests this could be caused by an incomplete distinction between bona fide fibroblasts and thyrocytes in an advanced EMT state. We argue that:

      • Ambient TG RNA leaking out of thyrocyte nuclei contaminates the transcriptomes of all cell types. It is a well-known technical problem, with dedicated software packages to mitigate it. We preprocessed our data with one of them, SoupX, which corrected for most, but not all, ambient RNA contamination.
      • The plot below shows that there is nothing special about fibroblasts in that respect. For example, B and T cells are contaminated by TG at levels comparable to fibroblasts, and endothelial cells and pericytes at higher levels.
      • In addition, the UMAP of Fig. 2A shows that EMT cells and fibroblasts form very distinct clusters. Furthermore, the fibroblast cluster, but not the two EMT clusters, contains cells from PTC, and the PTC cluster does not contain cells with DNA copy number aberrations. Thus, although both EMT cells and fibroblasts express the typical mesenchymal markers of Supplementary Fig. 7B, they are easy to distinguish on the basis of their overall transcriptomes.
      • The panel below has been added to the Supplementary Figure 7B. [Panel cannot be displayed here]

      Reviewer #2 wrote: In the same figure, it appears there are no clear differences in EMT marker expression among PTC samples regardless of differentiation state, suggesting that the gradual loss of thyroid differentiation in PTC tumor cells and EMT are not parallel and potentially linked phenomena? Please clarify this dissociation of results. Is it possible that refocusing on other EMT markers than the top 10, of which almost all concern various collagen genes, might better reveal partial EMT in PTCs?

      Reply plan: The technical basis of this comment is related to the previous point. Our perception is that the mesenchymal markers in Supplementary Fig. 7B show a binary effect, i.e. strong expression in ATC and no expression in PTC (beyond ambient RNA noise), not a gradual effect. Thus, there is no correlation of COL1A1 and other mesenchymal markers with dedifferentiation in PTC, as these markers are not expressed beyond the noise level of the experiments. A lot has been written about EMT in PTC, but one of the findings of our study is that while ATCs undergo full EMT, EMT in PTC is very limited. PTCs express FN1 but no other major mesenchymal markers such as collagens I and III, for example.

      Reviewer #2 wrote: 7. According to Suppl. Table 1, the ATC2 tumor does not harbor any mutations. What about chromosomal aberrations, was that included in analysis? Considering previous consistent reports of a high mutation burden in ATC, if not supported by other data (clinical, pathological) the diagnosis might be questioned for this particular case included in multiple analyses of the present study.

      Reply: There is little doubt about the diagnosis of ATC2 by our pathologist collaborators:

      • The histology of this tumor is strikingly anaplastic, i.e. without structure, as shown in the image below.
      • This tumor has a high level of macrophages infiltration typical of ATCs (Supplementary Fig. 4).

      Reviewer #2 wrote: Minor comments: The logical order of presentation of Results might benefit from first presenting specific PTC data followed by ATC ditto. I'm thinking of swapping the section of EMT in ATC to end of Results.

      Reply plan: We do not see why the reviewer suggests this. We believe that discussing the microenvironment first, then the tumor cells, brings conciseness and clarity to how we propose to stratify the latter. By contrast, the suggested tumor type-centered structure entails going back and forth between the microenvironment and tumor cells, diluting the messages about both.

      Reviewer #2 wrote: Methods paragraph "Mouse ESC-derived thyroid organoids experiments" (starting with "ccc") seems to be missing some essential information.

      Reply plan: A sentence was missing, indeed, and has been re-introduced in the manuscript. We thank the reviewer for catching that error.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Expression pattern profiling of human thyroid cancer tissues by combining single cell/nuclei RNAseq analysis, spatial transcriptomics and immunofluorescence on corresponding tumor histologic sections. Papillary and anaplastic thyroid carcinomas (PTC n=10 and ATC n=4) were compared; some data were extracted from TCGA. The results indicate that ATCs consist of completely dedifferentiated tumor cells whereas PTCs show variable levels of dedifferentiation, which in a sense mimics the reverse process of thyroid differentiation as observed in stem cell-based organoids. Moreover, PTC and ATC tumors show different levels of epithelial-mesenchymal transition. Fibronectin is inferred to have a role in promoting tumor growth, supported by functional studies on organoids. The authors suggest that global profiling of differentiation state is a promising technique to stratify tumor heterogeneity, which might be useful for distinguishing thyroid malignancies suitable or not for adjuvant treatment, e.g. with radioiodine (RAI) therapy.

      Major comments:

      1. This study largely confirms established facts that 1) PTC due to BRAF driver mutation is a heterogeneous tumour entity and 2) ATC is the most dedifferentiated of all thyroid cancers. Although interesting, observations of a highly variable tissue cell composition including immune cells and the gradual loss of thyroid differentiation markers, in part linked to tumor subclone development featured by altered chromosomal copy numbers, are thus not surprising.
      2. Considering tumor progression, comparison of PTC and ATC should preferably include specimens with the same driver mutation (BRAF or RAS), which is not the case here. This notion should be more clearly explained to readers. An optional improvement would be to conduct similar analyses on an ATC specimen that contains more differentiated PTC tumor portions arguably suggesting that PTC progresses to ATC (by mechanisms that are still largely unexplored).
      3. Comments on findings of lymphocytic infiltration need to be balanced. Although autoimmune thyroid disease is inferred to be a risk factor for developing malignancy, it is unlikely that the majority of TCGA samples of PTC is associated with thyroiditis as indicated in Fig. 3 and Suppl Fig. 3. Immune cell abundance may rather reflect the tumor immune microenvironment (TIME).
      4. Some tissue sections seem of quite poor quality, either shape-wise or containing rifts, e.g. PTC7 in Fig. 3 and PTC2 in Fig. 5. The authors should explain whether and how this might influence analysis.
      5. The experiment on mouse ESC/organoids (Fig. 6H-J) does not show much of an expected enhanced thyroid progenitor cell proliferation after induction of the mutant Braf allele by tamoxifen, which raises doubt whether the subsequent promoted growth by fibronectin is at all oncogene-related. This differs from the impact of BrafCA activation along with mouse thyroid development in vivo (Schoultz et al iScience 2023 PMID: 37534159). In the same experimental setup, it appears that mutant Braf prevents follicle formation (Fig. 6I). A control experiment investigating the influence of fibronectin in the absence of oncogene activation should be conclusive. The effect of Braf and fibronectin on thyroid organoid structure and function should be better explained, if necessary based on complementary experiments, and discussed in relation to the claimed association of fibronectin expression to "...low amounts of thyroid differentiation markers..." and "...loss of epithelial structure (PTC7, Figure 6E)." in the previous section of Results.
      6. Concerning EMT profiling (Supplementary Fig. 7B), there is a great similarity of ATC tumor cells and fibroblasts, and as stated in the text the malignant status of the former is indicated by chromosomal aberrations (referring to Suppl Fig. 6). However, looking at Suppl. Fig. 7B it is evident that fractions of cells identified as fibroblasts express TG and TSHR, suggesting mismatch. How was this comparison done in order to exclude mismatch? Are there no other profiled markers that distinguish cancer cells from stromal cells that can support conclusions? In the same figure, it appears there are no clear differences in EMT marker expression among PTC samples regardless of differentiation state, suggesting that the gradual loss of thyroid differentiation in PTC tumor cells and EMT are not parallel and potentially linked phenomena? Please clarify this dissociation of results. Is it possible that refocusing on other EMT markers than the top 10, of which almost all concern various collagen genes, might better reveal partial EMT in PTCs?
      7. According to Suppl. Table 1, the ATC2 tumor does not harbor any mutations. What about chromosomal aberrations, was that included in analysis? Considering previous consistent reports of a high mutation burden in ATC, if not supported by other data (clinical, pathological) the diagnosis might be questioned for this particular case included in multiple analyses of the present study.

      Minor comments:

      • The logical order of presentation of Results might benefit from first presenting specific PTC data followed by ATC ditto. I'm thinking of swapping the section of EMT in ATC to end of Results.
      • Methods paragraph "Mouse ESC-derived thyroid organoids experiments" (starting with "ccc") seems to be missing some essential information.

      Significance

      The study confirms at single cell level the fundamental difference of PTC and ATC that is evident clinically and biologically, but does not address the intriguing issue how ATC may progress from PTC.

      Tumor heterogeneity of BRAFV600E-driven PTC in terms of dedifferentiation of functional parameters, which are of potential clinical relevance, is well documented.

      Reviewer expertise: thyroid development, thyroid cell and tumor biology, superficial knowledge in scRNAseq analysis

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study is well designed with a rational sample collection strategy. The authors collected PTC and ATC tissue samples for snRNA and spRNA sequencing, clearly characterizing tumor heterogeneity. Using representative thyroid differentiation markers (TSHR, TPO, TG, NIS), they distinguished different differentiation states of PTC and ATC and further validated the role of FN1 in organoid models. However, the manuscript is largely descriptive in nature, and several key issues remain to be addressed.

      Major Comments:

      1. Differential gene/pathway analysis across epithelial clusters: What are the differential genes or pathways among the epithelial clusters? Without CCA/Harmony integration, do the tumor subgroups show distinct differences? In addition, I suggest applying NMF or hdWGCNA to identify shared modules and test whether ATC and PTC harbor overlapping regulatory modules.
      2. Validation of TSHR/TPO-based subgrouping: While the TSHR/TPO grouping appears appropriate for stratification at the single-cell level, it is necessary to exclude sequencing depth as a confounding factor. The authors should validate the existence of these subpopulations using mIHC/IF on corresponding samples.
      3. Impact of mutational differences on conclusions: According to Supplementary Table 1, almost all PTC cases carried BRAF mutations, whereas four ATC patients harbored no BRAF mutation. Could this difference influence the conclusions of the study? Although the authors briefly mention this in the Discussion, a more thorough clarification is warranted.
      4. The statement "Myeloid and T cells also grouped in specific clusters" seems descriptive. Is this clustering biologically meaningful? Please elaborate.

      Minor Comments:

      In Figure 2C, the "Epith TSHR-" population resembles myeloid cells. Could the authors clarify why this is the case? For the correlation analysis in Figure 2C, were highly variable genes or all genes used?

      Significance

      This study provides a comprehensive single-nucleus and spatial transcriptomic atlas of papillary and anaplastic thyroid carcinomas. Its strengths include well-designed sample collection, high-resolution profiling of tumor heterogeneity, and validation of FN1 function. By stratifying malignant cells with thyroid differentiation markers (TSHR, TPO, TG, NIS), the authors delineate differentiation states and highlight mechanisms of progression from PTC to ATC. However, the study remains mainly descriptive, and additional analyses of gene modules, pathway regulation would increase its conceptual depth. The findings will interest researchers in thyroid cancer, tumor heterogeneity, and the single-cell/spatial genomics field, with potential relevance for translational oncology.

      Field of expertise: thyroid cancer biology, single-cell and spatial transcriptomics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Detailed point-by-point response

      The Reviewers provided suggestions to improve the manuscript, most notably by adding experiments to (1) further support the role of Stim and Orai in epidermal heat-off responses and (2) further characterize the thermosensory responses of epidermal cells. We additionally propose to include a new set of calcium imaging experiments to visualize nociceptor sensitization by epidermal cells.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Drosophila larvae are known to respond to noxious stimuli by rolling. The authors propose that this response arises not only by sensory response of nociceptive neurons but also by direct response of larval epidermal cells. They go on to test this idea by independently manipulating epidermal cells and nociceptive sensory neurons using GAL4 lines, GCaMPs and RNAis. The behavioural data are convincing and presented clearly with good statistical analysis. However, the involvement of epidermal cells in evoking the behaviour, as well as STIM/Orai-mediated Ca2+ entry, requires further experiments. Use of another independent GAL4 strain for epidermal cells, alternate RNAi lines for STIM and Orai, mutants for STIM and Orai and overexpression constructs for STIM and Orai would significantly enhance the data. Thus, as of now the key results require more convincing. The following additional experiments would be required to support their claims:

      1) Either use a second epidermal GAL4 strain to show key results OR provide images of the epidermal GAL4 expression double-labelled with a ppk driver using a different fluorescent protein to establish NO overlap of the epidermal GAL4 with neurons. These strains should be available free in Bloomington.

      We agree that the specificity of the GAL4 driver is an important point. In a recent publication (Yoshino et al, eLife, 2025) we provide the most comprehensive analysis of larval epidermal GAL4 drivers published to date. Included in this study is expression analysis of R38F11-GAL4 demonstrating that it is indeed specifically expressed in the epidermis. Based on the detailed expression analysis and functional analysis provided in that paper, R38F11-GAL4 was chosen for these studies as it is both highly specific for epidermal cells and provides uniform expression across the body wall.

      In our revised manuscript, we will more clearly detail how the driver was chosen for this study and provide a citation to the prior work to accompany our description of R38F11-GAL4 as an epidermis-specific driver line.

      2) Authors need to provide better data for the involvement of STIM and Orai in the Calcium responses observed. A single RNAi for each gene with marginal change in response is insufficient. The authors also do not state if the RNAis used are validated by them or anyone else. Minimally they should repeat their experiments with at least one other validated RNAi and rescue these with overexpression constructs of STIM and Orai (available in Bloomington). It is well established in literature that overexpression of STIM/Orai can rescue SOCE in Drosophila. Ideally, to be fully convincing they should test a Drosophila knockout for STIM (available in Bloomington). Heterozygotes of this are viable and should be tested. Additionally a UAS Orai dominant negative (OraiDN) strain is available in Bloomington and can be tested.

      We appreciate the Reviewer’s perspective on the importance of characterizing the efficacy of the reagents we used in this study. However, we disagree with the characterization of the change in response as “marginal”. Our results demonstrate that epidermal knockdown of Stim or Orai causes a significant reduction in the heat-off response of epidermal cells and heat-induced nociceptive sensitization.

      In a prior published study (Yoshino et al, eLife, 2025) we validated the efficacy of these RNAi lines in combination with the same GAL4 driver at the same developmental stage. Specifically, we demonstrated that R38F11-GAL4-mediated expression of UAS-Stim RNAi or UAS-Orai RNAi significantly attenuated store-operated calcium entry following store depletion by thapsigargin. In the revised manuscript, we will add a statement referring to this prior validation along with a citation. In light of this prior characterization, we disagree that additional RNAi lines are required to corroborate the results.

      The most salient point of the Reviewer’s comment is that additional evidence should be provided to demonstrate more convincingly the requirement of Stim/Orai in epidermal heat-off responses. We detail our plans to address this point below, but first address the specific experimental suggestions the Reviewer provides.

      First, the Reviewer suggests the use of a dominant-negative version of Orai, and we agree that this could prove complementary to our RNAi experiments.

      The Reviewer suggests two additional genetic approaches which are well-reasoned but problematic. First, they suggest rescuing the RNAi knockdowns with overexpression approaches. In addition to requiring the generation of new, RNAi-refractory transgenes, this approach is confounded by the effects of overexpressing CRAC channel components. Orai channels exhibit highly cooperative activation by Stim, and we previously showed that epidermal Stim overexpression drove mechanical nociceptive sensitization. Although this dosage effect confounds the rescue assays, we will examine whether epidermal Stim overexpression similarly sensitizes larvae to noxious thermal inputs as we would predict from our model.

      The final experiment the Reviewer suggests – phenotypic analysis of Stim knockouts – is not possible due to the lethal phase of the mutants. Furthermore, it is not possible using traditional mosaic analysis to generate mutant epidermal clones that span the entire epidermis. Such an approach might be possible with a newly engineered FLP-out Stim allele, but generating that reagent is beyond the scope of this work. The Reviewer suggests characterization of Stim heterozygotes, but Drosophila genes rarely show strong dosage effects as heterozygotes (though we acknowledge that dosage effects can be amplified in the cases of genetic interactions), hence a negative result (no effect on heat-off responses) would not be meaningful. In principle we could test whether Stim heterozygosity enhances effects of epidermal Stim RNAi. Although a negative result will not be telling, the experiment is straightforward, and an enhancement of the effect of Stim RNAi would support the model that RNAi provides an incomplete functional knockdown of Stim. We will therefore perform this experiment and incorporate the results into the revised manuscript, pending a positive outcome.

      To better define the contributions of Stim and Orai to heat-off responses of epidermal cells, we will incorporate results from the following new experiments into our revised manuscript:

      • We will monitor effects of epidermis-specific expression of a dominant negative form of Orai on epidermal heat-off responses (calcium imaging) and heat-induced nociceptive sensitization (behavioral assays).
      • We will monitor effects of epidermis-specific co-expression of Stim+Orai RNAi on epidermal heat-off responses (calcium imaging) and heat-induced nociceptive sensitization (behavioral assays).
      • Orai channels exhibit highly cooperative activation by Stim, therefore we will examine whether epidermal Stim overexpression increases the amplitude of heat-off responses (calcium imaging) and sensitizes larvae to noxious thermal inputs (behavioral assays) as we would predict from our model.

        Minor comments that can be addressed:

      1) Figure 1: Further details required on how the rolling response is measured. Figure is uninformative. A video would be really helpful.

      We appreciate the suggestion. We will add a more detailed explanation of how the behaviors were scored along with an annotated video.

      2) I could not find Figure 1I described in the text. This section should be explained properly.

      Figure 1I is described in the figure legend and we will add an in-text citation.

      3) Figure 3: There appears to be a small response at 32°C - why is this ignored in the text? It would be useful to have S3 in the main figure.

      The small response at 32°C is not ignored, though that individual response is better understood in the context of all responses plotted in Figure 3D. We will reword the phrase “At temperature maxima below 35°C epidermal cells rarely exhibited heat-off responses” to reflect the small response that is observed at lower temperatures. We will also replace the trace in the figure – the original submission contained the one outlier sample that exhibited robust responses at 32°C.

      We appreciate the suggestion to include Fig S3 in the main text – we initially included it, but moved it to the supplement for space considerations. We will include it as a main figure in our revised submission.

      4) Fig 4: The DF/F traces for the two RNAis should be included in this figure.

      We appreciate the suggestion; we will add these traces to our revised submission.

      5) Extent of knockdown in the epidermis by each RNAi should be shown by RTPCRs.

      We note that efficacy of the knockdowns has been validated by us in acutely dissociated epidermal cells. RTPCR validation as described would require FACS-sorting of acutely dissociated, GFP-labeled epidermal cells from each specimen, an extremely time- and resource-intensive experiment that provides limited information. The more relevant information is the physiological readout of Stim/Orai functional knockout using these reagents, which we previously conducted. As described above, we will add a description of these experiments and the relevant citation.

      6) The authors need to explain why only a small change in the Ca2+ response is seen with either RNAi. Are there other Ca2+ channels involved? Ideally they could test mutants/RNAi for the TRP channel family. Loss of SOCE in Drosophila neurons changes the expression of other membrane channels - is this possible here? Minimally, this possibility needs to be discussed.

      We agree with the Reviewer that this topic warrants further discussion. Pending the results of our planned experiments (Orai dominant negative, Stim+Orai RNAi), we will incorporate a discussion of other channels that may contribute to the heat-off response. We appreciate the Reviewer’s point that loss of SOCE in Drosophila neurons can change the expression of membrane channels – that is an intriguing possibility that might explain the modest effects of Stim or Orai knockdown. We have not investigated effects of epidermal Stim/Orai knockdown on expression of other channels, but will incorporate this possibility into our discussion.

      7) In the methods section please explain how the % DF/F calculations are done and how are they normalised to the ionomycin response.

      We will incorporate these additional details in the methods section.
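
      The thread does not state how the % DF/F values or the ionomycin normalization are computed, so the following is only a minimal sketch of one common convention, not the authors' procedure. It assumes that F0 is the mean of a pre-stimulus baseline window and that the stimulus-evoked peak is expressed as a fraction of the ionomycin-evoked peak. The function names, window boundaries, and the example trace are all hypothetical.

```python
from statistics import mean


def delta_f_over_f(trace, baseline_frames):
    """Return %dF/F0 per frame.

    Assumption (not from the source): F0 is the mean fluorescence of the
    first `baseline_frames` samples of the trace.
    """
    f0 = mean(trace[:baseline_frames])
    return [100.0 * (f - f0) / f0 for f in trace]


def normalize_to_ionomycin(dff, stim_window, iono_window):
    """Express the peak stimulus response as a fraction of the ionomycin peak.

    Assumption (not from the source): both responses are summarized by the
    maximum %dF/F0 inside a half-open frame window (start, stop).
    """
    stim_peak = max(dff[stim_window[0]:stim_window[1]])
    iono_peak = max(dff[iono_window[0]:iono_window[1]])
    return stim_peak / iono_peak


# Hypothetical raw fluorescence trace (arbitrary units):
# frames 0-3 baseline, frames 4-7 heat-off response, frames 8-9 ionomycin.
trace = [100, 100, 100, 100, 150, 130, 110, 100, 300, 280]

dff = delta_f_over_f(trace, baseline_frames=4)
ratio = normalize_to_ionomycin(dff, stim_window=(4, 8), iono_window=(8, 10))
print(dff[4], ratio)
```

      Other conventions (a sliding baseline, background subtraction, or area under the curve instead of peak) would give different numbers, which is why the Reviewer's request for explicit methods matters.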

      8) Authors need to look at previous work on STIM and Orai in Drosophila and reference appropriately.

      We appreciate the suggestion and will incorporate additional discussion of relevant Drosophila work on STIM and Orai.

      **Referees cross-commenting**

      Reviewers 2 and 3 have raised some additional queries to what I had mentioned in my review. I agree with their comments. The authors should attempt to address all comments by all three reviewers.

      We address their comments below.

      Reviewer #1 (Significance (Required)):

      This is an interesting study that identifies epidermal cells in Drosophila with the ability to sense a drop in temperature after receiving noxious heat stimuli and invoke appropriate behaviour. Behaviour experiments are well conducted and convincing. So far only nociceptive neurons were thought to control such behavioural responses so the work is significant and important for the field. The mechanism identified needs further convincing and I have suggested experiments that would be of help. With the additional experiments suggested the work will be of interest to neuroethologists, Drosophila neuroscientists and scientists in the field of Ca signaling.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Noxious heat can have a strong adverse effect on animals, resulting in sensitization when noxious thermal stimuli are applied repeatedly. Noxious heat induces a characteristic rolling behavior in Drosophila melanogaster larvae. This study investigates sensitization, whereby a second heat stimulus evokes this behavior with significantly shorter latency (e.g., 3.4 seconds) than the initial exposure (e.g., 8.79 seconds). While prior research has implicated central and peripheral neurons in this process, recent findings in mammalian systems suggest a role for keratinocytes.

      In this manuscript, Yoshino et al. report that epidermal cells are necessary and sufficient to mediate heat sensitization in D. melanogaster larvae. Using an ex vivo epidermal imaging system, the authors demonstrate that calcium influx in epidermal cells is crucial for sensitization. Importantly, this calcium influx was observed only when the temperature was lowered from a dangerously high to a safe temperature. The calcium channel system Orai and Stim facilitates this influx.

      Major comments:

      (1) The authors clearly demonstrate the heat-off reaction using calcium influx imaging. However, all of the imaging shows the response to the first stimulation. Since the study focuses on sensitization, which shows a quicker response to the second heat stimulus, it would be helpful if the authors showed calcium influx when the second stimulus was applied. It would also be interesting to see how many times the epidermal cells can react to heat stimulation.

      We appreciate the suggestion from the Reviewer but note that the calcium influx we show occurs in epidermal cells, which signal to neurons to potentiate future responses in our model. We have emphasized this point in our revised manuscript.

      The relevant response to visualize the sensitization is the heat-evoked calcium response in nociceptors, not epidermal cells. We have verified that C4da neurons exhibit calcium responses to the warming stimulus we use in our heat-off paradigm and our preliminary studies suggest that the heat-off stimulus potentiates future responses to noxious heat in nociceptors. We will therefore examine (1) whether epidermal stimulation triggers a sensitization of nociceptors to thermal stimuli by monitoring heat-induced calcium responses using GCaMP, and (2) whether epidermal Stim and Orai are required for this sensitization.

      The second comment addresses the response of epidermal cells to repeated rounds of stimuli. We agree that this is an interesting point. We have verified that epidermal cells indeed respond to multiple rounds of heat-off stimuli. We will incorporate results from a paradigm in which epidermal cells are presented with two successive heat-off stimuli, spaced by 5 minutes to allow epidermal cytosolic calcium to return to baseline. We will incorporate new analysis examining the relative magnitude of epidermal cell responses to the first and second stimulus.

      (2) Figure 5 only shows one condition: a 30-second interval between the first and second heat application. While the rolling latency of the Luciferase RNAi control ranges from 4 to 12 seconds (with a median of 5 seconds), Fig. 1E shows a latency ranging from 6 to 12 seconds (with a median of 10 seconds) under the same 30-second interval conditions. This difference makes interpreting the effect of Stim and Orai confusing. The authors need to clarify whether the knockdowns accelerate the first response or delay the second response.

      The Reviewer notes that we assayed effects of Stim/Orai RNAi on heat-induced nociceptive sensitization in only one paradigm. Given the kinetics of cytosolic calcium increases following Stim or Orai RNAi in epidermal cells (Fig. 4F), we agree that an additional set of behavior experiments investigating sensitization following a 60 sec recovery is warranted. For our revision we will conduct a time-course to assay requirements of epidermal Stim and Orai (using epidermal expression of Stim/Orai RNAi and Orai dominant negative transgenes) on heat-induced nociceptive sensitization. Our preliminary studies indicate that Stim and Orai RNAi significantly reduce heat-induced sensitization following 60 s of recovery (we present results from 30 s of recovery in the original submission).

      The Reviewer raises some questions about differences in behavioral latencies in Figure 1E and Figure 5B. We intentionally avoid such comparisons both because the genetic backgrounds are different and the experiments were conducted at very different times (more than 1 year apart). In both experiments the salient feature that we discuss is the presence or absence of sensitization, not the mean latency. We note that we do compare mean latency values in Figure 1B, but that was a distinct experimental paradigm (global heat of variable temperatures followed by focal noxious heat) designed specifically to define heat stimuli that generate the maximum level of sensitization. In that case, the genotype was fixed and all assays were conducted concurrently.

      Minor comments:

      (i) In Fig. 2C´´, the authors observed clear calcium influx in epidermal cells by combining the GCaMP genetic tool with an ex vivo thermal perfusion system. Although this system applies heat uniformly across the epidermal tissue, calcium influx is spatially restricted, appearing primarily in the head and tail regions of the epidermis. These results suggest that the heat-responsive epidermal cells are localized to these regions or that there are regional differences in sensitivity. The authors should explain the spatial relationship between the heat-applied epidermal cells and the occurrence of calcium influx.

      The Reviewer notes that the epidermal GCaMP signal is particularly intense in the anterior and posterior portions of the fillet preparation (Fig. 1B-1C), and we agree that it would be useful to include an explanation of this result, which is an artifact of the sample preparation.

      The specimens we use for calcium imaging are “butterfly” preparations – the body wall is filleted along the long axis with the exception of regions at the head and tail that are pinned down on sylgard plates. Hence, the regions in the head and tail contain intact tissue (including a double layer of skin when we image in widefield), not a single layer of skin (the rest of the prep). More significantly, the head and tail regions are pinned down, creating a wound that triggers lasting local calcium transients (note signal in the absence of temperature stimulus, Figure 1B’ and 1B”, 1C’). We therefore exclude this region from our analysis. We note that our behavior studies relied on stimuli presented to the abdominal segments we sample in the semi-intact calcium imaging. Similarly, we dissociated epidermal cells exclusively from these segments for imaging of acutely isolated epidermal cells.

      We do note that there is a periodicity to the signal – within each segment there are local maxima and minima of signal, and we agree with the Reviewer that this spatial segregation is an interesting point for discussion. We will add 1-2 sentences to our discussion of the result to acknowledge this point.

      (ii) Related to comment (i) above, if heat stimuli are applied topically using a heat probe under the ex vivo imaging system, how large an area reacts to the stimuli?

      The Reviewer raises an interesting question about the local response to heat stimuli. In our dissociated cell experiments we found that the overwhelming majority of isolated epidermal cells exhibit heat-off responses, and we likewise find that the majority of cells in our semi-intact preparation respond to heat-off stimuli. However, our current probe for delivering local heat stimuli is not compatible with our imaging system. We are working to incorporate an IR laser to focally deliver heat stimulus to explore whether epidermal cells signal to neighbors following stimulation, but such studies are beyond the scope of the current work.

      (iii) Providing supplementary movie(s) of the calcium live imaging would enhance the reader's understanding.

      We agree with the Reviewer that this would be a useful supplement. We will add representative movies as experimental supplements in our revised manuscript.

      (iv) The time point of the image in Fig. 2C´ ("before heat") is not the most informative for demonstrating a "heat-off" response. The authors should replace it with an image taken during the heat application to provide a more direct comparison with the post-stimulus influx shown in Fig. 2C´´.

      We appreciate the Reviewer’s suggestion and agree this would be a better choice to visually represent the change in fluorescence induced by the heat-off response. We will make this change in our revised manuscript.

      (v) The authors state that sensitization occurs "primarily in the 30-45 ºC range." However, the rolling probability and latency at 45 ºC stimulation changed in the opposite direction from those at 40 ºC. It seems likely that 45 ºC is approaching a noxious or damaging threshold that engages a different phenomenon. The authors should reconsider including 45 ºC within the optimal sensitization range or provide a justification.

      We agree with the Reviewer that a more detailed discussion of the effects of temperature at the end of the range (45°C) is warranted. Exposure to a 45°C global heat stimulus triggered temporary paralysis in some larvae, and we suspect that this accounts for the apparent reduction in roll probability following the second stimulus. We can add a plot depicting the proportion of larvae that exhibited paralysis during 45°C global heat and determine whether these heat-paralyzed larvae exhibited distinct responses from larvae that were not paralyzed and provide a more detailed account of the optimal sensitization range.

      Treatment with 45°C stimuli still triggered a significant reduction in roll latency (sensitization), but we did not examine whether the latency was significantly different from what was observed at 40°C. We can add that analysis in the revision.

      (vi) In the sentence "To this end, we developed a perfusion system, that would deliver thermal ramps from ~20-45ºC ...," the tilde ~ should be replaced with "approximately".

      Noted. We will make the change.

      (vii) Throughout the manuscript, please clarify in the figure legends whether the sample size (n) refers to the number of individual animals or the number of cells.

      Noted. We will add the relevant details to our sample sizes notations.

      (viii) The Key Resources Table does not specify the wild-type (WT) strain used for the control experiments (e.g., in Fig. 1). Please provide the full genotype of the control strain used.

      We included the experimental genotypes in each figure legend, which we find more useful than the key resource table, which contains a list of all reagents used in the study (Drosophila alleles included).

      Reviewer #2 (Significance (Required)):

      General Assessment

      This study addresses a fundamental question in sensory biology: whether epidermal cells, long regarded as passive participants in somatosensation, actively contribute to noxious heat detection and avoidance behavior. While previous work has defined the neuronal circuits and TRP channel mechanisms underlying thermal nociception in Drosophila larvae, the potential sensory role of skin cells has remained largely unexplored. The authors integrate behavioral analysis with in vitro and ex vivo calcium imaging to provide a rigorous, multi-level investigation of epidermal thermosensitivity.

      Advancement

      The work advances the field by revealing that Drosophila epidermal cells are intrinsically thermosensitive and can acutely sensitize larval nociceptive responses to noxious heat through heat-off signaling. This discovery shifts the current paradigm of thermal nociception from a neuron-centric model to one that incorporates epidermal contributions, highlighting a conserved and previously underappreciated role of skin cells in active environmental sensing.

      The reviewer's expertise: Molecular genetics, developmental biology, insect physiology and endocrinology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript describes the temperature responses of Drosophila larval epidermal cells. These cells are activated by cooling and also exhibit strong heat-off responses. Orai and Stim are required in epidermal cells for these heat-off responses. The heat-off responses sensitize the epidermal cells, leading to a greater proportion of animals displaying rolling behaviors and a reduced latency to initiate rolling following noxious heating treatment. The following comments are intended to help improve the manuscript.

      Major:

      1. In Figure 3A, the conclusion will be strengthened by testing heat-off responses from 10 °C to 40 °C.

      The Reviewer makes an important point. In our original experiment, the lack of response in the 10°C – 30°C experiment could be due to some cold-induced suppression of the off response. We have found that this is not the case: off responses following a 10°C-40°C ramp are indistinguishable from responses to a 20°C-40°C ramp. In our revised manuscript we will incorporate new results showing epidermal heat-off responses to a 10°C-40°C ramp as well as normalization to 20°C-40°C responses performed in parallel.

      2. Figure 4C shows that 2-APB suppresses the heat-off response. Since 2-APB blocks both Orai and TRP channels, it is unclear why the authors focused exclusively on the Orai pathway without testing TRP channels.

      We found that epidermal cells exhibited minimal responses to warming stimuli, as would be expected for the epidermally expressed TRP channel TRPA1. In addition, the heat-off response we identified was remarkably similar to characteristic heat-off responses of mammalian CRAC channels. Hence, we focused our attention on the Orai pathway. While we agree that contributions of TRP channels could be of interest, especially if our additional analyses (double RNAi and Orai Dominant Negative) support the model that additional channels likely contribute to the heat-off response, the characteristic temperature responses of CRAC channels made them the most plausible candidate.

      In parallel to the experiments to further characterize Stim/Orai contributions to the heat-off response, we will assay requirements of TRPA1 to heat-induced nociceptor sensitization.

      3. While 2-APB completely abolishes the heat-off response, Orai and Stim RNAi only slightly (although significantly) reduce calcium responses. The knockdown efficiency of the RNAi constructs should be validated. Furthermore, testing whether combining Orai RNAi and Stim RNAi produces a stronger reduction in calcium responses would be informative.

      We addressed the question of knockdown efficiency above, and agree that testing the effects of Orai RNAi and Stim RNAi in combination is worthwhile. We detailed our plans for these experiments above.

      4. The study uses third-instar larvae. Please specify whether early, mid, or late third instar were used.

      In our original submission we stated “Third-instar larvae (96-120 AEL) larvae were used in all experiments”. We provide additional details on the staging of larvae for all experiments in the methods section of our revised submission. To synchronize cultures, embryos were collected from experimental crosses for 24 h, aged for 96 h, and foraging mid-third instar larvae (96-120 h old) were used for all experiments.

      5. Please provide more details about the thin layer of water used. Specifically, indicate the size of the Peltier plate and the volume of water applied.

      We provide additional details on the application of global heat stimulus in the methods section of our revised manuscript. “For assays testing effects of varying the temperature of prior thermal stimuli on thermal nociception, larvae were individually transferred to a pre-warmed Peltier plate (11 x 7 cm; Torrey Pines Scientific). Peltier plates were warmed to the indicated temperatures, a thin layer of water was applied to the surface using a paint brush, and the temperature was verified using an infrared thermometer. Larvae were transferred individually to the Peltier plate, incubated for the indicated time, and recovered to 2% Agar Pads using a paint brush. Following 10 s of recovery, larvae were stimulated with a 41.5°C thermal probe, as above, and latency to the first complete roll was recorded.”

      Minor:

      1. There is an inconsistency between the text and the figure regarding the sample number in Figure 1D.

      We thank the reviewer for identifying the discrepancy. This inconsistency has been corrected in the revised submission.

      2. Please provide the raw representative data for the time course of heat-off calcium responses in Figure 1E.

      We will incorporate representative traces for the heat-off responses plotted in Figure 1E.

      3. A period is missing at the end of the sentence: "For curve fitting, sample-averaged fluorescence traces were fitted with a single exponential decay function using R to extract a representative time constant (τ) and assess response kinetics."

      We thank the reviewer for identifying the omission. The period has been added.
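
      The quoted methods sentence says the fit was done in R; the authors' code is not shown in this thread. Purely as an illustration of what extracting a time constant from a single exponential decay involves, here is a minimal Python sketch. It assumes a baseline-subtracted, strictly positive trace of the form F(t) = A·exp(−t/τ) and uses a log-linear least-squares fit, which is a simplification. A nonlinear fit (for example with R's `nls`) weights the data differently and can include an offset term. The function name and example values are hypothetical.

```python
import math


def fit_single_exponential(times, values):
    """Fit values ~ A * exp(-t / tau) and return (A, tau).

    Assumptions (not from the source): the trace is baseline-subtracted and
    strictly positive, so ln(values) is linear in t with slope -1/tau.
    The fit is ordinary least squares on the log-transformed data.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -1.0 / slope


# Hypothetical noiseless decay: amplitude 80 (% dF/F), tau = 2.5 s.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
values = [80.0 * math.exp(-t / 2.5) for t in times]

amplitude, tau = fit_single_exponential(times, values)
print(f"A = {amplitude:.3f}, tau = {tau:.3f} s")
```

      On real, noisy traces the log transform amplifies noise at small values, so this sketch should be read as a description of the model being fitted rather than as a recommended estimator.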

      4. In the sentence "Behavior Responses were analyzed post-hoc blind to genotype and were plotted according to roll probability and roll latency," the word Responses should begin with a lowercase r.

      This has been corrected in the revised submission.

      Reviewer #3 (Significance (Required)):

      This manuscript describes the heat-off responses of larval epidermal cells and investigates their underlying molecular mechanisms as well as associated behavioral consequences.

      The calcium responses and behavioral assays are clearly presented. However, the contribution of Stim and Orai to this process is not convincing.

      The study may be of interest to researchers working on Drosophila and temperature sensation, as well as to those studying Orai and Stim function.

      I am a researcher specializing in Drosophila thermosensation.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript describes the temperature responses of Drosophila larval epidermal cells. These cells are activated by cooling and also exhibit strong heat-off responses. Orai and Stim are required in epidermal cells for these heat-off responses. The heat-off responses sensitize the epidermal cells, leading to a greater proportion of animals displaying rolling behaviors and a reduced latency to initiate rolling following noxious heating treatment. The following comments are intended to help improve the manuscript.

      Major:

      1. In Figure 3A, the conclusion will be strengthened by testing heat-off responses from 10 °C to 40 °C.
      2. Figure 4C shows that 2-APB suppresses the heat-off response. Since 2-APB blocks both Orai and TRP channels, it is unclear why the authors focused exclusively on the Orai pathway without testing TRP channels.
      3. While 2-APB completely abolishes the heat-off response, Orai and Stim RNAi only slightly (although significantly) reduce calcium responses. The knockdown efficiency of the RNAi constructs should be validated. Furthermore, testing whether combining Orai RNAi and Stim RNAi produces a stronger reduction in calcium responses would be informative.
      4. The study uses third-instar larvae. Please specify whether early, mid, or late third instar were used.
      5. Please provide more details about the thin layer of water used. Specifically, indicate the size of the Peltier plate and the volume of water applied.

      Minor:

      1. There is an inconsistency between the text and the figure regarding the sample number in Figure 1D.
      2. Please provide the raw representative data for the time course of heat-off calcium responses in Figure 1E.
      3. A period is missing at the end of the sentence: "For curve fitting, sample-averaged fluorescence traces were fitted with a single exponential decay function using R to extract a representative time constant (τ) and assess response kinetics."
      4. In the sentence "Behavior Responses were analyzed post-hoc blind to genotype and were plotted according to roll probability and roll latency," the word Responses should begin with a lowercase r.
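
      The single-exponential fit quoted in minor comment 3 can be illustrated with a minimal sketch. The manuscript states that the fit was done in R; the Python version below is only an illustration and swaps in a simpler technique, ordinary least squares on the logarithm of the trace, rather than a nonlinear fit. The function name `fit_single_exponential` and the model F(t) = A·exp(−t/τ) with no offset term are assumptions, not the authors' code.

```python
import math


def fit_single_exponential(times, values):
    """Estimate (A, tau) for values ~ A * exp(-t / tau).

    Hypothetical helper: fits a straight line to log(values) versus time,
    so every value must be strictly positive and the model has no offset.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired samples")
    if any(v <= 0 for v in values):
        raise ValueError("log-linear fit requires positive values")

    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n

    # Ordinary least-squares slope and intercept of log(F) against t.
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t

    if slope >= 0:
        raise ValueError("trace does not decay")
    # log F = log A - t / tau, so tau = -1 / slope.
    return math.exp(intercept), -1.0 / slope
```

      On a noise-free synthetic trace the function recovers the amplitude and time constant used to generate it. Real fluorescence traces with a non-zero plateau would need an offset term and a nonlinear fit.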

      Significance

      This manuscript describes the heat-off responses of larval epidermal cells and investigates their underlying molecular mechanisms as well as associated behavioral consequences.

      The calcium responses and behavioral assays are clearly presented. However, the contribution of Stim and Orai to this process is not convincing.

      The study may be of interest to researchers working on Drosophila and temperature sensation, as well as to those studying Orai and Stim function.

      I am a researcher specializing in Drosophila thermosensation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Noxious heat can have a strong adverse effect on animals, resulting in sensitization when noxious thermal stimuli are applied repeatedly. Noxious heat induces a characteristic rolling behavior in Drosophila melanogaster larvae. This study investigates sensitization, whereby a second heat stimulus evokes this behavior with significantly shorter latency (e.g., 3.4 seconds) than the initial exposure (e.g., 8.79 seconds). While prior research has implicated central and peripheral neurons in this process, recent findings in mammalian systems suggest a role for keratinocytes. In this manuscript, Yoshino et al. report that epidermal cells are necessary and sufficient to mediate heat sensitization in D. melanogaster larvae. Using an ex vivo epidermal imaging system, the authors demonstrate that calcium influx in epidermal cells is crucial for sensitization. Importantly, this calcium influx was observed only when the temperature was lowered from a dangerously high to a safe temperature. The calcium channel system Orai and Stim facilitates this influx.

      Major comments:

      (1) The authors clearly demonstrate the heat-off reaction using calcium influx imaging. However, all of the imaging shows the response to the first stimulation. Since the study focuses on sensitization, which shows a quicker response to the second heat stimulus, it would be helpful if the authors showed calcium influx when the second stimulus was applied. It would also be interesting to see how many times the epidermal cells can react to heat stimulation.

      (2) Figure 5 only shows one condition: a 30-second interval between the first and second heat application. While the rolling latency of the Luciferase RNAi control ranges from 4 to 12 seconds (with a median of 5 seconds), Fig. 1E shows a latency ranging from 6 to 12 seconds (with a median of 10 seconds) under the same 30-second interval conditions. This difference makes interpreting the effect of Stim and Orai confusing. The authors need to clarify whether the knockdowns accelerate the first response or delay the second response.

      Minor comments:

      (i) In Fig. 2C´´, the authors observed clear calcium influx in epidermal cells by combining the GCaMP genetic tool with an ex vivo thermal perfusion system. Although this system applies heat uniformly across the epidermal tissue, calcium influx is spatially restricted, appearing primarily in the head and tail regions of the epidermis. These results suggest that the heat-responsive epidermal cells are localized to these regions or that there are regional differences in sensitivity. The authors should explain the spatial relationship between the heat-applied epidermal cells and the occurrence of calcium influx.

      (ii) Related to comment (i) above, if heat stimuli are applied topically using a heat probe under the ex vivo imaging system, how large an area reacts to the stimuli?

      (iii) Providing supplementary movie(s) of the calcium live imaging would enhance the reader's understanding.

      (iv) The time point of the image in Fig. 2C´ ("before heat") is not the most informative for demonstrating a "heat-off" response. The authors should replace it with an image taken during the heat application to provide a more direct comparison with the post-stimulus influx shown in Fig. 2C´´.

      (v) The authors state that sensitization occurs "primarily in the 30-45 ºC range." However, rolling probability and latency changed in the opposite direction at 45 ºC compared with 40 ºC. This raises the possibility that 45 ºC approaches a noxious or damaging threshold that engages a different phenomenon. The authors should reconsider including 45 ºC within the optimal sensitization range or provide a justification.

      (vi) In the sentence "To this end, we developed a perfusion system, that would deliver thermal ramps from ~20-45ºC ...," the tilde ~ should be replaced with "approximately".

      (vii) Throughout the manuscript, please clarify in the figure legends whether the sample size (n) refers to the number of individual animals or the number of cells.

      (viii) The Key Resources Table does not specify the wild-type (WT) strain used for the control experiments (e.g., in Fig. 1). Please provide the full genotype of the control strain used.

      Significance

      General Assessment

      This study addresses a fundamental question in sensory biology: whether epidermal cells, long regarded as passive participants in somatosensation, actively contribute to noxious heat detection and avoidance behavior. While previous work has defined the neuronal circuits and TRP channel mechanisms underlying thermal nociception in Drosophila larvae, the potential sensory role of skin cells has remained largely unexplored. The authors integrate behavioral analysis with in vitro and ex vivo calcium imaging to provide a rigorous, multi-level investigation of epidermal thermosensitivity.

      Advancement

      The work advances the field by revealing that Drosophila epidermal cells are intrinsically thermosensitive and can acutely sensitize larval nociceptive responses to noxious heat through heat-off signaling. This discovery shifts the current paradigm of thermal nociception from a neuron-centric model to one that incorporates epidermal contributions, highlighting a conserved and previously underappreciated role of skin cells in active environmental sensing.

      The reviewer's expertise: Molecular genetics, developmental biology, insect physiology and endocrinology.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary: Drosophila larvae are known to respond to noxious stimuli by rolling. The authors propose that this response arises not only by sensory response of nociceptive neurons but also by direct response of larval epidermal cells. They go on to test this idea by independently manipulating epidermal cells and nociceptive sensory neurons using GAL4 lines, GCAMPs and RNAis. The behavioural data are convincing and presented clearly with good statistical analysis. However, the involvement of epidermal cells in evoking the behaviour as well as STIM/Orai mediated Ca2+ entry requires further experiments. Use of another independent GAL4 strain for epidermal cells, alternate RNAi lines for STIM and Orai, mutants for STIM and Orai and overexpression constructs for STIM and Orai would significantly enhance the data. Thus, as of now, the key results are not yet convincing. The following additional experiments would be required to support their claims:

      1) Either use a second epidermal GAL4 strain to show key results OR provide images of the epidermal GAL4 expression double-labelled with a ppk driver using a different fluorescent protein to establish NO overlap of the epidermal GAL4 with neurons. These strains should be available free in Bloomington.

      2) Authors need to provide better data for the involvement of STIM and Orai in the Calcium responses observed. A single RNAi for each gene with marginal change in response is insufficient. The authors also do not state if the RNAis used are validated by them or anyone else. Minimally they should repeat their experiments with at least one other validated RNAi and rescue these with overexpression constructs of STIM and Orai (available in Bloomington). It is well established in literature that overexpression of STIM/Orai can rescue SOCE in Drosophila. Ideally, to be fully convincing they should test a Drosophila knockout for STIM (available in Bloomington). Heterozygotes of this are viable and should be tested. Additionally a UAS Orai dominant negative (OraiDN) strain is available in Bloomington and can be tested.

      Minor comments that can be addressed:

      1) Figure 1: Further details required on how the rolling response is measured. Figure is uninformative. A video would be really helpful.

      2) I could not find Figure 1I described in the text. This section should be explained properly.

      3) Figure 3: There appears to be a small response at 32 °C - why is this ignored in the text? It would be useful to have S3 in the main figure.

      4) Fig 4: The DF/F traces for the two RNAis should be included in this figure.

      5) The extent of knockdown in the epidermis by each RNAi should be shown by RT-PCR.

      6) The authors need to explain why only a small change in the Ca2+ response is seen with either RNAi. Are there other Ca2+ channels involved? Ideally they could test mutants/RNAi for the TRP channel family. Loss of SOCE in Drosophila neurons changes the expression of other membrane channels - is this possible here? Minimally, this possibility needs to be discussed.

      7) In the methods section please explain how the % DF/F calculations are done and how they are normalised to the ionomycin response.

      8) Authors need to look at previous work on STIM and Orai in Drosophila and reference appropriately.
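
      To make the request in minor comment 7 concrete, the sketch below shows one common convention for computing % DF/F and expressing a response relative to an ionomycin response. It is an assumption for illustration only: the manuscript does not state its procedure, and the function names, the use of the mean of the first frames as baseline F0, and the use of peak values are all hypothetical choices.

```python
def percent_dff(trace, baseline_frames):
    """Return 100 * (F - F0) / F0 for each frame.

    Assumption: F0 is the mean of the first `baseline_frames` samples.
    """
    if baseline_frames < 1 or baseline_frames > len(trace):
        raise ValueError("baseline_frames must lie within the trace")
    f0 = sum(trace[:baseline_frames]) / baseline_frames
    if f0 <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return [100.0 * (f - f0) / f0 for f in trace]


def normalise_to_ionomycin(stimulus_trace, ionomycin_trace, baseline_frames):
    """Express the peak stimulus response as a percentage of the peak
    ionomycin response (hypothetical normalisation scheme)."""
    stimulus_peak = max(percent_dff(stimulus_trace, baseline_frames))
    ionomycin_peak = max(percent_dff(ionomycin_trace, baseline_frames))
    if ionomycin_peak <= 0:
        raise ValueError("ionomycin response must exceed baseline")
    return 100.0 * stimulus_peak / ionomycin_peak
```

      Stating which baseline window and which summary value (peak or mean) were used is what allows a reader to reproduce numbers like these.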

      Referees cross-commenting

      Reviewers 2 and 3 have raised some additional queries to what I had mentioned in my review. I agree with their comments. The authors should attempt to address all comments by all three reviewers.

      Significance

      This is an interesting study that identifies epidermal cells in Drosophila with the ability to sense a drop in temperature after receiving noxious heat stimuli and invoke appropriate behaviour. Behaviour experiments are well conducted and convincing. So far only nociceptive neurons were thought to control such behavioural responses so the work is significant and important for the field. The mechanism identified needs further convincing and I have suggested experiments that would be of help. With the additional experiments suggested the work will be of interest to neuroethologists, Drosophila neuroscientists and scientists in the field of Ca signaling.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers


      __Reviewer #1 __


      Major comments


      1. The manuscript posits that the loss of function of MASh components (Ogc1 and Aralar) decreases adrenergic-stimulated lipolysis by altering the cytosolic NAD⁺/NADH ratio, with AMPK/ACC mentioned as possible mediators. However, this remains speculative. Please provide mechanistic data directly linking MASh-dependent NAD⁺/NADH changes to the regulation of lipolysis in brown adipocytes during adrenergic stimulation.

      Answer 1) The reviewer raises an important point regarding the direct assessment of cytosolic NAD⁺/NADH redox changes as a mechanistic link for altered lipolysis in brown adipocytes lacking MASh components. To address this point, we added new data to the revised manuscript showing lactate/pyruvate ratio as measured by metabolomics. This is a well-established surrogate marker to monitor changes in redox balance. Notably, under basal (non-stimulated) conditions, the lactate/pyruvate ratio did not display any significant differences between Aralar 1 KD and control cells, suggesting preservation of cytosolic NAD⁺/NADH levels in the absence of functional MASh under these conditions. This finding is consistent with reports showing the robustness of NAD⁺ regeneration via multiple shuttles and the possibility of metabolic compensation when one shuttle is compromised (PMID: 40540398; PMID: 37647199).

      The results have been added as new Supplementary Figure 1.

      Our new metabolomics data also revealed substantial reductions in the aspartate/glutamate ratio in Aralar 1 knockdown cells, serving as a metabolomic signature of impaired MASh function and reduced exchange of these amino acids between the cytosol and mitochondria. Given that the MASh is a major mechanism for exporting cytosolic reducing equivalents into the mitochondria under high metabolic demand, its loss would be expected to impact redox homeostasis, particularly under adrenergic stimulation when glycolytic flux and lipolytic activity are elevated (PMID: 40540398).

      Importantly, although our redox surrogate marker did not detect alterations, this may be explained by activation of compensatory pathways, most notably the glycerol phosphate shuttle (GPSh), which is highly expressed and active in brown adipocytes. Indirect support for this compensation comes from data shown in figure 4I showing reduced glycerol release in Aralar 1 KD cells upon norepinephrine stimulation and blocked lipolysis. This suggests a redirection of glycolytically derived G3P away from release and toward enhanced cycling within the GPSh, supporting cytosolic NAD⁺ regeneration via mitochondrial FAD-dependent G3PDH and cytosolic NAD⁺-dependent G3PDH activity. This is consistent with studies documenting that the combined action of MASh and GPSh maintains NAD redox homeostasis in brown adipocytes especially during non-thermogenic conditions (PMID: 168075; PMID: 40540398; PMID: 37647199). We have included a discussion about this possibility at page 9, third paragraph as follows:

      *“Previous studies have shown that BAT exhibits high activity of mitochondrial FAD-dependent glycerol-3-phosphate dehydrogenase (mG3PDH), which functions as an electron sink to sustain low cytosolic NADH levels essential for continuous glycolytic flux [11]. Accordingly, suppression of the MASh, either genetically or pharmacologically, is likely to induce a compensatory upregulation of the GPSh. This adaptation would enhance G3P turnover, contributing to the maintenance of cytosolic NAD redox balance. Moreover, the increased flux through the GPSh could favor fatty acid esterification and triglyceride synthesis or re-esterification, consistent with our findings in Ogc and/or Aralar 1 KD cells, where (i) triglyceride content rises (Fig. 3), (ii) overall respiratory rates remain largely unaltered (Figs. 2D–G), and (iii) glycerol release declines significantly (Fig. 4I). Notably, the decrease in glycerol release persists even when lipolysis is blocked by ATGlistatin, suggesting that the available G3P pool is rerouted from dephosphorylation and extracellular release toward oxidation to DHAP by mG3PDH to regenerate cytosolic NAD+ under MASh-deficient conditions. We propose that interference with the MASh does not directly impact lipolysis but instead alters the cellular balance between DHAP and G3P owing to enhanced activity of the GPSh. This metabolic shift would favor the esterification of G3P with free fatty acids, thereby promoting triglyceride synthesis. These results support the notion that, even during adrenergic stimulation—when long-chain unsaturated fatty acids and their CoA esters strongly inhibit mG3PDH activity [11]—the residual flux through the glycerophosphate shuttle remains critical for sustaining cytosolic NAD redox equilibrium [11,19,32].” *


      At the mechanistic level, adrenergic stimulation in brown adipocytes activates robust lipolysis and thermogenic gene programs, generating high NADH that must be efficiently reoxidized to sustain flux through glycolysis and lipolysis-linked pathways. Our findings are consistent with a model in which the loss of MASh does not prevent cytosolic NAD⁺ regeneration or lipolytic flux during acute adrenergic stimulation, due to compensatory upregulation of the GPSh, as suggested by the glycerol release changes. Thus, while MASh normally acts as a conduit for NADH export and aspartate/glutamate exchange, in its absence, the GPSh maintains cytosolic redox balance, thereby sustaining glycolytic and lipolytic capacity.

      We agree that future studies should employ direct measurements of cytosolic NAD⁺/NADH ratios (e.g., genetically-encoded redox sensors) during adrenergic stimulation and specific pharmacological inhibition of both shuttles to dissect these relationships in greater detail. We sincerely appreciate the reviewer's input, which has prompted us to clarify the indirect but robust evidence supporting a role for compensatory redox shuttle activity in preserving brown adipocyte lipolysis in the setting of MASh impairment.

      We have further added a new paragraph in the discussion section (page 10):

      *“Mechanistically, the connection between the MASh and lipolysis appears to involve regulation of the cytosolic NAD⁺/NADH redox balance. MASh activity facilitates the regeneration of NAD⁺ from NADH in the cytosol, primarily through the reduction of oxaloacetate to malate by cytosolic malate dehydrogenase (Fig. 1G-H). Despite the theoretical expectation that reductions in MASh activity would disturb redox homeostasis, our metabolomic data show that the lactate/pyruvate ratio remains unchanged under conditions of MASh impairment, indicating that the overall cytosolic NAD⁺/NADH ratio is maintained (Figure S1A-C). While direct measurements of cytosolic NAD⁺/NADH were not performed, the preserved lactate/pyruvate ratio in Aralar 1 KD cells under basal conditions strongly suggests redox stability, likely due to compensatory activity by alternative mitochondrial shuttles or metabolic adaptations that maintain NAD redox homeostasis despite MASh impairment [18,33]. *

      Previous evidence indicates that BAT exhibits high activity of mitochondrial FAD-dependent glycerol-3-phosphate dehydrogenase (G3PDH), which acts as an electron sink to sustain low cytosolic NADH levels critical for glycolysis [34]. In this sense, it is conceivable that genetic or pharmacological suppression of MASh triggers compensatory enhancement of the G3P shuttle, increasing G3P availability and facilitating the maintenance of cytosolic NAD redox balance. This adaptation could also promote fatty acid esterification and triglyceride synthesis or re-esterification, aligning with our observations that in Ogc and/or Aralar 1 KD cells: (i) triglyceride levels increase (Fig. 3); (ii) overall respiratory rates are preserved (Figs. 2D–G); and (iii) glycerol release is significantly reduced (Fig 4I).”


      __ The absence of in vivo analysis of lipid-droplet size in MASh loss-of-function models is a major concern. In vitro results could be confounded by differences in differentiation stage between groups. Please document equivalent adipogenesis across groups (e.g., Pparg/Cebpa/Plin1/Fabp4 expression).__

      Answer 2) We thank the reviewer for the thoughtful and constructive comment regarding potential confounding by differences in differentiation stage, and for highlighting the importance of documenting equivalence between experimental groups. We appreciate the opportunity to clarify and provide additional assurance on this point.

      As detailed in our manuscript, we have performed qPCR analysis of multiple well-established markers of brown adipocyte differentiation, including Ucp1, Elovl3, Prdm16, Pparg, Cebpa, Plin1, and Fabp4, in scramble, aralar1 KD, and Ogc KD cells (see Fig. S1A and accompanying text). Our results show no apparent effect of these genetic interventions on overall differentiation, as the expression levels of these key markers were consistently unaltered across groups. Furthermore, adenoviral-mediated knockdown of Ogc achieved an approximate 80% reduction in Ogc mRNA (see Fig. S1B), yet most differentiation markers remained unaffected. We did observe significant increases in Atgl, Pgc1α, and Tfam mRNA levels, which may indicate a degree of pathway reprogramming without affecting the general differentiation profile. We propose that interference with the MASh does not directly impact lipolysis but instead alters the cellular balance between DHAP and G3P owing to enhanced activity of the GPSh. This metabolic shift would favor the esterification of G3P with free fatty acids, thereby promoting triglyceride synthesis.

      Additional experimental support for equivalent differentiation can be drawn from our respirometry data presented in Figures 2E and 2G. These figures demonstrate that respiratory rates upon norepinephrine stimulation, which is a sensitive indicator of brown adipocyte thermogenic capacity, were essentially identical in scramble, aralar1 KD, and Ogc KD cells. Since norepinephrine-stimulated respiration requires both functional mitochondria and the full differentiation of brown adipocytes, these results strongly support the conclusion that silencing either MASh component does not impair the fundamental ability of cells to undergo brown adipocyte differentiation or achieve functional thermogenic competence.

      This is consistent with published findings showing that norepinephrine triggers robust respiration and thermogenic activation only in fully differentiated and functional brown adipocytes, making such measurements a widely accepted proxy for differentiation status and mitochondrial integrity. Thus, the equivalent respiratory responses observed in all groups further validate that differentiation was not compromised by the genetic interventions.

      We hope this clarifies that equivalent adipogenesis was carefully documented and that any observed phenotypes are unlikely to be attributable to differences in differentiation stages. Thank you again for your rigorous assessment and for helping to ensure the robustness of our study.

      __ Please include rescue experiments (add-back OGC1 and Aralar) to rule out siRNA/shRNA off-target effects and verify that the phenotype stems from MASh loss of function.__

      Answer 3) We thank the reviewer for this important suggestion regarding the inclusion of rescue experiments with add-back of Ogc and Aralar to definitively exclude off-target effects of the siRNA/shRNA-mediated knockdowns.

      We would like to kindly point out that although we did not perform add-back rescue experiments directly, the consistency of phenotypes observed across two independent genetic interventions—aralar 1 KD and Ogc KD—strongly argues against off-target effects being responsible for the observed metabolic and functional alterations. Specifically, both knockdowns yielded remarkably similar phenotypes in multiple assays, including respirometry analyses, mitochondrial morphology, lipid droplet homeostasis, and lipid metabolism, supporting the conclusion that these effects stem from MASh loss of function rather than nonspecific silencing.

      Furthermore, our new supplementary data (new Supplementary Figure 1A-F) reveal a significant reduction in the aspartate/glutamate ratio in Aralar 1 KD cells, a compelling functional readout for MASh impairment. This molecular evidence corroborates that our genetic interventions effectively disrupted MASh activity as intended.

      We sincerely appreciate the reviewer’s thorough evaluation and understand the importance of rescue experiments. While recognizing their value, we believe the convergent genetic, metabolic, and functional evidence presented across two different MASh components provides strong and consistent support that the phenotypes observed are due to specific loss of MASh function.


      __ Please expand on physiological significance: What is the importance of MASh regulation of BAT lipolysis in long-term adaptive thermogenesis?__

      Answer 4) This is a very interesting aspect, and we have included a new paragraph in the discussion section (page 14) to address it as follows:

      “Our results, supported by recent literature, strongly indicate that the malate–aspartate shuttle (MASh) plays a key role in facilitating fatty acid–dependent thermogenesis in brown adipocytes. Specifically, BAT-targeted overexpression of GOT1 has been shown to enhance β-oxidation and support acute cold-induced thermogenesis (PMID: 40540398). Interestingly, genetic ablation of GOT1—and thus MASh inhibition—preserves cold-induced thermogenesis by promoting a metabolic shift from fatty acid to glucose oxidation. Our findings corroborate and extend these observations by demonstrating that MASh impairment sustains overall respiratory activity in norepinephrine-stimulated brown adipocytes (Figures 2D–2G), while concurrently impairing lipolysis and resulting in an accumulation of small lipid droplets (Figures 3 and 4). Collectively, these data suggest that MASh not only modulates substrate preference towards fatty acid oxidation but also facilitates lipolysis, an essential upstream step that enables lipid oxidation and supports thermogenic heat production.”

      Minor comments

      1. __ Fig. 4 legend/title contains a typo ("lypolysis" → lipolysis).__

      Answer 1) Corrected

      __ In Fig. 2 legend line: "Adevirus-mediated" → Adenovirus-mediated; "OCAR" → OCR.__

      Answer 2) Corrected

      __ For lipolysis imaging, you already show Forskolin/Atglistatin/Etomoxir controls; add a vehicle-only time course overlay in the main figure (currently in text/legend) to aid visual comparison.__

      Answer 3) We thank the reviewer for pointing this out. To improve clarity, we have updated the labeling in Figures 3 and 4: “basal” now clearly refers to the unstimulated/untreated condition, and the previously labeled “UT” condition has been clarified as “untransduced.” These changes make the figure legends and data presentation more consistent and easier to interpret.

      __ Ensure consistent gene symbols (Atgl/Pnpla2), and protein capitalization.__

      Answer 4) Corrected.

      __Reviewer #2 __

      Major points:

      1. __ In the current manuscript, mitochondrial morphology (area, aspect ratio, and roundness) was analyzed in OGC1 KD cells using TMRE, whereas MitoTracker Deep Red (MTDR) was used in Aralar1 KD cells. Notably, TMRE is a ΔΨm-dependent probe. The signal intensity can change, or the distribution may reflect alterations in membrane potential rather than true morphological changes. Therefore, the observed differences in OGC1 KD cells based on TMRE staining may be confounded by the dye's functional dependence, potentially biasing the conclusions. It is recommended to evaluate mitochondrial morphology with consistent trackers across conditions. In addition, in the subsequent OCR analysis, mitochondrial area was used for normalization. Please clarify which staining method was employed, and provide justification for its suitability.__

      Answer 1) We thank the reviewer for this insightful comment. Indeed, TMRE is a membrane potential-sensitive dye and could therefore potentially affect measurements of mitochondria.

      We would like to point out that mitochondrial morphology was quantified based on mitochondrial area rather than fluorescence intensity. To create an accurate binary map of mitochondria, we used a low threshold, which allowed us to include even weakly stained mitochondria and thereby detect them independently of their membrane potential. In all imaged cells, TMRE signal was sufficient to reliably identify mitochondrial pixels. Moreover, these images were acquired using a confocal microscope, where the risk of pixel expansion due to higher fluorescence intensity is minimized. Lastly, given that overall mitochondrial oxygen consumption in these cells remains largely intact, we do not expect a substantial loss of membrane potential, although minor effects cannot be entirely excluded.

      We opted to use TMRE for imaging Ogc KD cells because the scramble control for these shRNA viruses carries an mKate fluorescent tag, which overlaps with the MTDR signal. Since accurate assessment of transduction efficiency relied on detecting mKate, MTDR could not be used in these experiments. Importantly, we only compare mitochondrial morphology within the same staining condition and do not draw conclusions across cells stained with different dyes.

      To ensure transparency, we have added a new section to the Discussion (page 17, 2nd paragraph) highlighting the potential influence of ΔΨm-dependent dyes on morphological measurements as follows:

      “It is also important to note that mitochondrial morphology was quantified using MTDR in Aralar 1 KD cells and TMRE in Ogc KD cells due to experimental constraints (see Methods). TMRE is a membrane potential–dependent dye, which could potentially influence morphology measurements. To minimize this risk, we used confocal microscopy, which reduces the likelihood of pixel expansion due to higher fluorescence intensity, and set thresholds to detect even weakly stained mitochondria. Nonetheless, we cannot fully exclude the possibility that the differences in morphology observed between Aralar 1 and Ogc KD are influenced by the use of different dyes; however, statistical comparisons were never performed across samples stained with different dyes.”

      Also, we have expanded the Methods section (page 22, 2nd paragraph) to include a rationale for using these dyes and describe the analysis protocol as follows:

      “TMRE was used for Ogc KD cells because the scramble control for the shRNA viruses carries an mKate fluorescent tag, which overlaps with MTDR fluorescence, preventing its use. MTDR was used for Aralar KD cells. Image Analysis was performed in FIJI (ImageJ, NIH). For the quantification of mitochondrial morphology and area, images stained with TMRE or MTDR were analyzed. Thresholds were adjusted to ensure that even weakly stained mitochondria were detected and included in the analysis. Only the mitochondrial area was evaluated, independent of fluorescence intensity.”
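
      The reasoning in this Methods text, that area measured from a low-threshold binary mask does not depend on how bright each mitochondrial pixel is, can be illustrated with a minimal sketch. The authors performed the analysis in FIJI; the Python below is not their pipeline. The function name, the nested-list image representation, and the `pixel_area_um2` parameter are hypothetical and chosen only for illustration.

```python
def mitochondrial_area(image, threshold, pixel_area_um2=1.0):
    """Return the area covered by pixels brighter than `threshold`.

    `image` is a list of rows of intensity values. Each pixel is reduced
    to inside/outside the mask, so intensity above the threshold has no
    further influence on the result.
    """
    # Binary mask: True where the pixel counts as mitochondrial signal.
    mask = [[pixel > threshold for pixel in row] for row in image]
    # Summing booleans counts the True entries.
    masked_pixels = sum(sum(row) for row in mask)
    return masked_pixels * pixel_area_um2
```

      Two images with the same mitochondrial footprint but very different staining intensity give the same area, provided even the dim signal stays above the threshold. If staining drops below the threshold, pixels are lost from the mask, which is the residual risk the authors acknowledge.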

      Minor points:

      1. __ In the introduction, the authors state that "LDH activity increases in the context of BAT activation". This point is important for the logic of the manuscript, reference [10] cited here is not sufficient to support this claim. It is recommended to provide appropriate references to support this statement.__

      Answer 1) We have substantially changed this paragraph in the revised manuscript to better explain why LDH is unlikely to be a major contributor to NAD redox balance in the context of BAT thermogenesis, as follows:

      “In mammalian cells, cytosolic NAD⁺ is regenerated through lactate dehydrogenase (LDH), the glycerol-3-phosphate shuttle (GPSh), or the malate-aspartate shuttle (MASh). In BAT, however, lactate production rises only slightly with adrenergic activation and most lactate is oxidized via the TCA cycle, suggesting that LDH primarily consumes NAD⁺ rather than regenerating it [PMID: 30456392; PMID: 37337122; PMID: 37802078; PMID: 40982723]. Consequently, mitochondrial redox shuttles become critical for sustaining cytosolic NAD⁺ supply”.

      We have also provided additional references to support this new section of the Introduction.

      __ In Fig. 1A and B-D, there are inconsistencies and duplications in the abbreviation labels. Please check and revise accordingly. __

      Answer 2) We thank the reviewer for this comment. We would like to clarify that Figure 1A is a schematic overview of the system, while Figures 1B–D show protein expression in specific contexts: whole BAT (B), whole liver (C), and BAT mitochondria (D). In Figures 1B and 1C, all components are shown because both cytosolic (MDH1 and GOT1) and mitochondrial proteins (MDH2, GOT2, Aralar 1 and 2 and OGC) are present. In contrast, Figure 1D shows only mitochondrial components (OGC, Aralar1, MDH2, and GOT2). Although Aralar2 is a mitochondrial protein, it was not detected in this study (Forner et al., 2009). Similarly, cytosolic components such as MDH1 and GOT1 are not shown in Figure 1D because they are absent in the mitochondrial fraction. We have revised the figure legend to make these distinctions clearer.

      __ In Fig. S1, the number of n indicated does not match the number of data points shown. Please clarify whether these represent technical replicates or biological replicates, and provide a detailed description of the statistical methods used throughout the manuscript.__

      Answer 3) We thank the reviewer for catching this and allowing us to correct our mistakes. In the revised version, we have corrected the figure legend of Supplementary Figure 1 so that the number of n matches the data points shown.

      __ Please provide details on the normalization strategy used in the BODIPY-C12/BODIPY-493 staining analysis, such as whether fluorescence intensity was quantified as mean or integrated values, and whether the analysis was normalized to lipid droplet area, cell number, or baseline. Since lipolytic stimulation can reduce droplet size and increase droplet number, these factors may bias the results. __

      Answer 4) We thank the reviewer for this important comment and apologize for the lack of detail regarding this analysis. The analysis of BODIPY-C12 and BODIPY-493 was performed by quantifying the mean fluorescence intensity of BODIPY-C12 detected within a mask generated from the BODIPY-493 signal. This approach allowed us to define all lipid droplets and measure the release of previously esterified C12. To account for variability across samples, the data were normalized to each sample’s individual baseline at time point 0 and expressed as fold change relative to this baseline. In the revised manuscript, we have included this description in the Methods section (page 18, last paragraph) for clarity and reproducibility, as follows:

      “Lipid Droplet area was defined based on Bodipy 493/503 signal, which was used to generate a mask identifying all lipid droplets. Within this mask, the mean fluorescence intensity of BODIPY C12 was quantified over time to monitor the release of previously esterified C12. To account for variability between samples, data were normalized to each sample’s individual baseline at time point 0 and expressed as fold change relative to this baseline.”
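      The normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names `masked_mean` and `fold_change_series` are hypothetical, images are represented as nested lists, and the lipid droplet masks (one per time point, derived from the BODIPY 493/503 channel) are assumed to be already available as boolean arrays, since the source does not specify how the masks were thresholded.

```python
def masked_mean(intensity, mask):
    """Mean of `intensity` over the pixels where `mask` is True."""
    values = [value
              for intensity_row, mask_row in zip(intensity, mask)
              for value, inside in zip(intensity_row, mask_row)
              if inside]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


def fold_change_series(c12_frames, droplet_masks):
    """Mean C12 intensity inside the droplet mask at each time point,
    expressed as fold change relative to the first frame (time point 0)."""
    means = [masked_mean(frame, mask)
             for frame, mask in zip(c12_frames, droplet_masks)]
    baseline = means[0]
    if baseline == 0:
        raise ValueError("baseline intensity is zero; fold change undefined")
    return [mean / baseline for mean in means]


# Toy data: two time points, 2 x 2 pixels, same droplet mask at both times.
mask = [[True, True],
        [False, True]]
frames = [
    [[10, 20], [0, 30]],  # time point 0: masked mean = 20
    [[5, 10], [0, 15]],   # later time point: masked mean = 10
]
print(fold_change_series(frames, [mask, mask]))  # [1.0, 0.5]
```

      Because the readout is a mean within the mask rather than an integrated intensity, it does not scale directly with total droplet area, and dividing by each sample's own time-zero value removes differences in initial labeling between samples.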

      __ The manuscript notes that the unexpected result in Fig. 3K-M in parallel with increased Atgl mRNA expression might be because it does not reflect protein levels or enzymatic activity. To strengthen this point, it is recommended to include data on ATGL and phosphorylation ATGL. __

      Answer 5) We thank the reviewer for this constructive comment. We have clarified these aspects in the revised Results and Discussion sections to reflect this interpretation more accurately as follows:

      “Notably, Atgl mRNA measurement in our study was primarily used as a marker of brown adipocyte differentiation, rather than as a direct indicator of ATGL protein abundance or enzymatic activity. We detected increased Atgl expression only in Ogc KD cells (Fig. S1H), but not in Aralar 1 KD cells (Fig. S1G). This likely does not reflect a major difference in differentiation status, as other brown adipocyte markers and norepinephrine-stimulated respiration were comparable between scramble and knockdown cells (Fig. 2D-G and 2N-O and S1G-H). Although lipolysis was not evaluated in Ogc KD cells, in Aralar 1 KD cells basal lipolysis remained unchanged (Fig. 4D-E and 4G-I), whereas norepinephrine-stimulated lipolysis was delayed or partially inhibited. Notably, the enhanced fatty acid esterification observed in Ogc KD cells despite elevated Atgl expression is not contradictory, since in brown adipocytes lipolysis and re-esterification occur concurrently to sustain high lipid turnover [34].”

      __ Red-on-black is not a great color code for IMFs, how about black-and-white? __

      Answer 6) We have changed the text color to white in Figures 2H and 2K, as suggested.

      __Reviewer #3 __

      Major points;

      1. __ Although in the manuscript Veliova and coworkers demonstrated that MAS is functional in brown adipocytes showing kinetic parameters equivalent to that previously described in other tissues, surprisingly, when its components are downregulated, no effect, or very little, on mitochondrial respiration is found (figure 2). This is an intriguing result since MAS disruption has been widely reported to impair respiration in different cell types and tissues. However, since no direct evidence of MAS dysfunction is provided, it is possible that MAS may still remain partially or fully functional under the conditions used by the authors, and therefore this point needs to be clarified to validate these results.__

      Answer 1) We thank the reviewer for the insightful comment and the opportunity to clarify these important points regarding MASh dysfunction validation in our study. We acknowledge the reviewer’s observation that mitochondrial respiration was largely unaffected by MASh component knockdown, which is indeed intriguing. Importantly, as already indicated in our responses to Reviewer 1, we have provided new data showing direct molecular evidence of MASh impairment through substantial reductions in the aspartate/glutamate ratio in Aralar 1 KD cells (new Supplementary Figure S1F). This ratio is a well-established functional readout reflecting MASh activity and amino acid exchange between cytosol and mitochondria, as demonstrated in original experimental studies of MASh function in multiple tissues including brown adipocytes (PMID: 4436323). The reduction in the aspartate/glutamate ratio directly confirms loss of MASh functionality even though respiratory rates remained unchanged, likely due to metabolic compensation by robust glycerol phosphate shuttle (GPSh) activity, as further supported by our data showing reduced glycerol release upon norepinephrine stimulation in Aralar 1 KD cells (Figure 4I). This metabolic rerouting maintains cytosolic NAD⁺ regeneration and partially preserves respiration and energy metabolism under these experimental conditions (PMID: 168075; PMID: 40540398; PMID: 37647199). Thus, the combination of metabolomic, respirometry, and functional lipid data strongly indicates that MASh activity was disrupted specifically and effectively by our genetic interventions. This molecular evidence was already signposted in our original manuscript and responses, underscoring that MASh loss of function, and not residual or compensatory MASh activity, is responsible for the phenotypes reported. We greatly appreciate the reviewer’s insightful attention to this critical mechanistic issue and hope this provides clear reassurance that MASh impairment was indeed achieved and functionally validated within our study framework.

      Furthermore, strategies used to downregulate MAS components produce only a partial reduction in mRNA levels, about 70 %, but its outcome on protein levels has not been determined, and the remaining protein level could be sufficient to maintain shuttle activity. Therefore, the effect of silencing at protein level should be analyzed, because as authors also point out on page 16; "mRNA levels may not reflect actual protein levels or activity".

      Answer 2) We thank the reviewer for this important point. Our knockdowns resulted in a ~70–80% reduction in mRNA levels. While not complete, this represents a substantial decrease and is sufficient to produce strong functional effects. At the time the experiments were performed, we did not have access to suitable antibodies, and the available antibodies did not provide reliable signals in our samples, which is why we used qPCR to estimate knockdown efficiency. Importantly, we observed clear phenotypic changes in both knockdowns (Aralar and OGC), and both showed very similar phenotypes. This suggests that the level of knockdown was sufficient to significantly impair MAS activity. In the revised version, we added new data that further validate the functional impact of Aralar KD (given that this protein has an alternative isoform, as pointed out by the reviewer). We performed metabolomics experiments measuring aspartate and glutamate levels. Our new data show that the aspartate-to-glutamate ratio is significantly reduced in Aralar KD cells. This ratio serves as a proxy for glutamate catabolism, and the observed decrease suggests reduced glutamate catabolism, likely due to impaired MAS activity. Therefore, the reduced whole-cell aspartate/glutamate ratio serves as a metabolic signature of MAS impairment, consistent with Aralar KD. These data indicate that Aralar is sufficiently downregulated to produce a functional effect, supporting our conclusion that MAS activity is impaired. The results have been added as new supplementary Figure 1 as follows:

      __ In the case of aspartate/glutamate carriers (AGCs) the role of citrin/slc25a13, the second AGC paralog, should also be analyzed. This AGC isoform is discarded based on proteomic data from brown adipose tissue, but, as it is shown in figure 1B, its levels are similar to those of Aralar/slc25a12, the only AGC silenced. Besides, primary brown adipocytes differentiated for 7 days are used here, and it is possible that factors such as culture conditions or differentiation itself could alter AGC levels. Therefore, it is necessary to determine the protein levels of citrin/AGC2, and, if necessary, downregulate it together with the Aralar/AGC1 isoform. citrin/AGC2 activity may be responsible for the observed difference between the OGC and Aralar/AGC1 KD adipocytes.__

      Answer 3) We thank the reviewer for this important point. We chose Aralar1 because it is the isoform predominantly expressed in brown adipose tissue (PMID: 23436904). We acknowledge, however, that compensatory increases in Citrin/AGC2 upon Aralar1 knockdown are possible. To address this, we have included new metabolomics data in the revised manuscript (added as Supplementary Figure 1), which provides additional support that downregulation of Aralar1, even if not complete, is sufficient to cause a metabolic change reflected by a reduced aspartate/glutamate ratio in these cells. This functional change supports that the knockdown of Aralar1 alone is sufficient to study its role in brown adipocytes, although minor compensation by Citrin/AGC2 cannot be entirely excluded.

      To address this explicitly, we have added a paragraph to the discussion (page 13, 2nd paragraph) highlighting the potential for partial compensation by Citrin/AGC2 and explaining why the observed metabolic effects are still attributable to Aralar 1 knockdown, as follows:

      “Phenotypes observed in Aralar 1 KD cells closely resemble those in Ogc KD cells, particularly in terms of lipid metabolism alterations and energy expenditure. The main difference lies in mitochondrial morphology, which is altered in Ogc KD cells but remains unchanged in Aralar 1-silenced cells (Fig. 2J,M). Unlike Ogc, which lacks an alternative isoform, Aralar 1 has a paralog Aralar 2 (Citrin, or SLC25A13) that may partially compensate for its loss. This potential compensation might explain the preservation of mitochondrial morphology in Aralar 1 KD cells. Nonetheless, our metabolomics data demonstrate that downregulation of Aralar 1 alone significantly reduces the aspartate/glutamate ratio (Fig. S1D-F). Since this ratio reflects glutamate catabolism, its decrease indicates impaired malate-aspartate shuttle activity and reduced glutamate catabolism. Therefore, although compensation by Aralar 2 cannot be entirely excluded, Aralar 1 KD alone suffices to cause substantial impairment of malate-aspartate shuttle function”.

      __ OGC and Aralar/AGC1 silencing is associated with the accumulation of smaller lipid droplets and impaired norepinephrine-induced lipolysis, but no mechanistic evidence is provided. The authors discuss a role for AMPK signaling associated with the redox imbalance generated by MAS dysfunction, but neither of them is proven.__

      Answer 4) We thank the reviewer for this insightful question, which was also raised by Reviewer 1 (see Reviewer 1, Question 1 above). Here, we aim to clarify the mechanistic basis by which MASh may regulate lipolysis in BAT in a complementary and refined manner.

      Our new data directly address this issue by examining cytosolic redox status through the lactate/pyruvate ratio, a well-established indicator of NAD⁺/NADH balance. Under basal conditions, Aralar 1 KD cells showed no change in this ratio compared to controls, indicating preserved cytosolic NAD⁺ regeneration despite reduced MASh activity. This observation is consistent with previous studies demonstrating the resilience of cellular redox homeostasis through overlapping NAD⁺-regenerating systems (PMID: 40540398; PMID: 37647199). The new results are shown in Supplementary Figure 1.

      At the same time, we detected a marked decrease in the aspartate/glutamate ratio in Aralar 1 KD cells, confirming impaired MASh function and reduced amino acid exchange between cytosol and mitochondria. The lack of redox imbalance likely reflects compensatory mechanisms, most notably the GPSh, which is highly active in brown adipocytes. Supporting this view, Aralar 1 KD cells displayed significantly reduced glycerol release upon norepinephrine stimulation (Fig. 4I), suggesting enhanced metabolic cycling of G3P through mitochondrial and cytosolic G3PDH, thereby sustaining NAD⁺ regeneration and redox equilibrium.

      We therefore propose that, although MASh normally facilitates NADH export and aspartate/glutamate exchange, its loss activates GPSh-mediated compensation that preserves cytosolic NAD⁺/NADH balance and maintains lipolytic flux during adrenergic stimulation. These findings refine our mechanistic understanding of how redox shuttle interplay supports glycolytic and lipolytic processes in BAT. Future studies employing NAD⁺/NADH sensors and simultaneous blockade of both shuttles will be essential to dissect these compensatory mechanisms in greater detail.

      Minor points;

      1. __ Is pyruvate present in respiration medium? If so, no effect on respiration is expected as pyruvate reverses the respiratory defects caused by MAS inactivation. __

      Answer 1) We thank the reviewer for this important insight. As indicated in the Methods section (page 17, last paragraph), all respirometry experiments were carried out in the absence of pyruvate in the medium. Therefore, the preserved overall respiratory rates in Aralar 1 and Ogc KD cells cannot be explained by compensatory oxidation of pyruvate supplied in the medium.

      __ In figure 4, only data from Aralar KD cells in relation to norepinephrine-stimulated lipolysis are shown. What happens when OGC is silenced? __

      Answer 2) This is a very interesting and relevant question. We did not perform the norepinephrine-stimulated lipolysis experiments in Ogc-silenced cells, since in most of the other experiments presented in the manuscript Ogc and Aralar 1 silencing converged to very similar, if not identical, phenotypes. Based on these consistent overlaps, we anticipate that Ogc KD would likely lead to comparable effects on lipolysis as observed in Aralar 1 KD cells. Nonetheless, we fully agree that direct assessment of lipolysis upon Ogc KD would strengthen this conclusion, and we consider this an important aspect for future studies.

      __ Nomenclature used for mitochondrial carriers is confusing. Please do not use OGC1 as there is only one isoform. Furthermore, different names for OGC are used in the manuscript; oxoglutarate carrier, malate-ketoglutarate carrier or OGC1/SLC25A11. In the case of citrin/AGC2, Aralar2 is used and is an uncommon designation.__

      Answer 3) We have corrected all OGC naming in the revised manuscript. We have also replaced “Aralar 2” with “citrin”, since the latter is more commonly used in the literature.

      __ Some panels of figures 3 and 4 should be improved. Panels 3J, 3L and 4G are difficult to see. In panel 3J please clarify UT line from untreated/NE, are they not transduced? No equivalents conditions are assayed in Aralar KD and OGC KO cells.__

      Answer 4) We thank the reviewer for giving us the opportunity to improve this figure and apologize for the confusing labeling. In the revised version, we have clarified the labels in panels 3J, 3L, and 4G to improve visibility, and we have added descriptions of all abbreviations to the figure legends, accordingly.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Veliova and coworkers explore the contribution of the malate-aspartate NADH shuttle (MAS) to energy metabolism in brown adipose tissue. This work, done by a group with expertise in mitochondrial metabolism, continues an interesting previous study (Veliova, 2020) in which it was shown that inhibition of the mitochondrial pyruvate carrier caused an increase in energy expenditure mediated by the activation of MAS in BAT. Here, the authors have explored the consequences of the lack of MAS activity on BAT metabolism by silencing the metabolite transporters that are part of MAS in cultured primary brown adipocytes. Using this loss-of-function approach, the role of MAS in the regulation of lipid homeostasis in BAT is analyzed. The results could be interesting, but in my opinion, they are not sufficiently proven. Much more evidence should be provided to confirm MAS deficiency and the mechanisms involved in the alteration of lipid homeostasis.

      Major points

      1. Although in the manuscript Veliova and coworkers demonstrated that MAS is functional in brown adipocytes showing kinetic parameters equivalent to that previously described in other tissues, surprisingly, when its components are downregulated, no effect, or very little, on mitochondrial respiration is found (figure 2). This is an intriguing result since MAS disruption has been widely reported to impair respiration in different cell types and tissues. However, since no direct evidence of MAS dysfunction is provided, it is possible that MAS may still remain partially or fully functional under the conditions used by the authors, and therefore this point needs to be clarified to validate these results. Furthermore, strategies used to downregulate MAS components produce only a partial reduction in mRNA levels, about 70 %, but its outcome on protein levels has not been determined, and the remaining protein level could be sufficient to maintain shuttle activity. Therefore, the effect of silencing at protein level should be analyzed, because as authors also point out on page 16; "mRNA levels may not reflect actual protein levels or activity".
      2. In the case of aspartate/glutamate carriers (AGCs) the role of citrin/slc25a13, the second AGC paralog, should also be analyzed. This AGC isoform is discarded based on proteomic data from brown adipose tissue, but, as it is shown in figure 1B, its levels are similar to those of Aralar/slc25a12, the only AGC silenced. Besides, primary brown adipocytes differentiated for 7 days are used here, and it is possible that factors such as culture conditions or differentiation itself could alter AGC levels. Therefore, it is necessary to determine the protein levels of citrin/AGC2, and, if necessary, downregulate it together with the Aralar/AGC1 isoform. citrin/AGC2 activity may be responsible for the observed difference between the OGC and Aralar/AGC1 KD adipocytes.
      3. OGC and Aralar/AGC1 silencing is associated with the accumulation of smaller lipid droplets and impaired norepinephrine-induced lipolysis, but no mechanistic evidence is provided. The authors discuss a role for AMPK signaling associated with the redox imbalance generated by MAS dysfunction, but neither of them is proven.

      Minor points

      1. Is pyruvate present in respiration medium? If so, no effect on respiration is expected as pyruvate reverses the respiratory defects caused by MAS inactivation.
      2. In figure 4, only data from Aralar KD cells in relation to norepinephrine-stimulated lipolysis are shown. What happens when OGC is silenced?
      3. Nomenclature used for mitochondrial carriers is confusing. Please do not use OGC1 as there is only one isoform. Furthermore, different names for OGC are used in the manuscript; oxoglutarate carrier, malate-ketoglutarate carrier or OGC1/SLC25A11. In the case of citrin/AGC2, Aralar2 is used and is an uncommon designation.
      4. Some panels of figures 3 and 4 should be improved. Panels 3J, 3L and 4G are difficult to see. In panel 3J please clarify UT line from untreated/NE, are they not transduced? No equivalents conditions are assayed in Aralar KD and OGC KO cells.

      Significance

      General assessment: The robust part of this study is its analysis of some aspects related to lipid metabolism in cultured primary cells derived from brown adipose tissue. The participating teams are well-versed in this topic and the approaches used are correct. However, no data in animal models supporting these results are provided, and this detracts from the interest of the study.

      Advance: This manuscript is the "logical" continuation of a previous study, Veliova et al., (2020) EMBO Rep, which is more relevant in my opinion. Also, a role for MAS in BAT thermogenesis has recently been proposed using animal models, either overexpressing or deficient in GOT1, a cytosolic protein component of MAS (Park et al., Cell Rep. 2025). The novelty in this manuscript is the analysis of cells deficient in the metabolite transporters that regulate the direction of NADH shuttling. However, since no evidence is provided of its effect on the NAD+/NADH ratio, the conclusions related to the role of MAS, or the mitochondrial carriers silenced, in the regulation of lipolysis in BAT and its involvement in thermogenesis are not convincing.

      Audience: These results could be of interest to the audience interested in basic research, but could also be useful in the translational/clinical area because they address metabolic aspects in adipose tissue.

      My expertise is focused on mitochondrial metabolism, specifically on the function of a subtype of mitochondrial carriers regulated by cytosolic calcium and how they participate in the control of different mitochondrial functions, such as respiration, calcium buffering, and cell proliferation. Some of these transporters are components of MAS, such as Aralar/AGC1 or citrin/AGC2.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript presents novel findings on the role of the malate-aspartate shuttle (MASh) in brown adipose tissue (BAT). Building on the recent advances in elucidating the contribution of MASh to BAT metabolism, the present study provides new evidence by offering direct biochemical validation using a reconstituted BAT mitochondrial system and by introducing genetic data on the mitochondrial carriers OGC1 and Aralar1, thereby adding significant new insight. However, the following points require further clarification.

      Major points:

      1. In the current manuscript, mitochondrial morphology (area, aspect ratio, and roundness) was analyzed in OGC1 KD cells using TMRE, whereas MitoTracker Deep Red (MTDR) was used in Aralar1 KD cells. Notably, TMRE is a ΔΨm-dependent probe. The signal intensity can change, or the distribution may reflect alterations in membrane potential rather than true morphological changes. Therefore, the observed differences in OGC1 KD cells based on TMRE staining may be confounded by the dye's functional dependence, potentially biasing the conclusions. It is recommended to evaluate mitochondrial morphology with consistent trackers across conditions. In addition, in the subsequent OCR analysis, mitochondrial area was used for normalization. Please clarify which staining method was employed, and provide justification for its suitability.

      Minor points:

      1. In the introduction, the authors state that "LDH activity increases in the context of BAT activation". This point is important for the logic of the manuscript, reference [10] cited here is not sufficient to support this claim. It is recommended to provide appropriate references to support this statement.
      2. In Fig. 1A and B-D, there are inconsistencies and duplications in the abbreviation labels. Please check and revise accordingly.
      3. In Fig. S1, the number of n indicated does not match the number of data points shown. Please clarify whether these represent technical replicates or biological replicates, and provide a detailed description of the statistical methods used throughout the manuscript.
      4. Please provide details on the normalization strategy used in the BODIPY-C12/BODIPY-493 staining analysis, such as whether fluorescence intensity was quantified as mean or integrated values, and whether the analysis was normalized to lipid droplet area, cell number, or baseline. Since lipolytic stimulation can reduce droplet size and increase droplet number, these factors may bias the results.
      5. The manuscript notes that the unexpected result in Fig. 3K-M in parallel with increased Atgl mRNA expression might be because it does not reflect protein levels or enzymatic activity. To strengthen this point, it is recommended to include data on ATGL and phosphorylation ATGL.
      6. Red-on-black is not a great color code for IMFs, how about black-and-white?

      Referees cross-commenting

      In my opinion, all three reviewers have provided constructive criticism of the work.

      Significance

      The work dives deeper into mitochondrial function and metabolism of brown adipocytes and, thus, advances our understanding of thermogenesis in an incremental fashion. The work will be relevant to brown adipose tissue researchers and mitochondrial biologists.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The paper makes a clear, well-supported case that the malate-aspartate shuttle (MAS) is active in brown adipocytes and supports adrenergically stimulated lipolysis. The combination of a functional MAS assay, targeted carrier knockdowns, and multi-modal lipolysis measurements is a strong package. The reconstituted mitochondrial assay paired with live-cell lipolysis imaging is technically elegant and broadly reusable. The main gap is the limited in-vitro scope relative to in-vivo cold adaptation.

      Major comments

      1. The manuscript posits that the loss of function of MASh components (Ogc1 and Aralar) decreases adrenergic-stimulated lipolysis by altering the cytosolic NAD⁺/NADH ratio, with AMPK/ACC mentioned as possible mediators. However, this remains speculative. Please provide mechanistic data directly linking MASh-dependent NAD⁺/NADH changes to the regulation of lipolysis in brown adipocytes during adrenergic stimulation.
      2. The absence of in vivo analysis of lipid-droplet size in MASh loss-of-function models is a major concern. In vitro results could be confounded by differences in differentiation stage between groups. Please document equivalent adipogenesis across groups (e.g., Pparg/Cebpa/Plin1/Fabp4 expression).
      3. Please include rescue experiments (add-back OGC1 and Aralar) to rule out siRNA/shRNA off-target effects and verify that the phenotype stems from MASh loss of function.
      4. Please expand on physiological significance: What is the importance of MASh regulation of BAT lipolysis in long-term adaptive thermogenesis?

      Minor comments

      1. Fig. 4 legend/title contains a typo ("lypolysis" → lipolysis).
      2. In Fig. 2 legend line: "Adevirus-mediated" → Adenovirus-mediated; "OCAR" → OCR.
      3. For lipolysis imaging, you already show Forskolin/Atglistatin/Etomoxir controls; add a vehicle-only time course overlay in the main figure (currently in text/legend) to aid visual comparison.
      4. Ensure consistent gene symbols (Atgl/Pnpla2), and protein capitalization.

      Referees cross-commenting

      In my view, the feedback offered by all three reviewers has been highly constructive, as each of them has contributed thoughtful and meaningful criticism that can help improve the quality, clarity, and overall impact of the work.

      Significance

      Advance - how it fits the literature and what kind of advance.

      Relative to prior work linking MASh (often via GOT1) to fuel preference and redox during thermogenesis, this study fills a mechanistic gap by showing that carrier-level MASh disruption (Aralar1/OGC1) specifically impairs adrenergic lipid mobilization upstream of β-oxidation, while respiration per cell can be buffered by compensatory mitochondrial biogenesis (lower OCR per mitochondrion). Conceptual/fundamental advance: it sharpens the redox - lipolysis axis in BAT and clarifies why changes in fuel availability (lipolysis) may limit thermogenesis even when bulk OCR looks preserved.

      Audience - who will be interested/influenced.

      Specialized but cross-cutting: adipose biology & thermogenesis, mitochondrial/redox metabolism, lipid-droplet and lipolysis communities, and metabolic-disease researchers exploring strategies to modulate BAT fuel handling.

      Reviewer expertise

      Adipose tissue and systemic energy metabolism; mitochondrial bioenergetics; thermogenic mechanisms in BAT/beige fat; transcriptional and metabolic control of lipid mobilization. Not a specialist in membrane-carrier biophysics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Response to Reviewers’ Comments

      We thank all three reviewers for their thoughtful and detailed comments, which will help us to improve the quality and clarity of our manuscript.


      __Reviewer #1 (Evidence, reproducibility and clarity (Required)): __ Summary: In this work, Tripathi et al address the open question of how the Fat/Ds pathway affects organ shape, using the Drosophila wing as a model. The Fat/Ds pathway is a conserved but complex pathway, interacting with Hippo signalling to affect growth and providing planar cell polarity that can influence cellular dynamics during morphogenesis. Here, authors use genetic perturbations combined with quantification of larval, pupal, and adult wing shape and laser ablation to conclude that the Ft/Ds pathway affects wing shape only during larval stages in a way that is at least partially independent of its interaction with Hippo and rather due to an effect on tissue tension and myosin II distribution. Overall the work is clearly written and well presented. I only have a couple major comments on the limitations of the work.

      Major comments: 1. Authors conclude from data in Figures 1 and 2 that the Fat/Ds pathway only affects wing shape during larval stages. When looking at the pupal wing shape analysis in Figure 2L, however, it looks like there is a difference in wt over time (6h-18h, consistent with literature), but that difference in time goes away in RNAi-ds, indicating that actually there is a role for Ds in changing shape during pupal stages, although the phenotype is clearly less dramatic than that of larval stages. No statistical test was done over time (within the genotype), however, so it's hard to say. I recommend the authors test over time - whether 6h and 18h are different in wild type and in ds mutant. I think this is especially important because there is proximal overgrowth in the Fat/Ds mutants, much of which is contained in the folds during larval stages. That first fold, however, becomes the proximal part of the pupal wing after eversion and contracts during pupal stages to elongate the blade (Aigouy 2010, Etournay 2015). Also, according to Trinidad Curr Biol 2025, there is a role for Fat/Ds pathway in pupal stages. All of that to say that it seems likely that there would be a phenotype in pupal stages. It's true it doesn't show up in the adult wing in the experiments in Fig 1, but looking at the pupal wing itself is more direct - perhaps the very proximal effect is less prominent later, as there is potential for further development after 18hr before adulthood and the most proximal parts are likely anyway excluded in the analysis.

      Response: Our main purpose in examining pupal wing shape was to emphasize that wings lacking ds are visibly abnormal even at early pupal stages. The reviewer makes the point that the change in shape from 6h to 18h APF is greater in control wings than in RNAi-ds wings. We have added quantitation of this to the revised manuscript as suggested. This difference could be interpreted as indicating that Ds-Fat signaling actively contributes to wing shape during pupal morphogenesis. However, given the genetic evidence that Ds-Fat signaling influences wing shape only during larval growth, we favor the interpretation that it reflects consequences of Ds-Fat action during larval stages – eg, overgrowth of the wing, particularly the proximal wing and hinge as occurs in ds and fat mutants, could result in relatively less elongation during the pupal hinge contraction phase. This wouldn’t change our key conclusions, but it is something that we discuss in a revised manuscript.

      I think there needs to be a mention and some discussion of the fact that the wing is not really flat. While it starts out very flat at 72h, by 96h and beyond, there is considerable curvature in the pouch that may affect measurements of different axes and cell shape. It is not actually specified in the methods, so I assume the measurements were taken using a 2D projection. It is not clear whether the curvature of the pouch was taken into account, either for cell shape measurements presented in Fig 4 or for the wing pouch dimensional analysis shown in Fig 3, 6, and supplements. Do perturbations in Ft/Ds affect this curvature? Are they more or less curved in one or both axes? Such a change could affect the results and conclusions. The extent to which the fat/ds mutants fold properly is another important consideration that is not mentioned. For example, maybe the folds are deeper and contain more material in the ds/fat mutants, and that's why the pouch is a different shape? At the very least, this point about the 3D nature of the wing disc must be raised in discussion of the limitations of the study. For the cell shape analysis, you can do a correction based on the local curvature (calculated from the height map from the projection). For the measurement of A/P, D/V axes of the wing pouch, best would be to measure the geodesic distance in 3D, but this is not reasonable to suggest at this point. One can still try to estimate the pouch height/curvature, however, both in wild type and in fat/ds mutants.
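      To make the curvature concern concrete, the following minimal Python sketch compares the length of an axis measured in a 2D projection with the length measured along the tissue surface, using a height profile sampled along that axis. It is only an illustration: the function name, the sample profiles, and the units are hypothetical, and the path follows a straight line in the projection, so it approximates rather than computes a true geodesic.

```python
import math


def projected_and_surface_length(xs, heights):
    """Return (projected length, surface length) along one axis.

    xs      -- positions along the axis in the 2D projection (e.g. micrometres)
    heights -- height-map value at each position, in the same units as xs
    """
    if len(xs) != len(heights) or len(xs) < 2:
        raise ValueError("need at least two matching samples")
    projected = 0.0
    surface = 0.0
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]
        dz = heights[i] - heights[i - 1]
        projected += abs(dx)            # what a 2D projection reports
        surface += math.hypot(dx, dz)   # length along the curved surface
    return projected, surface


# Hypothetical profiles: a flat pouch and a dome-shaped pouch with the
# same projected width (values are made up for illustration only).
xs = [float(i) for i in range(0, 101, 10)]
flat = [0.0] * len(xs)
dome = [20.0 * math.sin(math.pi * x / 100.0) for x in xs]

flat_proj, flat_surf = projected_and_surface_length(xs, flat)
dome_proj, dome_surf = projected_and_surface_length(xs, dome)
```

      In this toy case the two profiles have identical projected lengths, but the dome has a longer surface length, which is the kind of discrepancy that could differ between axes or genotypes if curvature differs.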

      Response: The wing pouch measurements were done on 2D projections of wing discs that were already slightly flattened by coverslips, so there is not much curvature outside of the folds. We will revise the methods to make sure this is clear. While we recognize that the absolute values measured can be affected by this, our conclusions are based on the qualitative differences in proportions between genotypes and time points, and we wouldn’t expect these to differ significantly even if 3D distances were measured. Obtaining accurate 3D measures is technically more challenging - it requires having spacers matching the thickness of the wing disc, which varies at different time points and genotypes, and then measuring distances across curved surfaces. To address this, we propose to do a limited set of 3D measures on wild-type and ds mutant wing discs at early and late stages. We expect these to confirm that the conclusions of our analysis are unaffected, while at the same time providing an indication of how much curvature affects the values obtained. We will also make sure the issue of wing disc curvature and folds is discussed in the text.

      Minor comments: 1. The analysis of the laser ablation is not really standard - usually one looks at recoil velocity or a more complicated analysis of the equilibrium shape using a model (e.g Shivakumar and Lenne 2016, Piscitello-Gomez 2023, Dye et al 2021). One may be able to extract more information from these experiments - nevertheless, I doubt the conclusions would change, given that there seems to be a pretty clear difference between wt and ds (OPTIONAL).

      Response: We will add measurements of recoil velocities to complement our current analysis of circular cuts.

      Figure 7G: I think you also need a statistical test between RNAi-ds and UAS-rokCA+RNAi-ds.

      Response: We include this statistical test in the revised manuscript (it shows that they are significantly different).

      In the discussion, there is a statement: "However, as mutation or knock down of core PCP components, including pk or sple, does not affect wing shape... 59." Reference 59 is quite old and as far as I can tell shows neither images nor quantifications of the wing shape phenotype (not sure it uses "knockdown" either - unless you mean hypomorph?). A more recent publication Piscitello-Gomez et al Elife 2023 shows a very subtle but significant wing shape phenotype in core PCP mutants. It doesn't change your logic, but I would change the statement to be more accurate by saying "mutation of core PCP components has only subtle changes in adult wing shape"

      Response: Thank you for pointing this out; we have revised the manuscript accordingly.

      **Referee cross-commenting**

      Reviewer2: Reviewer 2 makes the statement: "The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing."

      I disagree - the DV boundary wraps around the entire margin of the adult wing (as correctly drawn with the pink line in Fig 2A). It is not the same as the wide axis of the adult wing (perpendicular to the AP boundary). It is not trivial to map the proximal-distal axis of the larval wing to the proximal-distal axis of the adult, due to the changes in shape that occur during eversion. Thus, I find it much easier to look at the exact measurement that the authors make, and it is much more standard in the field, rather than what the reviewer suggests. Alternatively, one could I guess measure in the adult the ratio of the DV margin length (almost the circumference of the blade?) to the AP boundary length. That may be a more direct comparison. Actually the authors leave out the term "boundary" - what they call AP is actually the AP boundary, not the AP axis, and likewise for the DV - what they measure is DV boundary, but I only noticed that in the second read-through now. Just another note, these measurements of the pouch really only correspond to the very distal part of the wing blade, as so much of the proximal blade comes from the folds in the wing disc. Therefore, a measurement of only distal wing shape would be more comparable.

      Response: We thank Reviewer 1 for their comments here. In terms of the region measured, we measure to the inner Wg ring in the disc. The location of this ring in the adult is actually more proximal than described above (eg see Fig 1B of Liu, X., Grammont, M. & Irvine, K. D. Roles for scalloped and vestigial in regulating cell affinity and interactions between the wing blade and the wing hinge. Developmental Biology 228, 287–303 (2000)), and this defines roughly the region we have measured in adult wings (with the caveat noted above that the measurements in the disc can be affected by curvature and the hinge/pouch fold, which we will address).

      Reviewer 2 states that authors cannot definitively conclude anything about mechanical tension from their reported cutting data because the authors have not looked at initial recoil velocity. I strongly disagree. __The wing disc tissue is elastic on much longer timescales than what's considered after laser ablation (even hours), and the shape of the tissue after it equilibrates from a circular cut (1-2min) can indeed be used to infer tissue stresses (see Dye et al Elife 2021, Piscitello-Gomez et al eLife 2023, Tahaei et al arXiv 2024).__ In the wing disc, the direction of stresses inferred from initial recoil velocity is correlated with the direction of stresses inferred from analysing the equilibrium shape after a circular cut. Rearrangements, a primary mechanism of fluidization in epithelia, do not occur within 1'. Analysing the equilibrium shape after circular ablation may be more accurate for assessing tissue stresses than initial recoil velocity - in Piscitello-Gomez et al 2023, the authors found that a prickle mutation (PCP pathway) affected initial recoil velocity but not tissue stresses in the pupal wing. Such equilibrium circular cuts have also been used to analyze stresses in the avian embryo, where it correlates with directions of stress gathered from force inference methods (Kong et al Scientific Reports 2019). The Tribolium example noted by the reviewer is on the timescale of tens to hundreds of minutes - much longer than the timescale of laser ablation retraction. It is true the analysis of the ablation presented in this paper is not at the same level as those other cited papers and could be improved. But I don't think the analysis would be improved by additional experiments doing timelapse of initial retraction velocity.

      Response: Thank you, we agree with Reviewer 1 here.

      Reviewer 2 states "If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges". Not true in this case. Myosin II accumulates along long boundaries (Legoff and Lecuit 2013). "Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult," Agreed - but this is well beyond the scope of this manuscript. The authors clearly show that there is a change of cell shape, at least in these two regions. Better would be to quantify it throughout the pouch and across multiple discs. Similar point for myosin quantifications - yes, polarity would be interesting and possible to look at in these data, and it would be better to do so on multiple discs, but the lack of overall myosin on the junctions shown here is not nothing. Interpreting what Ft/Ds does to influence tension and myosin and eventually tissue shape is a big question that's not answered here. I think the authors do not claim to fully understand this though, and maybe further toning down the language of the conclusions could help.

      Response: We agree with Reviewer 1 here and will also add quantitation of myosin across multiple discs and will include higher magnification myosin images and polarity tests.

      Reviewer 3: I agree with many of the points raised by Reviewer 3, in particular that relevant for Fig 1. The additional experiments looking at myosin II localization and laser ablation in the other perturbations (Hippo and Rok mutants/RNAi) would certainly strengthen the conclusions.

      Response: Reviewer 3's comment on Fig 1 requests Ab stains to assess recovery of expression after downshift, which we will do.

      We will add examination of myosin localization in hpo RNAi wing discs, and in the ds/rok combinations. We note that the effects of Rok manipulations on myosin and on recoil velocity have been described previously (eg Rauskolb et al 2014).

      Reviewer #1 (Significance (Required)): I think the work provides a clear conceptual advance, arguing that the Ft/Ds pathway can influence mechanical stress independently of its interaction with Hippo and growth. Such a finding, if conserved, could be quite important for those studying morphogenesis and Fat function in this and other organisms. For this point, the genetic approach is a clear strength. Previous work in the Drosophila wing has already shown an adult wing phenotype for Ft/Ds mutations that was attributed to its role in the larval growth phase, as marked clones show aberrant growth in mutants. The novelty of this work is the dissection of the temporal progression of this phenotype and how it relates to Hippo and myosin II activation. It remains unclear exactly how Ft/Ds may affect tissue tension, except that it involves a downregulation of myosin II - the mechanism of that is not addressed here and would involve considerably more work. I think the temporal analysis of the wing pouch shape was quite revealing, providing novel information about how the phenotype evolves in time, in particular that there is already a phenotype quite early in development. As mentioned above, however, the lack of consideration of the wing disc as a 3D object is a potential limitation. While the audience is likely mostly developmental biologists working in basic research, it may also interest those studying the pathway in other contexts, including in vertebrates given its conservation and role in other processes.

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)): __ The manuscript begins with very nice data from a ts sensitive period experiment. Instead of a ts mutation, the authors induced RNAi in a temperature dependent manner. The results are striking and strong. Knockdown of FT or DS during larval stages to late L3 changed shape while knockdown of FT or DS during later pupal stages did not. This indicates they are required during larval, not pupal stages of wing development for this shape effect. They did shift-up or shift-down at "early pupa stage" but precisely what stage that means was not described anywhere in the manuscript. White prepupal? Time? Likewise a shift-down was done at "late L3" but that meaning is also vague. Moreover, I was surprised to see they did not do a shift-up at the late L3 stage, to give completeness to the experiment. Why?

      Response: We have added more precise descriptions of the timing, and we will also add the requested late L3 shift-up experiment.

      Looking at the "shape" of the larval wing pouch they see a difference in the mutants. The pouch can be approximated as an ellipse, but with differing topology to the adult wing. Here, they muddled the analysis. The adult wing surface is analogous to one hemisphere of the larval wing pouch, i.e., either dorsal or ventral compartment. The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing. They confusingly call this latter metric the "DV length" and the former metric the "AP length", and in fact they do not measure the PD length but PD+DP length. Confusing. Please change to make this consistent with earlier analysis of the adult and invert the reported ratio and divide by two.

      Then you would find the larval PD/AP ratio is smaller in the FT and DS mutants than wildtype, which resembles the smaller PD/AP ratio seen in the mutant adult wings. Totally consistent and also provides further evidence with the ts experiments that FT and DS exert shape effects in the larval phase of life.

      Response: As noted by Reviewer 1 in cross-commenting, some of the statements made by Reviewer 2 here are incorrect, eg “The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing.” They are correct where they note that the A-P length we measure in the discs is actually equivalent to 2x the adult wing length, since we are measuring along both the dorsal and ventral wing, but this makes no difference to the analysis as the point is to compare shape between time points and genotypes, not to make inferences based on the absolute numbers obtained. The numerical manipulations suggested are entirely feasible but we think they are unnecessary.
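      For readers following the arithmetic, here is a minimal Python sketch of the conversion Reviewer 2 proposes (invert the reported ratio, then divide by two). It assumes, as the reviewer's description implies but the text does not state explicitly, that the reported disc ratio is the DV-boundary length divided by the AP-boundary length; the function name and the numerical values are hypothetical and are not data from the manuscript.

```python
def reviewer2_pd_ap_ratio(reported_ratio):
    """Invert the reported disc ratio and halve it, as Reviewer 2 suggests.

    Assumption: reported_ratio = (DV-boundary length) / (AP-boundary length),
    where the AP-boundary length spans both dorsal and ventral surfaces.
    """
    if reported_ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 / reported_ratio) / 2.0


# Hypothetical reported ratios for two genotypes (illustration only).
control_reported = 1.25
mutant_reported = 2.0

control_pd_ap = reviewer2_pd_ap_ratio(control_reported)
mutant_pd_ap = reviewer2_pd_ap_ratio(mutant_reported)

# The conversion is strictly decreasing in the reported ratio, so which
# genotype is "rounder" is unchanged; only the direction of the scale flips.
ordering_preserved = (control_reported < mutant_reported) == (control_pd_ap > mutant_pd_ap)
```

      Because the conversion is monotonic, any comparison between genotypes or time points gives the same qualitative answer before and after it, which is consistent with the point made in the response above.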

      The remainder of the manuscript has experimental results that are more problematic, and really the authors do not figure out how the shape effect in larval stages is altered. I outline below the main problems.

      1. They compare the FT DS shape phenotypes to those of mutants or knockdowns in Hippo pathway genes (Hippo is known to be downstream of FT and DS). They find these Hippo perturbations do have shape effects trending in same direction as FT and DS effects. Knockdown reduces the PD/AP ratio while overexpressing WARTS increases the PD/AP ratio. The effect magnitudes are not as strong, but then again, they are using hypomorphic alleles and RNAi, which often induces partial or hypomorphic phenotypes. The effect strength is comparable when wing pouches are young but then dissipates over time, while FT and DS effects do not dissipate over time. The complexity of the data does not negate the idea that Hippo signaling is also playing some role and could be downstream of FT and DS in all of this. But the authors really downplay the data to the point of stating "These results imply that Ds-Fat influences wing pouch shape during wing disc growth separately from its effects on Hippo signaling." I think a more expansive perspective is needed given the caveats of the experiments.

      Response: Our results emphasize that the effects of Ds-Fat on wing shape cannot be explained solely by effects on Hippo signaling, eg as we stated on page 7 “These observations suggest that Hippo signaling contributes to, but does not fully explain, the influence of ds or fat on adult wing shape.” We also note that impairment of Hippo signaling has similar effects in younger discs, but very different effects in older discs, which clearly indicates that they are having very different effects during disc growth; we will revise the text to make sure our conclusions are clear.

      The reviewer wonders whether some of the differences could be due to the nature of the alleles or gene knockdown. First, the ex, ds, and fat alleles that we use are null alleles (eg see FlyBase), so it is not correct to say that we use only hypomorphic alleles and RNAi. We do use a hypomorphic allele for wts, and RNAi for hpo, for the simple reason that null alleles in these genes are lethal, so adult wings could not be examined. A further issue that is not commented on by the reviewer, but is more relevant here, is that there are multiple inputs into Hippo signaling, so of course even a null allele for ex, ds or fat is not a complete shutdown of Hippo signaling. Nonetheless, one can estimate the relative impairment of Hippo signaling by measuring the increased size of the wings, and from this perspective the knockdown conditions that we use are associated with roughly comparable levels of Hippo pathway impairment, so we stand by our results. We do, however, recognize that these issues could be discussed more clearly in the text, and will do so in a revised manuscript.

      Puzzlingly, this lack of taking seriously a set of complex results does not transfer to another set of experiments in which they inhibit or activate ROK, the rho kinase. When ROK is perturbed, they also see weak effects on shape when compared to FT or DS perturbation. This weakness is seen in adults, larvae, clones and in epistasis experiments. The epistasis experiment in particular convincingly shows that constitutive ROK activation is not epistatic to loss of DS; in fact if anything the DS phenotype suppresses the ROK phenotype. These results also show that one cannot simply explain what FT and DS are doing with some single pathway or effector molecule like ROK. It is more complex than that.

      What I really think was needed were experiments combining FT and DS knockdown with other mutants or knockdowns in the Hippo and Rho pathways, and even combining Hippo and Rho pathway mutants with FT or DS intact, to see if there are genetic interactions (additive, synergistic, epistatic) that could untangle the phenotypic complexity.

      Response: We’re puzzled by these comments. First, we never claimed that what Fat or Ds do could be explained simply by manipulation of Rok (eg, see Discussion). Moreover, examination of wings and wing discs where ds is combined with Rho manipulations is in Fig 7, and Hippo and Rho pathway manipulation combinations are in Fig S5. We don’t think that combining ds or fat mutations with other Hippo pathway mutations would be informative, as it is well established that Ds-Fat are upstream regulators of Hippo signaling.

      Laser cutting experiments were done to see if there is anisotropy in tissue tension within the wing pouch. This was to test a favored idea that FT and DS activity generates anisotropy in tissue tension, thereby controlling overall anisotropic shape of the pouch. However there is a fundamental flaw to their laser cutting analysis. Laser cutting is a technique used to measure mechanical tension, with initial recoil velocity directly proportional to the tissue's tension. By cutting a small line and observing how quickly the edges of the cut snap apart, people can quantify the initial recoil velocity and infer the stored mechanical stress in the tissue at the time of ablation. Live imaging with high-speed microscopy is required to capture the immediate response of the tissue to the cut since initial recoil velocity occurs in the first few seconds. A kymograph is created by plotting the movement of the tissue edges over this time scale, perpendicular to the cut. The initial recoil velocity is the slope of the kymograph at time zero, representing how fast the severed edges move apart. A higher recoil velocity indicates higher mechanical tension in the tissue. However, the authors did not measure this initial recoil velocity but instead measured the distance between the severed edges at one time point: 60 seconds after cutting. This is much later than the time point at which the recoil usually begins to dissipate or decay. This decay phase typically lasts a minute or two, during which time the edges continue to separate but at a progressively slower rate. This time-dependent decay of the recoil reveals whether the tissue behaves more like a viscous fluid or an elastic solid. Therefore, the distance metric at 60 seconds is a measurement of both tension and the material properties of the cells. One cannot know then whether a difference in the distance is due to a difference in tension or fluidity of the cells. 
      If the authors made measurements of edge separation at several time points in the first 10 seconds after ablation, they could deconvolute the two. Otherwise their analysis is inconclusive. Anisotropy in recoil could be caused by greater tissue fluidity along one axis. A gradient of cell fluidity along one axis of a tissue has been observed in the amnioserosa of Tribolium, for example. (Related and important point: was the anisotropy of recoil oriented along the PD or AP axis, or not oriented to either axis? This key point was never stated.)
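      The measurement described by Reviewer 2 can be illustrated with a minimal Python sketch that estimates an initial recoil velocity as the least-squares slope of edge separation against time over the first few seconds after the cut. This is a simplified stand-in for reading the slope of a kymograph at time zero; the function name, the 10-second window default, and both example traces are hypothetical and are not data from the manuscript.

```python
def initial_recoil_velocity(times, separations, window=10.0):
    """Least-squares slope of edge separation vs. time for t <= window.

    times       -- seconds after ablation
    separations -- distance between the severed edges (e.g. micrometres)
    window      -- only points with time <= window are used for the fit
    """
    pts = [(t, d) for t, d in zip(times, separations) if t <= window]
    if len(pts) < 2:
        raise ValueError("need at least two time points inside the window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_d = sum(d for _, d in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("time points inside the window must differ")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in pts)
    return sxy / sxx


# Two hypothetical traces that reach the same separation at 60 s but
# separate at different rates during the first 10 s.
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 60.0]
trace_fast = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
trace_slow = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 6.0]

v_fast = initial_recoil_velocity(times, trace_fast)
v_slow = initial_recoil_velocity(times, trace_slow)
```

      In this toy example the two traces are indistinguishable by their 60-second separation alone but differ twofold in early slope. Whether that distinction matters for the wing disc on this timescale is the point debated between the reviewers.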

      The authors cannot definitively conclude anything about mechanical tension from their reported cutting data.

      Response: As noted by Reviewer 1 in cross-commenting, there is no fluidity on a time scale of 1 minute in the wing disc, and circular ablations are an established method to investigate tissue stress. We chose the circular ablation method in part because it interrogates stress over a larger area, whereas cutting individual junctions is subject to more variability, particularly as the orientation of the junction (eg radial vs tangential) impacts the tension detected in the wing disc. Nonetheless, we will add recoil measurements to the revised manuscript to complement our circular ablations, which we expect will provide independent confirmation of our results and address the Reviewer’s concern here.

      They measured the eccentricity of wing pouch cells near the pouch border, and found they were highly anisotropic compared to DS mutant cells at comparable locations. Cells were elongated, but again which axis (PD or AP), if either, they were elongated along was never stated. If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges. Thus, recoil velocity after laser cutting would be stronger along the axis aligned with short cell edges. It looks like the cutting anisotropy they see is greater along the axis aligned with long cell edges. Of course, if the cell anisotropy is caused by a pulling force exerted by the pouch boundary, then it would stretch the cells. This would in fact fit their cutting data. But then again, the observed cell anisotropy could also be caused by variation in the fluid-solid properties of the wing cells as discussed earlier. Compression of the cells then would deform them anisotropically and produce the anisotropic shapes that were observed. Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult.

      Response: As noted by Reviewer 1 in cross-commenting, it is well established that tension and myosin are higher along long edges in the proximal wing. However, we acknowledge that we could do a better job of making the location and orientation of the regions shown in these experiments clear, and we will address this in a revised manuscript.

      The imaging and analysis of the myosin RLC by GFP tagging is also flawed. SQH-GFP is a tried and true proxy for myosin activity in Drosophila. Although the authors imaged the wing pouch of wildtype and DS mutants, they did so under low magnification to image the entire pouch. This gives a "low-res" perspective of overall myosin but what they needed to do was image at high magnification in that proximal region of the pouch and see if Sqh-GFP is polarized in wildtype cells along certain cell edges aligned with an axis. And if such a polarity is observed, is it present or absent in the DS mutant? From the data shown in Figure 5, I cannot see any significant difference between wildtype and knocked down samples at this low resolution. Any difference, if there is any, is not really interpretable.

      Response: We agree that examination of myosin localization at high resolution to see if it is polarized is a worthwhile experiment. We did in fact do this, and myosin (Sqh:GFP) appeared unpolarized in ds mutants. However, the levels of myosin were so low that we didn’t feel confident in our assessment, so we didn’t include it. We now recognize that this was a mistake, and we will include high resolution myosin images and assessments of (lack of) polarity in a revised manuscript to address this comment.

      In conclusion, the manuscript has multiple problems that make it impossible for the authors to make the claims they make in the current manuscript. And even if they calibrated their interpretations to fit the data, there is not much of a simple clear picture as to how FT and DS regulate pouch eccentricity in the larval wing.

      Response: We think that the legitimate issues raised are addressable, as described above, while some of the criticisms are incorrect (as noted by Reviewer 1).

      Reviewer #2 (Significance (Required)): This manuscript describes experiments studying the role that the protocadherins FAT and DACHSOUS play in determining the two dimensional "shape" of the fruit fly wing. By "shape", the manuscript really means how much the wing's outline, when approximated as an ellipse, deviates from a circle. The elliptical approximations of FT and DS mutant wings more closely resemble a circle compared to the more eccentric wildtype wings. This suggests the molecules contribute to anisotropic growth in some way. A great deal of attention has been paid to how FT and DS regulate overall organ growth and planar cell polarity, and the Irvine lab has made extensive contributions to these questions over the years. Somewhat understudied is how FT and DS regulate wing shape, and this manuscript focuses on that. It follows up on an interesting result that the Irvine lab published in 2019, in which mud mutants randomized spindle pole orientation in wing cells but did not change the eccentricity of wings, ruling out biased cell division orientation as a mechanism for the anisotropic growth.

      __Reviewer #3 (Evidence, reproducibility and clarity (Required)): __ Summary The authors investigate the mechanisms underlying epithelial morphogenesis using the Drosophila wing as a model system. Specifically, they analyze the contribution of the conserved Fat/Ds pathway to wing shape regulation. The main claim of the manuscript is that Ds/Fat controls wing shape by regulating tissue mechanical stress through MyoII levels, independently of Hippo signaling and tissue growth.

      Major Comments To support their main conclusions, the authors should address the following major points and consider additional experiments where indicated. Most of the suggested experiments are feasible within a reasonable timeframe, while a few are more technically demanding but would substantially strengthen the manuscript's central claims.

      Figure 1: The authors use temperature-sensitive inactivation of Fat or Ds to determine the developmental window during which these proteins regulate wing shape. To support this claim, it is essential to demonstrate that upon downshift during early pupal stages, Ds or Fat protein levels are restored to normal. For consistency, please include statistical analyses in Figure 1P and ensure that all y-axis values in shape quantifications start at 1.

      Response: We will do the requested antibody stains for Fat (Ds antibody is unfortunately no longer available, but the point made by the reviewer can be addressed by Fat as the approach and results are the same for both genes). We have also added the requested statistical analysis to Fig 1P, and adjusted the scales as requested.

      Figure 2: The authors propose that wing shape is regulated by Fat/Ds during larval development. However, Figure 2L suggests that wing elongation occurs in control conditions between 6 and 12 h APF, while this elongation is not observed upon Ds RNAi. The authors should therefore perform downshift experiments while monitoring wing shape during the pupal stage to substantiate their main claim. In addition, equivalent data for Fat loss of function should be included to support the assertion that Fat and Ds act similarly.

      Response: As noted in our response to point 1 of Reviewer 1, we agree that there does seem to be relatively more elongation in control wings than in ds RNAi wings, but we think this likely reflects effects of ds on growth during larval stages, and we will revise the manuscript to comment on this.

      We will also add the suggested examination of fat RNAi pupal wings.

      The suggested examination of pupal wing shape in downshift experiments is unfortunately not feasible. Our temperature shift experiments expressing ds or fat RNAi are done using the UAS-Gal4-Gal80ts system. We also use the UAS-Gal4 system to mark the pupal wing. If we do a downshift experiment, then expression of the fluorescent marker will be shut down in parallel with the shutdown of ds or fat RNAi, so the pupal wings would no longer be visible.

      Figure 3: The authors state that "These observations indicate that Ds-Fat signaling influences wing shape during the initial formation of the wing pouch, in addition to its effects during wing growth." This conclusion is not fully supported, as the authors only examine wing shape at 72 h AEL. At this stage, fat or ds mutant wings already display altered morphology. The authors could only make this claim if earlier time points were fully analyzed. In fact, the current data rather suggest that Ds function is required before 72 h AEL, as a rescue of wing shape is observed between 72 and 120 h AEL.

      Response: First, I think we are largely in agreement with the Reviewer, as the basis for our saying that Ds-Fat are likely required during initial formation of the wing pouch is that our data show they must be required before 72 h AEL. Second, 72 h is the earliest that we can look using Wg expression as a marker, as at earlier stages it is in a ventral wedge rather than a ring around the future wing pouch + DV line (e.g., see Fig 8 of Tripathi, B. K. & Irvine, K. D. The wing imaginal disc. Genetics (2022) doi:10.1093/genetics/iyac020). We can revise the text to make sure this is clear.

      Figure 4: The authors state that "The influence of Ds-Fat on wing shape is not explained by Hippo signaling." However, this conclusion is not supported by their data, which show that partial loss of ex or hippo causes clear defects in wing shape. In addition, the initial wing shape is affected in wts and ex mutants, and hypomorphic alleles were used for these experiments. Therefore, the main conclusion requires revision. It would be useful to include a complete dataset for hippo RNAi, ex, and wts conditions in Figure S1. The purpose and interpretation of the InR^CA experiments are also unclear. While InR^CA expression can increase tissue growth, Hippo signaling has functions beyond growth control. Whether Hippo regulates tissue shape through InR^CA-dependent mechanisms remains to be clarified.

      Response: As noted in our response to point 1 of Reviewer 2 - our results emphasize that the effects of Ds-Fat on wing shape cannot be explained solely by effects on Hippo signaling, eg as we stated on page 7 “These observations suggest that Hippo signaling contributes to, but does not fully explain, the influence of ds or fat on adult wing shape.” We also note that impairment of Hippo signaling has similar effects in younger discs, but very different effects in older discs, which clearly indicates that they are having very different effects during disc growth. We will make some revisions to the text to make sure that our conclusions are clear throughout.

      While we used a hypomorphic allele for wts, because null alleles are lethal, the ex allele that we used is described in Flybase as an amorph, not a hypomorph, and as noted in our response to Reviewer 2, we will add some discussion about relative strength of effects on Hippo signaling.

      In Fig S1, we currently show adult wings for ex[e1] and RNAi-Hpo, and wing discs for wts[P2]/wts[x1], and for ex[e1]. The wts combination does not survive to adult so we can’t include this. We will, however, add hpo RNAi wing discs as requested.

      The purpose of including InR^CA experiments is to try to separate effects of Hippo signaling from effects of growth, because InR signaling manipulation provides a distinct mechanism for increasing growth. We will revise the text to try to make sure this is clearer.

      Figure 5: This figure presents images of MyoII distribution, but no quantification across multiple samples is provided. Moreover, the relationship between changes in tissue stress and MyoII levels remains unclear. Performing laser ablation and MyoII quantification on the same samples would provide stronger support for the proposed conclusions.

      Response: We will revise the quantitation so that it presents analysis of averages across multiple discs, rather than representative examples of single discs.

      Both the myosin imaging and the laser ablation were done on the same genotypes (wildtype and ds) at the same ages (108 h AEL), so we think it is valid to directly compare them. Moreover, the imaging conditions for laser ablation and myosin quantification are different, so it’s not feasible to do them at the same time. For ablations we image a single Z plane and a single channel (which has to include Ecad, or an equivalent junctional marker) on live discs, so that fast imaging can be done. For myosin imaging we acquire multiple Z stacks and multiple channels (e.g., Ecad and Myo), which is not compatible with the fast imaging needed for analysis of laser ablations.

      Figure 6: It is unclear when Rok RNAi and Rok^CA misexpression were induced. To substantiate their claims, the authors should measure both MyoII levels and mechanical tension under the different experimental conditions in which wing shape was modified through Rok modulation (i.e. the condition shown in Fig. 7G). For comparison, fat and ds data should be added to Fig 6H. Overall, the effects of Rok modulation appear milder than those of Fat manipulation. Given that Dachs has been shown to regulate tension downstream of Fat/Ds, it would be informative to determine whether tissue tension is altered in dachs mutant wings and to assess the relative contribution of Dachs- versus MyoII-mediated tension to wing shape control. It would also be interesting to test whether Rok activation can rescue dachs loss-of-function phenotypes.

      Response: In these Rok experiments there was no separate temporal control of Rok RNAi or Rok^CA expression, they were expressed under nub-Gal4 control throughout development.

      We will add examination of myosin in combinations of ds RNAi and rok manipulation as in Fig 7G to a revised manuscript.

      Data for fat and ds comparable to that shown in Fig 6H is already presented in Fig 3D, and we don’t think it’s necessary to reproduce this again in Fig 6H.

      We agree that the effects of Rok manipulations are milder than those of Fat manipulations; as we try to discuss, this could be because the pattern or polarity of myosin is also important, not just the absolute level, and we will add assessment of myosin polarity.

      The suggestion to also look at dachs mutants is reasonable, and we will add this. In addition, we plan to add an "activated" Dachs (a Zyxin-Dachs fusion protein previously described in Pan et al 2013) that we anticipate will provide further evidence that the effects of Ds-Fat are mediated through Dachs. We will also add the suggested experiment combining Rok activation with dachs loss-of-function.

      Figure 7: The authors use genetic interactions to support their claim that Fat controls wing shape independently of Hippo signaling. However, these interactions do not formally exclude a role for Hippo. Moreover, previous work has shown that tissue tension regulates Hippo pathway activity, implying that any manipulation of tension could indirectly affect Hippo and growth. To provide more direct evidence, the authors should further analyze MyoII localization and tissue tension under the various experimental conditions tested (as also suggested above).

      Response: As discussed above, our data clearly show that Fat has effects independently of Hippo signaling that are crucial for its effects on wing shape, but we did not mean to imply that the regulation of Hippo signaling by Fat makes no contribution to wing shape control, and we will revise the text to make this clearer. We will also add additional analysis of myosin localization, as described above.

      Reviewer #3 (Significance (Required)): How organ growth and shape are controlled remains a fundamental question in developmental biology, with major implications for our understanding of disease mechanisms. The Drosophila wing has long served as a powerful and informative model to study tissue growth and morphogenesis. Work in this system has been instrumental in delineating the conserved molecular and mechanical processes that coordinate epithelial dynamics during development. The molecular regulators investigated by the authors are highly conserved, suggesting that the findings reported here are likely to be of broad biological relevance.

      Previous studies have proposed that anisotropic tissue growth regulates wing shape during larval development and that such anisotropy induces mechanical responses that promote MyoII localization (Legoff et al., 2013, PMID: 24046320; Mao et al., 2013, PMID: 24022370). The Ds/Fat system has also been shown to regulate tissue tension through the Dachs myosin, a known modulator of the Hippo/YAP signaling pathway. As correctly emphasized by the authors, the respective contributions of anisotropic growth and mechanical tension to wing shape control remain only partially understood. The current study aims to clarify this issue by analyzing the role of Fat/Ds in controlling MyoII localization and, consequently, wing shape. This represents a potentially valuable contribution. However, the proposed mechanistic link between Fat/Ds and MyoII localization remains insufficiently explored. Moreover, the role of MyoII is not fully discussed in the broader context of Dachs function and its known interactions with MyoII (Mao et al., 2011, PMID: 21245166; Bosveld et al., 2012, PMID: 22499807; Trinidad et al., 2024, PMID: 39708794). Most importantly, the experimental evidence supporting the authors' conclusions would benefit from further strengthening. It should also be noted that disentangling the relative contributions of anisotropic growth and MyoII polarization to tissue shape and size remains challenging, as MyoII levels are known to increase in response to anisotropic growth (Legoff et al., 2013; Mao et al., 2013), and mechanical tension itself can modulate Hippo/YAP signaling (Rauskolb et al., 2014, PMID: 24995985).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The authors investigate the mechanisms underlying epithelial morphogenesis using the Drosophila wing as a model system. Specifically, they analyze the contribution of the conserved Fat/Ds pathway to wing shape regulation. The main claim of the manuscript is that Ds/Fat controls wing shape by regulating tissue mechanical stress through MyoII levels, independently of Hippo signaling and tissue growth.

      Major Comments

      To support their main conclusions, the authors should address the following major points and consider additional experiments where indicated. Most of the suggested experiments are feasible within a reasonable timeframe, while a few are more technically demanding but would substantially strengthen the manuscript's central claims.

      Figure 1:

      The authors use temperature-sensitive inactivation of Fat or Ds to determine the developmental window during which these proteins regulate wing shape. To support this claim, it is essential to demonstrate that upon downshift during early pupal stages, Ds or Fat protein levels are restored to normal. For consistency, please include statistical analyses in Figure 1P and ensure that all y-axis values in shape quantifications start at 1.

      Figure 2:

      The authors propose that wing shape is regulated by Fat/Ds during larval development. However, Figure 2L suggests that wing elongation occurs in control conditions between 6 and 12 h APF, while this elongation is not observed upon Ds RNAi. The authors should therefore perform downshift experiments while monitoring wing shape during the pupal stage to substantiate their main claim. In addition, equivalent data for Fat loss of function should be included to support the assertion that Fat and Ds act similarly.

      Figure 3:

      The authors state that "These observations indicate that Ds-Fat signaling influences wing shape during the initial formation of the wing pouch, in addition to its effects during wing growth." This conclusion is not fully supported, as the authors only examine wing shape at 72 h AEL. At this stage, fat or ds mutant wings already display altered morphology. The authors could only make this claim if earlier time points were fully analyzed. In fact, the current data rather suggest that Ds function is required before 72 h AEL, as a rescue of wing shape is observed between 72 and 120 h AEL.

      Figure 4:

      The authors state that "The influence of Ds-Fat on wing shape is not explained by Hippo signaling." However, this conclusion is not supported by their data, which show that partial loss of ex or hippo causes clear defects in wing shape. In addition, the initial wing shape is affected in wts and ex mutants, and hypomorphic alleles were used for these experiments. Therefore, the main conclusion requires revision. It would be useful to include a complete dataset for hippo RNAi, ex, and wts conditions in Figure S1. The purpose and interpretation of the InR^CA experiments are also unclear. While InR^CA expression can increase tissue growth, Hippo signaling has functions beyond growth control. Whether Hippo regulates tissue shape through InR^CA-dependent mechanisms remains to be clarified.

      Figure 5:

      This figure presents images of MyoII distribution, but no quantification across multiple samples is provided. Moreover, the relationship between changes in tissue stress and MyoII levels remains unclear. Performing laser ablation and MyoII quantification on the same samples would provide stronger support for the proposed conclusions.

      Figure 6:

      It is unclear when Rok RNAi and Rok^CA misexpression were induced. To substantiate their claims, the authors should measure both MyoII levels and mechanical tension under the different experimental conditions in which wing shape was modified through Rok modulation (i.e. the condition shown in Fig. 7G). For comparison, fat and ds data should be added to Fig 6H. Overall, the effects of Rok modulation appear milder than those of Fat manipulation. Given that Dachs has been shown to regulate tension downstream of Fat/Ds, it would be informative to determine whether tissue tension is altered in dachs mutant wings and to assess the relative contribution of Dachs- versus MyoII-mediated tension to wing shape control. It would also be interesting to test whether Rok activation can rescue dachs loss-of-function phenotypes.

      Figure 7:

      The authors use genetic interactions to support their claim that Fat controls wing shape independently of Hippo signaling. However, these interactions do not formally exclude a role for Hippo. Moreover, previous work has shown that tissue tension regulates Hippo pathway activity, implying that any manipulation of tension could indirectly affect Hippo and growth. To provide more direct evidence, the authors should further analyze MyoII localization and tissue tension under the various experimental conditions tested (as also suggested above).

      Significance

      How organ growth and shape are controlled remains a fundamental question in developmental biology, with major implications for our understanding of disease mechanisms. The Drosophila wing has long served as a powerful and informative model to study tissue growth and morphogenesis. Work in this system has been instrumental in delineating the conserved molecular and mechanical processes that coordinate epithelial dynamics during development. The molecular regulators investigated by the authors are highly conserved, suggesting that the findings reported here are likely to be of broad biological relevance.

      Previous studies have proposed that anisotropic tissue growth regulates wing shape during larval development and that such anisotropy induces mechanical responses that promote MyoII localization (Legoff et al., 2013, PMID: 24046320; Mao et al., 2013, PMID: 24022370). The Ds/Fat system has also been shown to regulate tissue tension through the Dachs myosin, a known modulator of the Hippo/YAP signaling pathway. As correctly emphasized by the authors, the respective contributions of anisotropic growth and mechanical tension to wing shape control remain only partially understood. The current study aims to clarify this issue by analyzing the role of Fat/Ds in controlling MyoII localization and, consequently, wing shape. This represents a potentially valuable contribution. However, the proposed mechanistic link between Fat/Ds and MyoII localization remains insufficiently explored. Moreover, the role of MyoII is not fully discussed in the broader context of Dachs function and its known interactions with MyoII (Mao et al., 2011, PMID: 21245166; Bosveld et al., 2012, PMID: 22499807; Trinidad et al., 2024, PMID: 39708794). Most importantly, the experimental evidence supporting the authors' conclusions would benefit from further strengthening. It should also be noted that disentangling the relative contributions of anisotropic growth and MyoII polarization to tissue shape and size remains challenging, as MyoII levels are known to increase in response to anisotropic growth (Legoff et al., 2013; Mao et al., 2013), and mechanical tension itself can modulate Hippo/YAP signaling (Rauskolb et al., 2014, PMID: 24995985).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript begins with very nice data from a ts sensitive period experiment. Instead of a ts mutation, the authors induced RNAi in a temperature dependent manner. The results are striking and strong. Knockdown of FT or DS during larval stages to late L3 changed shape while knockdown of FT or DS during later pupal stages did not. This indicates they are required during larval, not pupal stages of wing development for this shape effect. They did shift-up or shift-down at "early pupa stage" but precisely what stage that means was not described anywhere in the manuscript. White prepupal? Time? Likewise a shift-down was done at "late L3" but that meaning is also vague. Moreover, I was surprised to see they did not do a shift-up at the late L3 stage, to give completeness to the experiment. Why?

      Looking at the "shape" of the larval wing pouch they see a difference in the mutants. The pouch can be approximated as an ellipse, but with differing topology to the adult wing. Here, they muddled the analysis. The adult wing surface is analogous to one hemisphere of the larval wing pouch, i.e., either dorsal or ventral compartment. The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing. They confusingly call this latter metric the "DV length" and the former metric the "AP length", and in fact they do not measure the PD length but PD+DP length. Confusing. Please change to make this consistent with earlier analysis of the adult and invert the reported ratio and divide by two. Then you would find the larval PD/AP ratio is smaller in the FT and DS mutants than wildtype, which resembles the smaller PD/AP ratio seen in the mutant adult wings. Totally consistent and also provides further evidence with the ts experiments that FT and DS exert shape effects in the larval phase of life.
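
      The conversion requested above can be written out as a short sketch. It assumes, as the paragraph implies but the manuscript may define differently, that the reported larval ratio is "DV length" divided by "AP length", where "DV length" corresponds to the adult AP axis and "AP length" spans both compartments (PD+DP). It also assumes the dorsal and ventral halves are equal in length. The function name and the example numbers are hypothetical.

```python
def larval_pd_over_ap(reported_ratio: float) -> float:
    """Convert a reported larval ratio into an adult-comparable PD/AP ratio.

    Assumption (not confirmed by the manuscript): reported_ratio is
    "DV length" / "AP length" = AP / (PD + DP), with PD == DP.
    Inverting gives (PD + DP) / AP = 2 * PD / AP; halving gives PD / AP.
    """
    if reported_ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 / reported_ratio) / 2.0


# Hypothetical pouch: AP extent 100 units, PD extent 40 units per compartment,
# so the measured "AP length" is 80 and the reported ratio is 100 / 80.
example = larval_pd_over_ap(100.0 / 80.0)
```

      With these made-up numbers the converted value equals 40/100, the PD/AP ratio of a single compartment, which is the quantity directly comparable to the adult wing measurement.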

      The remainder of the manuscript has experimental results that are more problematic, and really the authors do not figure out how the shape effect in larval stages is altered. I outline below the main problems.

      1. They compare the FT DS shape phenotypes to those of mutants or knockdowns in Hippo pathway genes (Hippo is known to be downstream of FT and DS). They find these Hippo perturbations do have shape effects trending in the same direction as FT and DS effects. Knockdown reduces the PD/AP ratio while overexpressing WARTS increases the PD/AP ratio. The effect magnitudes are not as strong, but then again, they are using hypomorphic alleles and RNAi, which often induces partial or hypomorphic phenotypes. The effect strength is comparable when wing pouches are young but then dissipates over time, while FT and DS effects do not dissipate over time. The complexity of the data does not negate the idea that Hippo signaling is also playing some role and could be downstream of FT and DS in all of this. But the authors really downplay the data to the point of stating "These results imply that Ds-Fat influences wing pouch shape during wing disc growth separately from its effects on Hippo signaling." I think a more expansive perspective is needed given the caveats of the experiments.

      Puzzlingly, this lack of taking seriously a set of complex results does not transfer to another set of experiments in which they inhibit or activate ROK, the rho kinase. When ROK is perturbed, they also see weak effects on shape when compared to FT or DS perturbation. This weakness is seen in adults, larvae, clones and in epistasis experiments. The epistasis experiment in particular convincingly shows that constitutive ROK activation is not epistatic to loss of DS; in fact if anything the DS phenotype suppresses the ROK phenotype. These results also show that one cannot simply explain what FT and DS are doing with some single pathway or effector molecule like ROK. It is more complex than that.

      What I really think was needed were experiments combining FT and DS knockdown with other mutants or knockdowns in the Hippo and Rho pathways, and even combining Hippo and Rho pathway mutants with FT or DS intact, to see if there are genetic interactions (additive, synergistic, epistatic) that could untangle the phenotypic complexity.

      2. Laser cutting experiments were done to see if there is anisotropy in tissue tension within the wing pouch. This was to test a favored idea that FT and DS activity generates anisotropy in tissue tension, thereby controlling overall anisotropic shape of the pouch. However there is a fundamental flaw to their laser cutting analysis. Laser cutting is a technique used to measure mechanical tension, with initial recoil velocity directly proportional to the tissue's tension. By cutting a small line and observing how quickly the edges of the cut snap apart, people can quantify the initial recoil velocity and infer the stored mechanical stress in the tissue at the time of ablation. Live imaging with high-speed microscopy is required to capture the immediate response of the tissue to the cut since initial recoil velocity occurs in the first few seconds. A kymograph is created by plotting the movement of the tissue edges over this time scale, perpendicular to the cut. The initial recoil velocity is the slope of the kymograph at time zero, representing how fast the severed edges move apart. A higher recoil velocity indicates higher mechanical tension in the tissue. However, the authors did not measure this initial recoil velocity but instead measured the distance between the severed edges at one time point: 60 seconds after cutting. This is much later than the time point at which the recoil usually begins to dissipate or decay. This decay phase typically lasts a minute or two, during which time the edges continue to separate but at a progressively slower rate. This time-dependent decay of the recoil reveals whether the tissue behaves more like a viscous fluid or an elastic solid. Therefore, the distance metric at 60 seconds is a measurement of both tension and the material properties of the cells. One cannot know then whether a difference in the distance is due to a difference in tension or fluidity of the cells. If the authors made measurements of edge separation at several time points in the first 10 seconds after ablation, they can deconvolute the two. Otherwise their analysis is inconclusive. Anisotropy in recoil could be caused by greater tissue fluidity along one axis. A gradient of cell fluidity along one axis of a tissue has been observed in the amnioserosa of Tribolium, for example. (Related and important point: was the anisotropy of recoil oriented along the PD or AP axis, or not oriented to either axis? This key point was never stated.)

      The authors cannot definitively conclude anything about mechanical tension from their reported cutting data.

      3. They measured the eccentricity of wing pouch cells near the pouch border, and found they were highly anisotropic compared to DS mutant cells at comparable locations. Cells were elongated, but again which axis (PD or AP), if either, they were elongated along was never stated. If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges, not the long edges. Thus, recoil velocity after laser cutting would be stronger along the axis aligned with short cell edges. It looks like the cutting anisotropy they see is greater along the axis aligned with long cell edges. Of course, if the cell anisotropy is caused by a pulling force exerted by the pouch boundary, then it would stretch the cells. This would in fact fit their cutting data. But then again, the observed cell anisotropy could also be caused by variation in the fluid-solid properties of the wing cells as discussed earlier. Compression of the cells then would deform them anisotropically and produce the anisotropic shapes that were observed. Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult.

      4. The imaging and analysis of the myosin RLC by GFP tagging is also flawed. SQH-GFP is a tried and true proxy for myosin activity in Drosophila. Although the authors image the wing pouch of wildtype and DS mutants, they did so under low magnification to image the entire pouch. This gives a "low-res" perspective of overall myosin but what they needed to do was image at high magnification in that proximal region of the pouch and see if Sqh-GFP is polarized in wildtype cells along certain cell edges aligned with an axis. And if such a polarity is observed, is it present or absent in the DS mutant? From the data shown in Figure 5, I cannot see any significant difference between wildtype and knocked down samples at this low resolution. Any difference, if there is any, is not really interpretable.
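
      The distinction drawn in point 2, between the initial recoil velocity and a single separation measurement at 60 seconds, can be illustrated with a minimal sketch. It is not the authors' analysis. It assumes that edge separation follows a saturating exponential (a Kelvin-Voigt-like relaxation, a common simplification in the laser ablation literature) and estimates the initial velocity as the slope of a straight-line fit over the first second. All parameter values, sampling intervals and function names are hypothetical.

```python
import numpy as np


def separation(t, d_max, tau):
    """Assumed relaxation model: separation rises toward d_max with time constant tau."""
    return d_max * (1.0 - np.exp(-np.asarray(t, dtype=float) / tau))


def initial_recoil_velocity(t, d):
    """Slope of a straight-line fit to early separation-versus-time samples."""
    slope, _intercept = np.polyfit(np.asarray(t, dtype=float),
                                   np.asarray(d, dtype=float), 1)
    return slope


# Early sampling window: 0 to 1 s in 0.2 s steps (hypothetical frame rate).
t_early = np.arange(0.0, 1.01, 0.2)

# Two hypothetical tissues with the same plateau but different relaxation times.
v_fast = initial_recoil_velocity(t_early, separation(t_early, d_max=6.0, tau=5.0))
v_slow = initial_recoil_velocity(t_early, separation(t_early, d_max=6.0, tau=20.0))

# Separation at the single late time point used in the manuscript (60 s).
d60_fast = float(separation(60.0, d_max=6.0, tau=5.0))
d60_slow = float(separation(60.0, d_max=6.0, tau=20.0))

print(f"initial velocity: fast={v_fast:.2f}, slow={v_slow:.2f}")
print(f"separation at 60 s: fast={d60_fast:.2f}, slow={d60_slow:.2f}")
```

      In this toy case the two tissues differ severalfold in initial recoil velocity yet are nearly indistinguishable at 60 seconds, so a late single-time-point measurement need not track the early recoil. The converse case, equal initial velocity with different plateaus, can be built the same way by scaling d_max and tau together.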

      In conclusion, the manuscript has multiple problems that make it impossible for the authors to make the claims they make in the current manuscript. And even if they calibrated their interpretations to fit the data, there is not much of a simple clear picture as to how FT and DS regulate pouch eccentricity in the larval wing.

      Significance

      This manuscript describes experiments studying the role that the protocadherins FAT and DACHSOUS play in determining the two-dimensional "shape" of the fruit fly wing. By "shape", the manuscript really means how much the wing's outline, when approximated as an ellipse, deviates from a circle. The elliptical approximations of FT and DS mutant wings more closely resemble a circle compared to the more eccentric wildtype wings. This suggests the molecules contribute to anisotropic growth in some way. A great deal of attention has been paid to how FT and DS regulate overall organ growth and planar cell polarity, and the Irvine lab has made extensive contributions to these questions over the years. Somewhat understudied is how FT and DS regulate wing shape, and this manuscript focuses on that. It follows up on an interesting result that the Irvine lab published in 2019, in which mud mutants randomized spindle pole orientation in wing cells but did not change the eccentricity of wings, ruling out biased cell division orientation as a mechanism for the anisotropic growth.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this work, Tripathi et al address the open question of how the Fat/Ds pathway affects organ shape, using the Drosophila wing as a model. The Fat/Ds pathway is a conserved but complex pathway, interacting with Hippo signalling to affect growth and providing planar cell polarity that can influence cellular dynamics during morphogenesis. Here, authors use genetic perturbations combined with quantification of larval, pupal, and adult wing shape and laser ablation to conclude that the Ft/Ds pathway affects wing shape only during larval stages in a way that is at least partially independent of its interaction with Hippo and rather due to an effect on tissue tension and myosin II distribution. Overall the work is clearly written and well presented. I only have a couple major comments on the limitations of the work.

      Major comments:

      1. Authors conclude from data in Figures 1 and 2 that the Fat/Ds pathway only affects wing shape during larval stages. When looking at the pupal wing shape analysis in Figure 2L, however, it looks like there is a difference in wt over time (6h-18h, consistent with literature), but that difference in time goes away in RNAi-ds, indicating that actually there is a role for Ds in changing shape during pupal stages, although the phenotype is clearly less dramatic than that of larval stages. No statistical test was done over time (within the genotype), however, so it's hard to say. I recommend the authors test over time - whether 6h and 18h are different in wild type and in ds mutant. I think this is especially important because there is proximal overgrowth in the Fat/Ds mutants, much of which is contained in the folds during larval stages. That first fold, however, becomes the proximal part of the pupal wing after eversion and contracts during pupal stages to elongate the blade (Aigouy 2010, Etournay 2015). Also, according to Trinidad Curr Biol 2025, there is a role for Fat/Ds pathway in pupal stages. All of that to say that it seems likely that there would be a phenotype in pupal stages. It's true it doesn't show up in the adult wing in the experiments in Fig 1, but looking at the pupal wing itself is more direct - perhaps the very proximal effect is less prominent later, as there is potential for further development after 18hr before adulthood and the most proximal parts are likely anyway excluded in the analysis.
      2. I think there needs to be a mention and some discussion of the fact that the wing is not really flat. While it starts out very flat at 72h, by 96h and beyond, there is considerable curvature in the pouch that may affect measurements of the different axes and of cell shape. It is not actually specified in the methods, so I assume the measurements were taken using a 2D projection. It is not clear whether the curvature of the pouch was taken into account, either for the cell shape measurements presented in Fig 4 or for the wing pouch dimensional analysis shown in Fig 3, 6, and supplements. Do perturbations in Ft/Ds affect this curvature? Are the pouches more or less curved in one or both axes? Such a change could affect the results and conclusions. The extent to which the fat/ds mutants fold properly is another important consideration that is not mentioned. For example, maybe the folds are deeper and contain more material in the ds/fat mutants, and that's why the pouch is a different shape? At the very least, this point about the 3D nature of the wing disc must be raised in discussion of the limitations of the study. For the cell shape analysis, you can do a correction based on the local curvature (calculated from the height map from the projection). For the measurement of A/P, D/V axes of the wing pouch, best would be to measure the geodesic distance in 3D, but this is not reasonable to suggest at this point. One can still try to estimate the pouch height/curvature, however, both in wild type and in fat/ds mutants.
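
      To make the correction suggested in point 2 concrete, here is a minimal sketch, not the authors' pipeline. It assumes the projection step yields a height map h(x, y) sampled on the image grid; the function name `local_area_correction` and the `pixel_size` parameter are hypothetical. A patch of projected area dA on a surface z = h(x, y) has true area dA * sqrt(1 + h_x^2 + h_y^2), so this factor could in principle be used to rescale cell areas measured in 2D. Surface slopes are estimated with simple finite differences.

```python
import math


def local_area_correction(height, pixel_size=1.0):
    """Per-pixel factor sqrt(1 + hx^2 + hy^2) for a height map.

    height: list of rows (lists of floats), in the same units as pixel_size.
    Multiplying a projected (2D) area by this factor estimates the area
    on the curved surface z = h(x, y). Hypothetical helper for illustration.
    """
    n_rows, n_cols = len(height), len(height[0])

    def slope(get, i, n):
        # Central difference in the interior, one-sided at the edges.
        if n == 1:
            return 0.0
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        return (get(hi) - get(lo)) / ((hi - lo) * pixel_size)

    factors = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            hx = slope(lambda k: height[r][k], c, n_cols)  # slope along x
            hy = slope(lambda k: height[k][c], r, n_rows)  # slope along y
            row.append(math.sqrt(1.0 + hx * hx + hy * hy))
        factors.append(row)
    return factors
```

      A flat region gives a factor of 1 (no correction), while a region tilted at 45 degrees along one axis gives sqrt(2), so cell areas measured there in projection would be underestimated by about 30%. Anisotropy measurements would need a fuller treatment (for example, projecting onto the local tangent plane), which this sketch does not attempt.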

      Minor comments:

      1. The analysis of the laser ablation is not really standard - usually one looks at recoil velocity or a more complicated analysis of the equilibrium shape using a model (e.g. Shivakumar and Lenne 2016, Piscitello-Gomez 2023, Dye et al 2021). One may be able to extract more information from these experiments - nevertheless, I doubt the conclusions would change, given that there seems to be a pretty clear difference between wt and ds (OPTIONAL).
      2. Figure 7G: I think you also need a statistical test between RNAi-ds and UAS-rokCA+RNAi-ds.
      3. In the discussion, there is a statement: "However, as mutation or knock down of core PCP components, including pk or sple, does not affect wing shape... 59." Reference 59 is quite old and as far as I can tell shows neither images nor quantifications of the wing shape phenotype (not sure it uses "knockdown" either - unless you mean hypomorph?). A more recent publication Piscitello-Gomez et al Elife 2023 shows a very subtle but significant wing shape phenotype in core PCP mutants. It doesn't change your logic, but I would change the statement to be more accurate by saying "mutation of core PCP components has only subtle changes in adult wing shape"

      Referee cross-commenting

      Reviewer2:

      Reviewer 2 makes the statement: "The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing."

      I disagree - the DV boundary wraps around the entire margin of the adult wing (as correctly drawn with the pink line in Fig 2A). It is not the same as the wide axis of the adult wing (perpendicular to the AP boundary). It is not trivial to map the proximal-distal axis of the larval wing to the proximal-distal axis of the adult, due to the changes in shape that occur during eversion. Thus, I find it much easier to look at the exact measurement that the authors make, and it is much more standard in the field, rather than what the reviewer suggests. Alternatively, one could I guess measure in the adult the ratio of the DV margin length (almost the circumference of the blade?) to the AP boundary length. That may be a more direct comparison. Actually the authors leave out the term "boundary" - what they call AP is actually the AP boundary, not the AP axis, and likewise for the DV - what they measure is DV boundary, but I only noticed that in the second read-through now. Just another note, these measurements of the pouch really only correspond to the very distal part of the wing blade, as so much of the proximal blade comes from the folds in the wing disc. Therefore, a measurement of only distal wing shape would be more comparable.

      Reviewer 2 states that the authors cannot definitively conclude anything about mechanical tension from their reported cutting data because the authors have not looked at initial recoil velocity. I strongly disagree. The wing disc tissue is elastic on much longer timescales than what's considered after laser ablation (even hours), and the shape of the tissue after it equilibrates from a circular cut (1-2 min) can indeed be used to infer tissue stresses (see Dye et al eLife 2021, Piscitello-Gomez et al eLife 2023, Tahaei et al arXiv 2024). In the wing disc, the direction of stresses inferred from initial recoil velocity is correlated with the direction of stresses inferred from analysing the equilibrium shape after a circular cut. Rearrangements, a primary mechanism of fluidization in epithelia, do not occur within 1 min. Analysing the equilibrium shape after circular ablation may be more accurate for assessing tissue stresses than initial recoil velocity - in Piscitello-Gomez et al 2023, the authors found that a prickle mutation (PCP pathway) affected initial recoil velocity but not tissue stresses in the pupal wing. Such equilibrium circular cuts have also been used to analyze stresses in the avian embryo, where they correlate with directions of stress gathered from force inference methods (Kong et al Scientific Reports 2019). The Tribolium example noted by the reviewer is on the timescale of tens to hundreds of minutes - much longer than the timescale of laser ablation retraction. It is true that the analysis of the ablation presented in this paper is not at the same level as in those other cited papers and could be improved. But I don't think the analysis would be improved by additional experiments doing timelapse of initial retraction velocity.

      Reviewer 2 states "If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges." Not true in this case. Myosin II accumulates along long boundaries (Legoff and Lecuit 2013). "Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult." Agreed - but this is well beyond the scope of this manuscript. The authors clearly show that there is a change of cell shape, at least in these two regions. It would be better to quantify it throughout the pouch and across multiple discs. A similar point applies to the myosin quantifications - yes, polarity would be interesting and possible to look at in these data, and it would be better to do so on multiple discs, but the lack of overall myosin on the junctions shown here is not nothing. Interpreting what Ft/Ds does to influence tension and myosin and eventually tissue shape is a big question that's not answered here. I think the authors do not claim to fully understand this though, and maybe further toning down the language of the conclusions could help.

      Reviewer 3:

      I agree with many of the points raised by Reviewer 3, in particular that relevant for Fig 1. The additional experiments looking at myosin II localization and laser ablation in the other perturbations (Hippo and Rok mutants/RNAi) would certainly strengthen the conclusions.

      Significance

      I think the work provides a clear conceptual advance, arguing that the Ft/Ds pathway can influence mechanical stress independently of its interaction with Hippo and growth. Such a finding, if conserved, could be quite important for those studying morphogenesis and Fat function in this and other organisms. For this point, the genetic approach is a clear strength. Previous work in the Drosophila wing has already shown an adult wing phenotype for Ft/Ds mutations that was attributed to its role in the larval growth phase, as marked clones show aberrant growth in mutants. The novelty of this work is the dissection of the temporal progression of this phenotype and how it relates to Hippo and myosin II activation. It remains unclear exactly how Ft/Ds may affect tissue tension, except that it involves a downregulation of myosin II - the mechanism of that is not addressed here and would involve considerably more work. I think the temporal analysis of the wing pouch shape was quite revealing, providing novel information about how the phenotype evolves in time, in particular that there is already a phenotype quite early in development. As mentioned above, however, the lack of consideration of the wing disc as a 3D object is a potential limitation. While the audience is likely mostly developmental biologists working in basic research, it may also interest those studying the pathway in other contexts, including in vertebrates given its conservation and role in other processes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Overall Response.

      We would like to thank the reviewers for their analysis of the manuscript. From their comments it is clear that our manuscript was not clear. We completely rewrote the manuscript to focus on the central question: how does Adam13 regulate gene expression in general, and TFap2a in particular, leading to the expression of Calpain8, a protein required for CNC migration?

      The following model will be the central line of our story. It will address all of the proteins involved and the mechanistic evidence that links Adam13 to one of its proven effector targets, Calpain8.

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)):*

      *In this manuscript, Pandey et al. show that the ADAM13 protein modulates histone modifications in cranial neural crest and that the Arid3a protein binds the Tfap2a promoter in an Adam13-dependent manner and has promoter-specific effects on transcription. Furthermore, they show that the Adam13 and human ADAM9 proteins associate with histone modifiers as well as proteins involved in RNA splicing. Although the manuscript is mostly clearly written and the figures well assembled, it reads like a couple of separate and unfinished stories.*

      I believe that our story line was not clear and that the overarching question was not well stated. We have made every effort to change this in the revised manuscript. I would like to include a figure that explains the story.

      In short:

      1 We knew that Adam13 could regulate gene expression in CNC via its cytoplasmic domain.

      2 We also knew that this required Adam13 interaction with Arid3a and that a direct target was the transcription factor TFAP2a, which in turn regulates functional targets that we had identified, including the protocadherin PCNS and the protease Calpain8.

      Our goal was to understand the mechanism allowing Adam13 to regulate gene expression.

      3 The first part of this manuscript shows how Adam13 modulates histone modification in vivo in the CNC, globally as well as specifically on the Tfap2a promoter. This results in open chromatin.

      4 Using ChIP we show that Adam13 and Arid3a both bind to the Tfap2a promoter and that Arid3a binding to the first ATG depends on Adam13.

      5 Using a luciferase reporter we show that both Adam13 and Arid3a can induce expression at the first ATG.

      *They show using immunocytochemistry and qPCR that ADAM13 knockouts in CNCs affect histone modifications. Here ChIP-seq or Cut-n-Run experiments would be more appropriate and would result in a more comprehensive understanding of the changes mediated.*

      I agree, but we did not have the funds and I now have nobody working in the lab to do this experiment. These data are also likely to overlap with the RNAseq data that we have and would simply add more open leads. We chose to go after the only direct target that we know, which is TFAP2a, and to focus on this gene to understand the mechanism.

      We believe that the ChIP-PCR experiments are sufficient for this story.

      *The immunohistochemistry assays should at least be verified further using western blotting or other more quantitative methods.*

      Immunofluorescence with statistical analysis is a valid quantification method. Western blot of CNC explants is not trivial and requires a large amount of material. Given the small overall change, we also would not expect to be able to detect the change over the noise of western blot. The ChIP-PCR confirms our finding in a completely independent manner.

      *The authors then show that ADAM13 interacts with a number of histone modifiers such as KDM3B, KDM4B and KMT2A but strangely they do not follow up this interesting observation to map the interactions further (apart from a co-ip with KMT2A), the domains involved, the functional role of the interactions or how they mediate the changes in chromatin modifications. *

      We selected KMT2A because it is expressed in the Hek293T cells. KMT2D has been shown to regulate CNC development in Xenopus and is responsible for Kabuki syndrome in humans. We used AlphaFold to predict interactions and found that Adam13 interacts with the SET domain. In addition, we see multiple SET domain-containing proteins in our mass spec data. The mass spec is done on human Hek293T cells, which express a subset of KMT proteins. We now include evidence that Adam13 interacts with the KMT2D SET domain (new figure 5D).

      The authors then show that ADAM13 affects expression of the TFAP2a gene in a promoter specific manner - affecting expression from S1 but not S2.

      It is the S1 but not S3. Adam13 has no effect on S2.

      *They further show that ADAM13 affects the binding of the Arid3 transcription factor to the S1-promoter but not to the S3 promoter. However, ADAM13 was present at both promoters. Absence of ADAM13 resulted in increased H3K9me2/3 and decreased H3K4me3 at the S1 promoter whereas only H3K4me3 was changed at the S2 promoter.*

      S3, not S2.

      *Unfortunately, they do not show how this is mediated or through which binding elements this takes place. Why is ADAM13 present at both promoters but only affects Arid3 binding at S1?*

      We agree this is a very interesting question that could be the subject of an entire publication. Promoter deletion and mutation to identify which sites are bound and modulated by Adam13/Arid3a is not trivial.

      *The authors claim that transfecting Arid3a and Adam13 together further increases expression from a reporter (Fig 4E) but this is not true as no statistical comparison is done between the singly transfected and double transfected cells. *

      This is correct; with both there is a small increase that is not significant. The fact that both proteins can induce the promoter suggests, but does not prove, that they can have additive roles. The loss-of-function experiment shows that the human Arid3a expressed in Hek293T cells is important for the Adam13-mediated increase of S1. It is possible that the dose of the endogenous Arid3a is sufficient to get full activity of Adam13.

      *Then the authors surprisingly start investigating association of proteins with the two isoforms of TFAP2a which in the mind of this reviewer is a different question entirely.*

      We agree and have removed this part of the manuscript.

      *They find a number of proteins involved in splicing. And the observation that ADAM13 also interacts with splicing factors is really irrelevant in terms of the story that they are trying to tell. Transcription regulation and splicing are different processes and although both affect the final outcome, mRNA, they need to be investigated separately. The link is at least not very clear from the manuscript. Again, the effects on splicing are not further investigated through functional analysis and as presented the data presented is too open-ended and lacking in clarity. *

      We agree that, besides the different activities of the TFap2a isoforms, the rest of the splicing regulation could be a separate study. We were interested in understanding how these two isoforms could activate Calpain8 so differently, which is why we looked at LC/MS/MS. We have removed this part of the story from the manuscript.

      *Additional points: 1. In the abstract they propose that the ADAMs may act as extracellular sensors. This is not substantiated by the results.*

      As Adam13 is an extracellular protein that translocates into the nucleus, this is a possibility that we propose, but I agree this is not investigated in this manuscript. We will modify the text.

      *2. Page 5, line 16: what is referred to by 6 samples 897 proteins? Were 6 samples analyzed for each condition? The number of repeats for the mass spec analysis is not clear from the text nor are the statistical parameters used to analyse the data. This is also true for the mass spec presented in the part on TFAP2aL-S1 and Adam13 regulate splicing. Statistics and repeats are not presented.*

      In general we provide biological triplicates and use the statistical function of Scaffold to identify proteins that are significantly enriched or absent in each sample.

      When we specify 6 samples it means that 6 independent protein samples were analyzed and used for our statistics. We use the Scaffold t-test with a p value less than 0.05. Peptides were identified with 95% confidence and proteins with 99% confidence.

      *3. Page 6, line 19: set domain should be SET domain.*

      Yes.

      *4. The number of repeats in the RNA sequencing of the CNCs is not clear from the text.*

      Three biological replicates (different batches of embryos from different females).

      *5. The explanation of Figure C is a bit lacking. There are two forms of TFAP2a, L and S, but only one is presented in the figure. Do both forms have the extra S1-3 exons? Also, at the top of the figure it is not clear that the boxes are part of a continuous DNA sequence. Also, it is not clear which codon is not coding.*

      Xenopus laevis is pseudotetraploid, giving in most cases L and S genes in addition to the 2 alleles from being diploid. The TFAP2a gene structure is conserved between both alloalleles and is similar to the human gene. For promoter analysis and ChIP-PCR we chose one of the alloalleles (L), given that the RNAseq data showed that both genes and variants behave the same in response to Adam13. This only becomes important in loss-of-function experiments, in which both the L and S versions need to be knocked down or knocked out.

      * In the sashimi plot there are green and pink shaded areas. What do they denote? What exactly is lacking in the MO13 mutant - seems that a particular exon is missing suggesting skipping?*

      MO13 is a morpholino that blocks the translation of Adam13 (already characterized, with >90% of the protein absent) but does not affect Adam13 mRNA expression.

      *7. Page 11, line 9: „with either MbC or MbC and MO13" needs to be rephrased.*

      Will do.

      *8. Page 11, line 19: „the c-terminus of....and S3) and" should be „the C-terminus of...and S3 and".*

      *9. Page 15, line 10: substrateS*

      *10. Page 16, line 23: the sentence „increases H3K9 to the promoter of the most upstream" needs revision.*

      *11. Page 26, line 12: Here the authors say: „for two samples two-tail unpaired". What does this mean? Statistics should not be performed on fewer than three samples. In legend to Figure 6 it indicates that T-test was performed on two samples.*

      *12. The discussion should be shortened and simplified.*

      *13. Figure 1 legend. How many images were quantitated for each condition?*

      At least 3 images per condition, for 3 independent experiments (9 images per condition).

      *14. Figure 2 has a strange order of panels where G is below B.*

      *15. Figure 6 legend, line 12. „proteins that were significantly enriched in either of the 2 samples" is not very clear. What exactly does this mean?*

      Reviewer #1 (Significance (Required)):

      *If the authors follow up on either the transcription-part of the story, or the splicing part of the story, they are likely to have important results to present. However, in the present format the paper is lacking in focus as both issues are mixed together without a clear end-result.*

      We have entirely changed the paper according to these comments.

      *Reviewer #2 (Evidence, reproducibility and clarity (Required)):*

      *Pandey et al. seek to determine the function of ADAM13 in regulating histone modifications, gene expression and splicing during cranial neural crest development. Specifically, the authors tested how Adam13, a metalloprotease, could modify chromatin by interaction with Arid3a and Tfap2a and RNA splicing and gene expression. They then utilize knockouts in Xenopus and HEK293T cells followed by immunofluorescence, IPs, BioID, luciferase assays, Mass spec and RNA assays. Although there is some strong data in the BioID and luciferase experiments, the manuscript tells multiple stories, linking together too many things to make a compelling story. The result is a paper that is very difficult to read and whose take home message is hard to understand. In addition, some of the conclusions are not supported by the data. This unfortunately means it is not ready for publication. However, I have added below some suggestions that would strengthen the manuscript. My comments are below:*

      Clarity is clearly an issue here. The new version is entirely re-written.

      Here is the take home message:

      We knew that Adam13 could regulate gene expression via its cytoplasmic domain. One of the key targets was identified as Calpain8, a protein critical for CNC migration. We subsequently showed that Adam13 and Arid3a regulated Tfap2a expression which in turn regulated Calpain8.

      In this manuscript we investigated 1) how Adam13 regulates TFAP2a and 2) how Tfap2a controls Calpain8 expression.

      The take home message is that Adam13 binds to histone methyltransferases and changes the histone methylation code overall in the CNC and in particular at the TFAP2a promoter. This results in more open chromatin. We further find that Adam13 binds to the Tfap2a promoter in vivo and is important for Arid3a binding to the first start. Tfap2a isoforms that include this N-terminal sequence regulate Capn8 expression.

      *Major comments: 1. I think it would be better to split out the chromatin modification function from the splicing in two separate papers. While there is a connection, having it all together makes the story difficult to follow.*

      Agreed, but I believe that the S1 vs S3 story of Tfap2a is important for the overall story. The new paper does not emphasize splicing.

      *2. The immunofluorescence of H3K9me2/3, in Figure 1, 2, 3 following Adam13 knockdown is not convincing. There seems to be a strong edge effect especially in Figure 2 and 3.*

      The statistical analysis shows that the results, while modest, are significant (three independent experiments using 3 different females and 3 explants for each condition were analyzed). The edge effect observed is eliminated by the mask that we use, which normalizes the expression to either DAPI or Snai2. The edge effect is seen in both control and KD as well. These results are further confirmed by the ChIP-PCR on one direct target.

      Similarly the Arid3a expression in Supp Figure 1 if anything seems increased.

      We have previously shown that Arid3a expression is not affected by Adam13 KD (Khedgikar et al). Our point here is simply that the difference in Tfap2a cannot be explained by a decrease in Arid3a expression. It is not a critical figure and was eliminated in the new manuscript.

      *It would be better to quantify by western blot and not by fluorescent intensity since it is difficult to determine what a small change in fluorescent intensity means in vivo. *

      Not all antibodies used here work by western blot and the quantity of material required for western blot is much larger than IF. Given the small overall changes and the variability observed in Western blot it is not a viable alternative.

      IF is a quantitative method that has been used widely to assess increases or decreases in protein level or post-translational modification. The fact that the same post-translational modification that we see in cranial neural crest explants can also be seen by ChIP-PCR on the Tfap2a promoter confirms this observation.

      *Also, it does not say in the text or the figure legend what these are, Xenopus explants of CNC? *

      These are CNC explants. It is now clearly stated in the figure legend.

      *3. The rationale for isolating KMT2A from the other chromatin modifiers in the dataset is not clear.*

      The new manuscript clarifies that point. Because we are using Hek293T cells in this assay, which are derived from human embryonic kidney rather than Xenopus cranial neural crest cells, we are not interested in a specific protein but rather in a family of proteins that can modify histones (KMT and KDM). Our rationale is that if Adam13 can bind to KMT2 via the SET domain, it is likely to interact with the KMT2 proteins that are expressed in the CNC. KMT2A and D are expressed in the CNC. This is why we selected KMT2A here (Hek293T). We now include a co-IP with the SET domain of Xenopus KMT2D (new figure 5D).

      From the RNA-seq in Supp Figure 2 it is not changed as much as likely some of the others.

      The new manuscript addresses this point. We did not show or expect that the loss of Adam13 would affect mRNA expression of Kmt2.

      *Also, the arrow seems to indicate that it is right above the cutoff. What about other proteins with ATPase activity? That is the top hit in the Dot plot nuclear function. Would be helpful to write out Adam13 cytoplasm/nucleus here. *

      We have used another set of proteomics data that does not include the cytoplasmic/nuclear extract to simplify the results. We hope that the changes make it more obvious.

      Given that we are looking at chromatin remodeling enzymes here, we did not choose to investigate the ATPases further in this report. This is such a wide category that it could lead us away from the main story here.

      *4. The splicing information, while interesting would be better as a different manuscript. The sashimi plot requires more explanation as written.*

      We agree and think that a simple representation of the fold change of the different isoforms is more obvious. It is now a minor part of figure 1 and the legend has been improved to describe the method here.

      How do you tell if the interactions are changed from this?

      I do not understand this question. The sashimi plot indicates the read-through from the mRNA that goes from one exon to the next, quantifying the specific exon usage. It can therefore be quantified and compared between different conditions.

      *The authors argue there is a reduction of Tfap2a in Figure 3H but half the explant is not expressing sox9 in the Adam13 knockdown. How is this kind of experiment controlled when measuring areas that don't have any fluorescence because of the nature of the explants?*

      We have removed this figure as we had already shown previously by western blot that Tfap2a protein decreased in MO13 embryos. As noted on the histogram, the fluorescence is only measured in Sox9 positive cells in each explant (three independent experiments with 3 explants for each). We have also seen a decrease by western blot and mRNA expression (both RNAseq and real-time PCR). In most of our explants, the vast majority of the cells are positive for Snai2 and Sox9, while those that are negative are positive for Sox3 (data not shown here). There is always less signal in the center of the explant, possibly due to the penetration of antibody or interference with the signal by the cell pigment or yolk autofluorescence. Our control explants show the same effect, so our quantification is valid.

      *5. The use of a germ line Xenopus mutant for Adam13 is great but how were these knockouts validated?*

      All of the KO were validated by sequencing, RNAseq and protein expression. These are now included in the supplemental figure 1.

      *More information is required here. The Chip-qPCR has a lot of variability between the samples, especially in the H3K9me2/3. *

      All ChipPCR were performed on Xenopus embryos. The variability is tested by statistical analysis and is either significant or not.

      Because these are in cell lines, this should be more consistent.

      They are not in cell lines but in Xenopus embryos.

      *In addition, it is difficult to understand what this means for cranial neural crest cells when assaying in HEK293T cells with the luciferase assay.*

      We use the luciferase assay in Hek293T cells to test whether a Xenopus protein can induce a specific reporter (gain of function). We also use luciferase reporters in Xenopus to test whether they can perceive the loss of a specific protein (for example Adam13).

      Our results show that Adam13 or Arid3a expression in Hek293T cells can induce the TFAP2S1 reporter.

      *6. The migration assay shows only an example of what it looks like to have defective migration. But it would be better to show control embryos, embryos with Adam13 knockdown and what the rescues look like so the reader can make their own conclusion.*

      We can certainly include this but have published this assay in multiple publications before. The picture is a single example; the histogram shows the statistical validation.

      *The argument from the section above suggests the S1 isoform is the primary one but S3 in this assay also rescues, please explain what this result means since it seems to suggest that even though these isoforms have different activity the function is similar in terms of the ability to rescue defective migration.*

      The result in Hek293T cells shows that only TFAP2aS1 can induce Calpain8, while both S1 and S3 can partially rescue CNC migration in embryos lacking Adam13. The issue here is the dose of mRNA injected for each variant might be too high. Adam13 proteolytic activity is also critical, so we do not expect a complete rescue. The fact that S1 is significantly better at rescuing than S3 is relevant here. It is possible that if we were to decrease the dose of each mRNA we would find one in which S3 no longer rescues but S1 does.

      * The next section again talks about yet another protein Calpain-8. Here the authors use MO13 for luciferase assays instead of HEK293 cells. The authors do not explain why they decided to switch from cells to MO.*

      Calpain8 is one of the validated targets of Adam13 that can rescue CNC migration (Cousin et al Dev Cell). We use the luciferase reporter corresponding to the Xenopus Capn8 reporter to show in vivo that loss of Adam13 reduces its expression (similar to the Capn8 gene). We then went in vitro, using Hek293T cells for a gain-of-function experiment that shows that only the TFAP2aS1 variant can induce it while S3 does not.

      We hope that the graphical summary and the new manuscript make this clear.

      *8. The experiment to IP RNA supports only the correlation that Adam9 and Adam13 bind RNA and RNA binding proteins to regulate splicing. This conclusion presented is not supported by the data presented here. While there is a sentence about why Adam9 was chosen here, it would be preferred to focus on Adam13 as the rest of the manuscript is focused on Adam13. The conclusions are generalized to all ADAMs, but ADAM13 and ADAM9 are the only ADAMs investigated in the manuscript*

      This figure is no longer included. For each of the protein classes that we identify by mass spec we try to find a validation. RNA-IP is simply a validation that Adam13 and Adam9 can bind to complexes that include RNA in a cytoplasmic domain-dependent fashion. The conclusion that Adam13 and possibly ADAM9 might be involved in regulating splicing is based on the findings 1) that the proteins associated with Adam13 include multiple splicing factors, 2) that the RNAseq analysis shows abnormal splicing in CNC missing Adam13 and 3) that the form of TFAP2a induced by Adam13 (S1) associates significantly more with splicing factors than the S3 isoform.

      We agree that the generalization to other ADAMs is not demonstrated here but only suggested. We selected ADAM9 and ADAM19 because we have shown that they can each rescue Adam13 function in the CNC. Unfortunately there is no ADAM19 antibody on the market that works by IP. We have tested multiple companies and multiple cell lines.

      We believe that the ADAM9 experiment is critical to show that the proteins associated with Adam13 are not simply the result of overexpressing a protein from a different species, since ADAM9 is the endogenous protein.

      Minor comments

      *1. The manuscript uses a lot of abbreviations (PCNS, NI, MO, SH3) and lingo that are unclear to a general reader. Please define acronyms when first used, as well as be clear on which model is being used in each experiment.*

      We have corrected this.

      *2. Similarly, the figures are not labeled such that a reader would be able to understand, i.e. MO13 should be Adam13 knockdown etc.*

      We have corrected this in the legend.

      *3. Please identify the genes on the heatmap and some highlighted genes from volcano plot from the RNA-seq.*

      The volcano plot is from MS/MS, not RNAseq. We have lists of all of the genes and/or proteins corresponding to each figure in tables.

      We now have a figure from the RNAseq, and a subset of genes of interest is shown.

      *4. Why use the flag tag in Figure 5?*

      We used Flag-tagged constructs to immunoprecipitate only the variants and not the endogenous TFAP2a in these experiments. We also used RFP-Flag to eliminate any protein that bound to the tag or the antibody.

      This figure is no longer in the manuscript.

      *5. Is the data in figure 4A-D the same as Supp. Figure 4A-D?*

      These are independent biological replicates of the same experiment.

      *6. Please italicize gene symbols - e.g. "key transcription factors that exemplify CNC, such as the SOX9, FOXD3, SNAI1, SNAI2, and TFAP2 family".*

      We clearly have missed some. We use italics for genes and regular type for proteins, but it might not be clear in the text when we are referring to genes and when to proteins. We will correct this in the rewrite.

      *7. Please review the manuscript for grammatical and typographical errors.*

      We have used all available software, including Word and Grammarly. We will try to improve on the next version.

      **Cross-commenting**

      I think the two reviewers are on the same page on this manuscript.

      Reviewer #2 (Significance (Required)):

      If more solid, would be a conceptual advance in role of Adam13 in mediating chromatin modification and transcription factors, adds to existing work from this lab, good for a specialized audience, my expertise is in neural crest development, non-mammalian models, epigenetic regulators.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Panday et al. seek to determine the function of ADAM13 in regulating histone modifications, gene expression and splicing during cranial neural crest development. Specifically, the authors tested how Adam13, a metalloprotease, could modify chromatin through interaction with Arid3a and Tfap2a, and affect RNA splicing and gene expression. They then utilize knockouts in Xenopus and HEK293T cells followed by immunofluorescence, IPs, BioID, luciferase assays, Mass spec and RNA assays. Although there is some strong data in the BioID and luciferase experiments, the manuscript tells multiple stories, linking together too many things to make a compelling story. The result is a paper that is very difficult to read, with a take-home message that is hard to understand. In addition, some of the conclusions are not supported by the data. This unfortunately means it is not ready for publication. However, I have added below some suggestions that would strengthen the manuscript. My comments are below:

      Major comments:

      1. I think it would be better to split out the chromatin modification function from the splicing in two separate papers. While there is a connection, having it all together makes the story difficult to follow.
      2. The immunofluorescence of H3K9me2/3, in Figure 1, 2, 3 following Adam13 knockdown is not convincing. There seems to be a strong edge effect especially in Figure 2 and 3. Similarly the Arid3a expression in Supp Figure 1 if anything seems increased. It would be better to quantify by western blot and not by fluorescent intensity since it is difficult to determine what a small change in fluorescent intensity means in vivo. Also, it does not say in the text or the figure legend what these are, Xenopus explants of CNC?
      3. The rationale for isolating KMT2A from the other chromatin modifiers in the dataset is not clear. From the RNA-seq in Supp Figure 2 it is not changed as much as likely some of the others. Also, the arrow seems to indicate that it is right above the cutoff. What about other proteins with ATPase activity? That is the top hit in the Dot plot nuclear function. Would be helpful to write out Adam13 cytoplasm/nucleus here.
      4. The splicing information, while interesting, would be better as a different manuscript. The sashimi plot requires more explanation as written. How do you tell if the interactions are changed from this? The authors argue there is a reduction of Tfap2a in Figure 3H but half the explant is not expressing sox9 in the Adam13 knockdown. How is this kind of experiment controlled when measuring areas that don't have any fluorescence because of the nature of the explants?
      5. The use of a germ line Xenopus mutant for Adam13 is great but how were these knockouts validated? More information is required here. The ChIP-qPCR has a lot of variability between the samples, especially in the H3K9me2/3. Because these are in cell lines, this should be more consistent. In addition, it is difficult to understand what this means for cranial neural crest cells when assaying in HEK293T cells with the luciferase assay.
      6. The migration assay shows only an example of what it looks like to have defective migration. But it would be better to show control embryos, embryos with Adam13 knockdown and what the rescues look like so the reader can make their own conclusion. The argument from the section above suggests the S1 isoform is the primary one but S3 in this assay also rescues, please explain what this result means since it seems to suggest that even though these isoforms have different activity the function is similar in terms of the ability to rescue defective migration.
      7. The next section again talks about yet another protein Calpain-8. Here the authors use MO13 for luciferase assays instead of HEK293 cells. The authors do not explain why they decided to switch from cells to MO.
      8. The experiment to IP RNA supports only the correlation that Adam9 and Adam13 bind RNA and RNA binding proteins to regulate splicing. This conclusion presented is not supported by the data presented here. While there is a sentence about why Adam9 was chosen here, it would be preferred to focus on Adam13 as the rest of the manuscript is focused on Adam13. The conclusions are generalized to all ADAMs, but ADAM13 and ADAM9 are the only ADAMs investigated in the manuscript.

      Minor comments

      1. The manuscript uses a lot of abbreviations (PCNS, NI, MO, SH3) and lingo that are unclear to a general reader. Please define acronyms when first used, as well as be clear on which model is being used in each experiment.
      2. Similarly, the figures are not labeled such that a reader would be able to understand, i.e. MO13 should be Adam13 knockdown etc.
      3. Please identify the genes on the heatmap and some highlighted genes from volcano plot from the RNA-seq.
      4. Why use the flag tag in Figure 5?
      5. Is the data in figure 4A-D the same as Supp. Figure 4A-D?
      6. Please italicize gene symbols - e.g. "key transcription factors that exemplify CNC, such as the SOX9, FOXD3, SNAI1, SNAI2, and TFAP2 family".
      7. Please review the manuscript for grammatical and typographical errors.

      Cross-commenting

      I think the two reviewers are on the same page on this manuscript.

      Significance

      If more solid, would be a conceptual advance in role of Adam13 in mediating chromatin modification and transcription factors, adds to existing work from this lab, good for a specialized audience, my expertise is in neural crest development, non-mammalian models, epigenetic regulators.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Pandey et al. show that the ADAM13 protein modulates histone modifications in cranial neural crest and that the Arid3a protein binds the Tfap2a promoter in an Adam13-dependent manner and has promoter-specific effects on transcription. Furthermore, they show that the Adam13 and human ADAM9 proteins associate with histone modifiers as well as proteins involved in RNA splicing.

      Although the manuscript is mostly clearly written and the figures well assembled, it reads like a couple of separate and unfinished stories. They show using immunocytochemistry and qPCR that ADAM13 knockouts in CNCs affect histone modifications. Here ChIP-seq or Cut-n-Run experiments would be more appropriate and would result in a more comprehensive understanding of the changes mediated. The immunohistochemistry assays should at least be verified further using western blotting or other more quantitative methods. The authors then show that ADAM13 interacts with a number of histone modifiers such as KDM3B, KDM4B and KMT2A but strangely they do not follow up this interesting observation to map the interactions further (apart from a co-IP with KMT2A), the domains involved, the functional role of the interactions or how they mediate the changes in chromatin modifications.

      The authors then show that ADAM13 affects expression of the TFAP2a gene in a promoter specific manner - affecting expression from S1 but not S2. They further show that ADAM13 affects the binding of the Arid3 transcription factor to the S1-promoter but not to the S3 promoter. However, ADAM13 was present at both promoters. Absence of ADAM13 resulted in increased H3K9me2/3 and decreased H3K4me3 at the S1 promoter whereas only H3K4me3 was changed at the S2 promoter. Unfortunately, they do not show how this is mediated or through which binding elements this takes place. Why is ADAM13 present at both promoters but only affects Arid3 binding at S1? The authors claim that transfecting Arid3a and Adam13 together further increases expression from a reporter (Fig 4E) but this is not true as no statistical comparison is done between the singly transfected and double transfected cells.

      Then the authors surprisingly start investigating association of proteins with the two isoforms of TFAP2a, which in the mind of this reviewer is a different question entirely. They find a number of proteins involved in splicing. And the observation that ADAM13 also interacts with splicing factors is really irrelevant in terms of the story that they are trying to tell. Transcription regulation and splicing are different processes and although both affect the final outcome, mRNA, they need to be investigated separately. The link is at least not very clear from the manuscript. Again, the effects on splicing are not further investigated through functional analysis and, as presented, the data are too open-ended and lacking in clarity.

      Additional points:

      1. In the abstract they propose that the ADAMs may act as extracellular sensors. This is not substantiated by the results.
      2. Page 5, line 16: what is referred to by 6 samples 897 proteins? Were 6 samples analyzed for each condition? The number of repeats for the mass spec analysis is not clear from the text nor are the statistical parameters used to analyse the data. This is also true for the mass spec presented in the part on TFAP2aL-S1 and Adam13 regulate splicing. Statistics and repeats are not presented.
      3. Page 6, line 19: set domain should be SET domain.
      4. The number of repeats in the RNA sequencing of the CNCs is not clear from the text.
      5. The explanation of Figure C is a bit lacking. There are two forms of TFAP2a, L and S, but only one is presented in the figure. Do both forms have the extra S1-3 exons? Also, at the top of the figure it is not clear that the boxes are part of a continuous DNA sequence. Also, it is not clear which codon is not coding.
      6. In the sashimi plot there are green and pink shaded areas. What do they denote? What exactly is lacking in the MO13 mutant - seems that a particular exon is missing suggesting skipping?
      7. Page 11, line 9: "with either MbC or MbC and MO13" needs to be rephrased.
      8. Page 11, line 19: "the c-terminus of....and S3) and" should be "the C-terminus of...and S3 and".
      9. Page 15, line 10: substrateS
      10. Page 16, line 23: the sentence "increases H3K9 to the promoter of the most upstream" needs revision.
      11. Page 26, line 12: Here the authors say: "for two samples two-tail unpaired". What does this mean? Statistics should not be performed on fewer than three samples. The legend to Figure 6 indicates that a t-test was performed on two samples.
      12. The discussion should be shortened and simplified.
      13. Figure 1 legend. How many images were quantitated for each condition?
      14. Figure 2 has a strange order of panels where G is below B.
      15. Figure 6 legend, line 12. "proteins that were significantly enriched in either of the 2 samples" is not very clear. What exactly does this mean?

      Significance

      If the authors follow up on either the transcription-part of the story, or the splicing part of the story, they are likely to have important results to present. However, in the present format the paper is lacking in focus as both issues are mixed together without a clear end-result.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      *Reviewer 1, point 1: In general, the statistical analysis is not transparent. The size of the sample, i.e. the number of observations or data points, is never specified. This information is essential for further evaluation of the statistical details.*

      The size of each sample quantified, given as number of ommatidia/number of retinas, is indicated in the figure legends. This must have escaped the attention of reviewer 1, so we have added a sentence in the legend of Fig. 2 to state it more clearly. We think that the figure legends are the best place to put this information for ease of comparison to the figures.

      *Reviewer 1, point 2: To gain a better understanding of chitin deposition, it would be beneficial to have data on Kkv overexpression in cone cells versus outer pigment cells. Does it cause reb/exp-like effects on chitin deposition and corneal lens formation? Furthermore, can the authors rule out the involvement of chitin synthase 2 in chitin matrix formation and the retention of the matrix in kkv knockdowns? *

      We will generate clones of cells that over-express Kkv in either central cells (cone and primary pigment cells) or lattice cells (secondary and tertiary pigment cells), using the same drivers that we used to over-express Reb, and will examine chitin secretion at 54 h after puparium formation (APF) and in adults.

      As there are no available mutations in Chitin synthase 2 (Chs2), we will knock it down with RNAi in all retinal cells using lGMR-GAL4 and look for corneal lens defects. However, we think that Chs2 is unlikely to contribute chitin to the corneal lens, because its expression is restricted to the digestive system, and because kkv knockdown essentially eliminates chitin from the corneal lens.

      *Reviewer 1, point 3: Recent results published by the authors regarding ZP domain proteins, such as dusky-like (dyl), have not been adequately discussed in the context of chitin secretion and Kkv expression, a matter that must be addressed. It has been demonstrated that dyl mutants do not affect Kkv expression, but chitin levels are reduced. Does Dyl exhibit Kkv-like phenotypes? Furthermore, what is the expression of Dyl or Dumpy in Kkv knockdowns? Is there any interaction between the ZP domain protein matrix and the chitin matrix required for lens formation?*

      In dyl mutants, chitin deposition is delayed, but it does accumulate later in development, so the phenotype is different from kkv mutants. We have clarified this in the manuscript (p. 6). To address the other points, we will examine the expression of Dyl and of Dumpy-YFP in mid-pupal and late pupal retinas in which kkv is knocked down in all cells with lGMR-GAL4. The ZP protein matrix is originally deposited before chitin secretion begins, so we will examine whether loss of chitin affects its later maintenance.

      *Reviewer 1, point 4: What is retained in the chitin matrix if chitin is missing in kkv knockdown? Is it the ZP domain matrix (see the above question) or are the chitin matrix proteins also involved, such as Obst-A, Obst-C (Gasp), Knk and others? Obst proteins are particularly essential for the regular packaging of chitin and thus for the formation of the chitin layer, which is shown in Fig. 1. Beyond this story, it would also be interesting to see how the aforementioned chitin matrix proteins (Obst-A, Obst-C (Gasp), Knk and others) impact lens formation. *

      Adult corneal lenses derived from kkv knockdown retinas do not contain chitin, but there is remaining corneal lens material. We do not think that this is the ZP domain matrix, as this is normally lost in late pupal development, but we will check whether Dpy-YFP is retained in kkv knockdown adults. We will try to detect Obst-A and Gasp proteins using available antibodies. However, this may not be successful, as we have found that antibodies do not penetrate the corneal lens well. Our transcriptomic studies have identified numerous secreted proteins that are expressed at high levels in the mid-pupal retina and could be components of the corneal lens. We may be able to detect some of these using fluorescently tagged forms, but it is possible that the currently available tools will not be sufficient to answer this question.

      We have begun to work on how some of these proteins affect corneal lens structure, but this will take a significant amount of time and we think it would work better as a separate manuscript. We see our current manuscript as a short and focused story about the importance of the source of chitin in determining corneal lens shape.

      *Reviewer 1, minor comment 1: Figure 1 is not easily comprehensible for those who are not already familiar with the subject of eye development. Fig. 1A': please label the cone cells and pigment cells.*

      We have labeled these cells in Fig. 1A’’.

      *Reviewer 1, minor comment 2: Fig. 1H - The meaning of the abbreviations and numbers is not given in the legend. It would also be beneficial to include a meaningful cartoon illustrating the corneal lens situation before and after chitin secretion, as shown in Figure 3. *

      We have defined the abbreviations in the figure legend. Fig. 1H did show the corneal lens situation before, during and after chitin secretion, but we have added the cone and pigment cells to the 72 h APF and adult diagrams to make them more meaningful (now Fig. 1I).

      *Reviewer 1, minor comment 3: Fig. 1F: when do the authors recognize a first chitin assembly as initial corneal lens at the eye and what does it look like? Chitin expression is high already at 54h APF, which means 20 hours earlier.*

      We think that the reviewer is asking when the chitin first starts to form a dome shape. We have added an orthogonal view of chitin in a 54 h APF retina viewed with LIGHTNING microscopy, showing that the external curvature is already present at this stage (new Fig. 1F).

      *Reviewer 1, minor comment 4: Page 6 / Fig 2E: cells autonomously synthesize chitin and no lateral diffusion. Please label which lens contains chitin and which not *

      Fig. 2E shows part of a retina in which kkv has been knocked down in all cells, so none of the corneal lenses contain chitin. We have clarified this in the legend to Fig. 2.

      *Reviewer 1, minor comment 5: Page 7: The authors state that reb/exp knockdown affects external and internal curvature. However, Fig. S1 statistics does not support this statement. *

      We were referring to the double knockdown, which Fig. 2L, M show is significant, and not to the single knockdowns quantified in Fig. S1. We have clarified this in the text.

      *Reviewer 1, minor comment 6: Fig.2 and Fig. S1: what is Chp (Chaoptin)? *

      We have stated in the legend to Fig. 2 that Chaoptin is a component of photoreceptor rhabdomeres.

      *Reviewer 1, minor comment 7: Fig. S1E,I: which part of the eye is marked by the chitin staining outside the cone and pigment cells? *

      Chitin is still present in the mechanosensory bristles in Fig.S1I, as these do not express lGMR-GAL4. We have stated this in the figure legend.

      *Reviewer 1, minor comment 8: Fig. 2 L,M, Why do exp/reb show different statistical results at outer angle in exp and reb knockdown when compared with the lGMR driver line, although chitin reduction is eliminated in exp knockdown already from 54h APF onwards?*

      The double knockdown of exp and reb has a more significant effect on the adult corneal lens outer angle than the single exp knockdown, even though the exp knockdown lacks chitin at 54 h APF. We believe that this is because Reb is sufficient for some chitin synthesis at later stages of development. This was mentioned in the text (p. 6) and we have added further clarification in the legend to Fig. S1.

      *Reviewer 1, minor comment 9: Fig 3 G-H: please clarify where the chitin reduction can be observed at the edge of adult corneal lens and provide comparable wt stainings. Fig. S2 D. What was the normalization and the sample number?*

      We have added a high magnification image of a mosaic ommatidium with one wild-type and one kkv knockdown edge, showing the region at the edge of the corneal lens in which chitin fluorescence was quantified and the central region used for the normalization (Fig. 3I). The sample numbers are given in the legend to Fig. S2D.

      *Reviewer 1, minor comment 10: Page 6, last paragraph: I fully agree that ZP domain proteins may retain other corneal lens components. But deeper discussion is missing. It should be noted that the authors' hypothesis fits well with the proposed function of the ZP matrix in providing chitin matrix adhesion to the underlying cell surface. A loss of the ZP domain protein Piopio causes loss of the chitin matrix, as shown recently in trachea and at epidermal tendon cells (Göpfert et al., 2025; https://www.sciencedirect.com/science/article/pii/S1742706125003733). Furthermore, a recent publication identifies ZPD proteins as modular units that establish the mechanical environment essential for nanoscale morphogenesis (Itakura et al., https://www.biorxiv.org/content/10.1101/2024.08.20.608778v1.full.pdf). This should be cited and discussed accordingly.*

      *It could be that outer and inner part of the chitin is different in ultrastructure due to expression pattern. In dragonfly the surface morphology analysis by scanning electron microscopy revealed that the outer part of corneal lenses consisted of long chitin fibrils with regular arrays of papillary structures while the smoother inner part had concentric lamellated chitin formation with shorter chitin nanofibrils (Kaya et al., 2016; https://www.sciencedirect.com/science/article/pii/S0141813016303646?via%3Dihub#fig0020). Thus, an ultrastructural analysis would be very beneficial, or at least a detailed discussion.*

      We have added a discussion of these points and papers to the text (p. 6 and 9). Although we are not specifically addressing differences between the inner and outer parts of the corneal lens in this manuscript, we have now included a high-resolution LIGHTNING image showing how the layered structure of the corneal lens is affected when chitin production by central cells is increased (Fig. 4F).

      *Reviewer 2, point 1: Adult corneal lenses lacking chitin still form a thin structure in kkv RNAi. The authors suggest that this may be due to the presence of the ZP domain proteins Dyl, Dpy and Pio. Immunostaining for these ZP domain proteins could provide supporting evidence. *

      To clarify, we meant to say that the earlier presence of the ZP domain matrix could retain components other than chitin in the corneal lens. The ZP domain proteins are no longer present in the adult. We have made this clearer in the text. As described under reviewer 1, points 3 and 4, we will examine Dyl and Dpy-YFP expression in kkv knockdown retinas at mid-pupal and adult stages, and we will also look at the expression of another ZP domain protein, Piopio.

      *Reviewer 2, minor comment 1: At 50 h APF, Kkv (Fig. 2B, B') and Reb (Fig. S1A, A') appear to be expressed at higher levels in lattice cells than in central cells, even though chitin is mainly present in the central cells at this time (Fig. 1B-B'). Discuss possible explanation for their expression pattern and their roles at this stage. *

      We agree that this is a surprising result. We have added a discussion of possible explanations, such as the lack of another component necessary for chitin secretion in lattice cells at this stage, or the presence of high levels of chitinases (p. 7).

      *Reviewer 2, minor comment 2: Fig. 1F and G: Indicate that the cryosection images represent single ommatidia, and label "external" and "internal" to help orient readers. *

      We have made these changes to the figure panels (now G and H), and indicated in the legend that they are single ommatidia.

      *Reviewer 2, minor comment 3: Figure 2. The cartoon diagram showing the angle measurement (currently Fig S1K) should be moved to the main figure to help readers understand the quantifications. *

      We have moved this diagram to Figure 2L.

      *Reviewer 2, minor comment 4: Figure 3H. It would be helpful to clearly mark the edge of the corneal lens in the chitin intensity image. *

      As described under reviewer 1, minor comment 9, we have added a high magnification picture showing the edge region used for chitin quantification (Fig. 3I), which should also address reviewer 2’s concern.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Ghosh and Treisman demonstrates that localized chitin secretion shapes the Drosophila corneal lens. Building on their previous work showing that zona pellucida domain proteins influence corneal lens architecture, which correlates with a delay in chitin deposition, the authors investigate how chitin and its secretion contribute to defining the lens curvature. Through cell-type-specific RNAi and overexpression experiments combined with beautiful imaging and quantifications, they provide convincing evidence that the central cone and primary pigment cells are the principal sources of chitin in the middle region of the corneal lens. Overall, this study offers strong evidence that localized chitin secretion and restricted diffusion underlie the precise shaping of the corneal lens. I have only one major comment and a few relatively minor suggestions to improve clarity.

      Major comments:

      • Adult corneal lenses lacking chitin still form a thin structure in kkv RNAi. The authors suggest that this may be due to the presence of the ZP domain proteins Dyl, Dpy and Pio. Immunostaining for these ZP domain proteins could provide supporting evidence.

      Minor comments:

      • At 50 h APF, Kkv (Fig. 2B, B') and Reb (Fig. S1A, A') appear to be expressed at higher levels in lattice cells than in central cells, even though chitin is mainly present in the central cells at this time (Fig. 1B-B'). Discuss possible explanation for their expression pattern and their roles at this stage.
      • Fig. 1F and G: Indicate that the cryosection images represent single ommatidia, and label "external" and "internal" to help orient readers.
      • Figure 2. The cartoon diagram showing the angle measurement (currently Fig S1K) should be moved to the main figure to help readers understand the quantifications.
      • Figure 3H. It would be helpful to clearly mark the edge of the corneal lens in the chitin intensity image.

      Significance

      This study provides novel insights into how the differential secretion of a polysaccharide determines the curvature of a complex optical structure. The elegant use of cell-type-specific genetic manipulations, together with high-quality imaging and rigorous quantification is the key strength.

      The study advances our understanding of how chitin secretion and limited diffusion shape apical ECM structures during tissue morphogenesis. It also extends findings from the tracheal and cuticular chitin systems into a new optical context.

      The manuscript will be of interest to developmental biologists, particularly those studying epithelial tissue morphogenesis and apical ECM organization.

      I have expertise in Drosophila epithelial morphogenesis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Chitin plays a crucial role in the morphogenesis of the Drosophila corneal lens by supporting the structural integrity and biconvex shape of the lens. The Drosophila corneal lens is a biconvex structure that focuses light. Chitin, a major component, is produced mainly by the central cone and primary pigment cells. The production and arrangement of chitin by central cells directly impacts the thickness and curvature of the lens. Adequate chitin secretion is necessary to ensure the correct shape and function of the corneal lens, while disturbances in chitin production can lead to deformed lenses. Blocking chitin synthesis leads to a significant reduction in chitin deposition in the corneal lens, resulting in a thinner and deformed lens. In particular, the corneal lens shows reduced outer and inner curvature, which compromises its biconvex shape. These changes in chitin production and arrangement result in abnormal morphology of the corneal lens in the adult stage. The key messages of the paper's results are: 1.) The Drosophila corneal lens is a biconvex structure that focuses light. 2.) Chitin, a significant component, is produced mainly by central cells (cone and primary pigment cells). 3.) Downregulation of the chitin synthase gene Krotzkopf reduces lens thickness and curvature. 4.) Overexpression of Rebuf increases chitin secretion and lens thickness. 5.) Localized chitin secretion is crucial for the typical shape of the corneal lens.

      Comments

      Main comments

      The manuscript provides an exciting insight into how the formation of the lens is regulated by the secretion of chitin. However, the data set appears to have shortcomings that must be considered for the next steps.

      1.) In general, the statistical analysis is not transparent. The size of the sample, i.e. the number of observations or data points, is never specified. This information is essential for further evaluation of the statistical details.

      2.) To gain a better understanding of chitin deposition, it would be beneficial to have data on Kkv overexpression in cone cells versus outer pigment cells. Does it cause reb/exp-like effects on chitin deposition and corneal lens formation? Furthermore, can the authors rule out the involvement of chitin synthase 2 in chitin matrix formation and the retention of the matrix in kkv knockdowns?

      3.) Recent results published by the authors regarding ZP domain proteins, such as dusky-like (dyl), have not been adequately discussed in the context of chitin secretion and Kkv expression, a matter that must be addressed. It has been demonstrated that dyl mutants do not affect Kkv expression, but chitin levels are reduced. Does Dyl exhibit Kkv-like phenotypes? Furthermore, what is the expression of Dyl or Dumpy in Kkv knockdowns? Is there any interaction between the ZP domain protein matrix and the chitin matrix required for lens formation?

      4.) What is retained in the chitin matrix if chitin is missing in kkv knockdown? Is it the ZP domain matrix (see the above question) or are the chitin matrix proteins also involved, such as Obst-A, Obst-C (Gasp), Knk and others? Obst proteins are particularly essential for the regular packaging of chitin and thus for the formation of the chitin layer, which is shown in Fig. 1. Beyond this story, it would also be interesting to see how the aforementioned chitin matrix proteins impact lens formation.

      Minor comments:

      Page 6: Figure 1 is not easily comprehensible for those who are not already familiar with the subject of eye development.

      Fig. 1A': Please label the cone cells and pigment cells.

      Fig. 1H - The meaning of the abbreviations and numbers is not given in the legend. It would also be beneficial to include a meaningful cartoon illustrating the corneal lens situation before and after chitin secretion, as shown in Figure 3.

      Fig. 1F: When do the authors first recognize a chitin assembly as the initial corneal lens in the eye, and what does it look like? Chitin expression is already high at 54 h APF, that is, 20 hours earlier.

      Page 6 / Fig. 2E: Cells synthesize chitin autonomously, with no lateral diffusion. Please label which lens contains chitin and which does not.

      Page 7: The authors state that reb/exp knockdown affects external and internal curvature. However, the statistics in Fig. S1 do not support this statement.

      Fig. 2 and Fig. S1: What is Chp (Chaoptin)?

      Fig. S1E,I: which part of the eye is marked by the chitin staining outside the cone and pigment cells?

      Fig. 2L,M: Why do the exp and reb knockdowns show different statistical results for the outer angle when compared with the IGMR driver line, although chitin reduction is eliminated in exp knockdown already from 54 h APF onwards?

      Fig. 3G-H: Please clarify where the chitin reduction can be observed at the edge of the adult corneal lens and provide comparable wt stainings. Fig. S2D: What was the normalization and the sample number?

      Page 6, last paragraph: I fully agree that ZP domain proteins may retain other corneal lens components, but a deeper discussion is missing. It should be noted that the authors' hypothesis fits well with the proposed function of the ZP matrix in providing chitin matrix adhesion to the underlying cell surface. A loss of the ZP domain protein Piopio causes loss of the chitin matrix, as shown recently in trachea and at epidermal tendon cells (Göpfert et al., 2025; https://www.sciencedirect.com/science/article/pii/S1742706125003733). Furthermore, a recent publication identifies ZPD proteins as modular units that establish the mechanical environment essential for nanoscale morphogenesis (Itakura et al., https://www.biorxiv.org/content/10.1101/2024.08.20.608778v1.full.pdf). This should be cited and discussed accordingly.

      It could be that the outer and inner parts of the chitin differ in ultrastructure due to the expression pattern. In dragonfly, surface morphology analysis by scanning electron microscopy revealed that the outer part of the corneal lenses consisted of long chitin fibrils with regular arrays of papillary structures, while the smoother inner part had a concentric lamellated chitin formation with shorter chitin nanofibrils (Kaya et al., 2016; https://www.sciencedirect.com/science/article/pii/S0141813016303646?via%3Dihub#fig0020). Thus, an ultrastructural analysis would be very beneficial, or at least a detailed discussion.

      Significance

      The manuscript's strength and most important aspects are the genetic, expression, and localization studies of chitin under the control of the chitin synthase kkv, reb and exp in the Drosophila pupal and adult eye. However, beyond this manuscript, the development of mechanistic details, such as interaction partners that trigger secretion and action at the ZP matrix and adjacent apical membranes, will be interesting.

      The manuscript uses nice genetic tools to describe the differences in chitin secretion in the Drosophila eye and their specific impact on corneal lens formation. Such a precise molecular analysis has not been carried out before in insects. Therefore, the study deeply extends knowledge about the role of chitin synthases and chitin secretion in the insect eye.

      The audience will not only be rather specialized researchers in zoology, developmental biology, and cell biology who are interested in how chitin synthases produce chitin. As chitin is also relevant to materials research and to medical and immunological aspects, the manuscript will be interesting beyond the specific field and thus for a broader audience.

      I'm working on chitin in the tracheal system and epidermis in Drosophila.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Overall, the authors present interesting and conclusive work on the activation of ERM proteins upon TBXA2R signaling. The use of the ebBRET biosensor to assess ERM-protein activation enables elegant investigation of activation modalities. The Thromboxane A2 analogue U46619 robustly activates ERM proteins in ebBRET assays and increases ERM-protein phosphorylation status. The functional effects of this signaling pathway are shown convincingly for moesin, which mediates a TBXA2R-mediated increase in cell motility, invasion and metastasis of triple-negative breast cancer Hs578 cells in vitro and in vivo. Nonetheless, some points need to be clarified.

      Significance

      Comment 1: In the title, the authors state that ERM activation via TBXA2R controls invasion and motility of triple-negative breast cancer cells. In the manuscript, there is only data supporting this assumption for moesin (MSN). Therefore, the authors need to change the title accordingly or provide additional experiments for the other two ERM proteins, radixin and ezrin. Throughout the experiments, the p-ERM antibody is used to measure ERM-protein activation. Since the effects on invasion and motility observed in Hs578 cells are mainly mediated through moesin, it would be necessary to see, at least for one experiment per cell line (HEK293T, Hs578), the detailed phosphorylation status of ezrin, radixin and moesin separately. As there are specific phospho-detecting antibodies for this case, this could be done rather easily. Furthermore, showing a specific increase of phosphorylated moesin would support the functional data shown in Figures 5 and 6. To investigate the functional effect of TBXA2R-mediated activation of ezrin and radixin on cell motility and invasion, similar experiments could be done in e.g. HMC-1-8 breast cancer cells (high ezrin expression) and HCC1187 (high radixin expression).

      Comment 2: Figure 1A, C, D: At 100 nM, the concentration of staurosporine is relatively high for kinase inhibition. It would be informative to see the assay with increasing staurosporine concentrations, e.g. from 1 nM to 50 nM. In general, a concentration of 1-10 nM should be sufficient for kinase inhibition while preventing unspecific effects of the drug.

      Comment 3: The citation for the p-ERM antibody is confusing, as there is only p-Moe used in the cited paper (Roubinet, 2011). There is a p-ERM antibody commercially available (Cell Signaling, Phospho-ezrin (Thr567)/radixin (Thr564)/moesin (Thr558) Antibody #3141). Could you clarify which antibody you are using?

      Comment 4: From the inhibitor experiments using C3 transferase toxin (Figure 2), the authors conclude that RhoA plays a role in TBXA2R-mediated ERM activation. As mentioned in the manufacturer's description, C3 toxin inhibits RhoA, RhoB and RhoC. Therefore, it would be necessary to repeat those experiments under RhoA knockdown conditions (e.g. using an siRNA-based approach) to state that specifically RhoA is involved.

      Comment 5: To assess whether the findings in Figures 5 and 6 are due to the higher moesin expression in Hs578 cells or are linked to a specific function of moesin, a re-expression experiment would be informative. To achieve this, the 2D and 3D migration experiments could be redone after re-expression of moesin, ezrin and radixin separately in moesin knockdown conditions.

      Minor comments:

      • Even though U46619 is a known Thromboxane A2 analogue, including negative and positive controls would strengthen the results. In detail, this could be done by showing a known protein which gets phosphorylated downstream of TBXA2R signaling and a protein which is not affected by this signaling pathway alongside the shown effects on ERM-proteins.
      • Figure 1 J: There are no statistics comparing the conditions of SQ-29548-treated cells in the presence/absence of U46619; these should be added.
      • Figure 1 G, H: How was the quantification for cell periphery performed? In detail, how were the thresholds set for cell periphery / not cell periphery?
      • Figure 3 H:
        • The labelling indicating presence of U46619 is missing.
        • Also, what is the rationale behind normalizing MB-453 for 3 cell lines and comparing the BT-549 to MB-157?
      • Suppl. Fig 4 D: Define the y-axis better. Absorbance at what wavelength?
      • Define FERM and ERMAD abbreviations in introduction.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The ezrin, radixin, and moesin (ERM) family of proteins orchestrates morphological changes that potentiate metastatic invasion in cancer cells. In this study, Leguay et al. identify the GPCR TBXA2R as a key activator of the ERM proteins, which promotes motility and invasion in triple-negative breast cancer (TNBC) cells. Using BRET-based sensors that they developed previously for monitoring the activation of ERM proteins, and building upon their previous findings on the role of the small GTPase RhoA in the activation of ERM proteins, the authors carefully dissect the molecular pathway leading to the activation of ERM proteins upon stimulation of TBXA2R. The authors also establish the pathological relevance of the pathway in TNBC using in vitro and in vivo models, opening up possibilities for targeting this pathway in cancer cells. Overall, the study is well-conceived and executed, and the results are clearly described and presented in the manuscript. However, the following comments must be addressed before publication.

      Major comments

      Fig 1C - Why was p-ERM normalized to Ezrin and not ERM? It would be more appropriate and consistent to normalize against the ERM signal, as done in other experiments in the manuscript.

      Fig 1E and S3C - The levels of total ERM also seem to change with increasing treatment times. This must be clarified and discussed in the manuscript.

      Fig 1F - Why is the mean of all three independent experiments not presented here as in S3C?

      Fig 2E - Though SLK seems to play a dominant role in the phosphorylation of ERM in HEK293T cells, the depletion of LOK also substantially reduces the phosphorylation of ERM in the representative figure (Fig 2E), which is not reflected in the quantification (Fig 2F). Indeed, both SLK and LOK seem to be equally crucial in Hs578T cells (Fig 4I), unlike the conclusion here. The authors must check if the quantifications were affected by any white spots in the blot for total ERM as seen in the representative figure. If necessary, the authors must include additional replicates, and the model in Fig 2G should be updated accordingly. If the contributions of LOK are indeed quite minimal in HEK293T cells, then the difference in Hs578T cells must be adequately highlighted and discussed rather than broadly mentioning similar results were observed in both cell lines. The discussion mentions that SLK kinases are the only kinases needed for ERM activation, which conflicts with findings from Hs578T cells, where both SLK and LOK contribute to ERM phosphorylation (Fig 4I). The authors should revise this to reflect their data accurately.

      Minor comments

      Fig S3B should cite the source dataset and not just the database. Also, details of how the extracted data were processed (if at all) should be described clearly.

      When multiple treatments are involved (e.g. U46619 and staurosporine), the exact sequence of treatments and the overlap in timings of different treatments must be clearly mentioned, e.g. in Fig 1A and 1C.

      There are a few grammatical errors which need to be fixed, e.g. paragraph 2 in the second section of the results: "We next aimed to identify" (not "identifying") "which kinase(s) acts downstream of TBXA2R".

      Significance

      Triple-negative breast cancer, which is characterized by a lack of estrogen, progesterone or HER2 receptors, is a highly metastatic and aggressive form of breast cancer with poor prognosis. Currently, there are fewer treatment options than for other types of invasive breast cancer. The current study opens up the possibility of targeting the TBXA2R or the downstream signalling components in TNBC, which are still expressed in TNBC cells. However, certain TNBC subtypes express low levels of p-ERM and TBXA2R (Fig 3E, 3F), indicating a minor role for the TBXA2R pathway, so targeting this pathway in these subtypes may be inefficient. In addition, certain subtypes express high p-ERM and low TBXA2R, indicating alternative pathways for ERM activation. Currently, it is not clear which other GPCRs can contribute to ERM activation by engaging similar downstream effectors. A comprehensive screening of different GPCR antagonists could identify alternative strategies to target ERM-mediated metastasis in TNBC cells that show low expression of TBXA2R.

      Audience: The manuscript is relevant to a broad audience, especially to cell biologists, cancer biologists and clinical scientists.

      The reviewer's field of expertise includes cell signaling, gene expression, and RNA biology in mammalian systems. Moderate expertise in cancer biology. Limited knowledge of histopathological analysis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Leguay et al. present an interesting and logical series of studies that investigate the activity and signaling of the GPCR TBXA2R in TNBC cells. The premise of the overall study is that metastasis is often associated with a more invasive/motile cancer cell phenotype. The investigators have an interest in ERM (ezrin, radixin, moesin) proteins, which have been implicated in cell motility. The authors link stimulation of TBXA2R, a GPCR, to activation of ERM proteins and also show that TBXA2R is associated with worse outcome in TNBC patients. Through the use of genetic and pharmacologic tools, the authors provide convincing biochemical and cell-based data to support their model that stimulation of TBXA2R activates Gαq/11 and Gα12/13, which subsequently stimulate RhoA and SLK/LOK, which then phosphorylate ERMs. The authors show relevant biologic consequences of the pathway. Data include orthogonal assays with similar results, the manuscript is written clearly, and the data are displayed well. Overall, it is a solid story that is largely well done. There are a few comments that should be addressed.

      Comments:

      1. All the biochemical/cell-based in vitro data exploit the use of small molecule agonists of TBXA2R, not the natural ligand. A comment on this and on why use of TXA2 is not feasible would be helpful to the reader.
      2. The data in figures 1-5 are solid and clear. However, I suggest adding a higher magnification inset for the IHC images shown in Fig 3E. It would be useful to be able to distinguish cells in the IHC; a higher mag shot should suffice.
      3. A) The use of Hs578T cells for the in vivo modeling is unfortunate. Additionally, the use of iv injection in a study focused on cell invasion is also unfortunate. The metastatic propensity of Hs578T is not clear; in fact, a recent report comparing metastasis in breast cancer cell lines shows that Hs578T perform poorly in terms of metastasis after orthotopic injection (see PMID 38468326). I searched the literature a bit to try and find other examples of iv injection of Hs578T cells and found 1 (PMID:27654855; I did not search exhaustively). This paper shows significant lung metastasis and does not mention liver metastases. Were other breast cancer cells investigated for the in vivo studies?

      B) I ask because the typical organ that is seeded post iv injection is the lungs (as seen in the above ref); liver metastases post iv injection are not common, especially with breast cancer cells. What did the lungs look like in your experiments?

      C) Further, while the data presented in figure 6 are supportive of the overall conclusions, they are modest at best in terms of metastatic burden. Repetition of the experiment using a breast cancer cell line injected orthotopically would likely be more useful in highlighting the importance of the pathway to metastasis. I understand that performing an orthotopic assay may be outside the scope of the study, but it would provide greater impact given the focus of the paper on cell invasion.

      Cross-commenting

      I think the reviewer comments are generally aligned. I was least critical but appreciate the concerns of the other reviewers, especially rev #1, who requested additional validation and controls. In my opinion, the in vivo studies are not robust; I expect that is due to cell line choice. Repetition of the in vivo study with a breast cancer cell line that is capable of metastasis (from a primary tumor) would be more effective.

      Significance

      The manuscript presents a solid, logical flow and the biochemical/cell based in vitro data are clean. Clear differences between groups, appropriate controls, and displayed effectively.

      The challenge is the in vivo study. IV injection of cancer cells is a valid model for seeding and growing in a target organ, BUT it does not reflect cell invasion, which is typically thought of as a step that occurs earlier in the metastatic cascade. That said, the data are supportive of the conclusions but not necessarily consistent with the results expected from iv injection of this cell line. A caveat is that the cell line used is characterized as having metastatic characteristics in vitro but is not a consistently metastatic line in vivo. The recommendation is to perform a new in vivo experiment. An orthotopic injection of a strongly metastatic cell line, such as MDA-MB-231 or another (see the paper referenced above), would be a more stringent and accurate test of the importance of the pathway to cell invasion in vivo.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This manuscript investigates the role of the thromboxane A2 receptor (TBXA2R) in activating ERM (ezrin, radixin, and moesin) proteins to promote cell motility and invasion in triple-negative breast cancer (TNBC) cells. Using TBXA2R stimulation and a series of in vitro and in vivo experiments, the authors report that ERM activation is mediated through a TBXA2R signaling pathway involving Gαq/11 and Gα12/13 subunits, RhoA, and SLK/LOK kinases. They propose that this pathway enhances cell migration, invasion, and metastatic potential in TNBC.

      General criticisms

      Experimental design and analyses are adequate, even though certain experiments lack appropriate controls or employ the wrong statistical tests. However, the study primarily relies on a single TNBC cell line and heavy use of overexpression systems and/or small molecule inhibitors, raising concerns about the generalizability and specificity of the findings. Furthermore, several conclusions appear premature and unsupported by the current data. Critical controls and additional validation experiments are necessary to support the claims about the role of TBXA2R in metastasis and to justify the strong mechanistic conclusions drawn.

      Specific criticisms

      Figure 1

      TBXA2R expression should be shown to understand whether different ebBRET signals are dependent on the overexpression levels of TBXA2R.

      E-F: As ERM levels change over time, one would like to understand whether this is due to misloading or whether there is an underlying biological event going on in the stimulated cells. Are total ERM levels really changing over time? Please add a blot for 1-2 housekeeping proteins as loading controls. This is also crucial to clarify the kinetics of ERM activation; such notable intensity variations make quantifications of non-linear WB signals not fully reliable. In F, mean and SD should be plotted.

      G: The authors need to use a PM marker if they want to claim that pERM increases at the cell cortex. TBXA2R localization should also be shown.

      Figure 2

      A: This reviewer cannot see the purported partial inhibition in Gα12/13 KO cells. Are differences between the two KOs significant? Furthermore, there are reports indicating that YM-254890 may not be specific for Gαq. Experiments on double KO cells are needed to assess the possible redundancy between the two Gα subfamilies.

      C-D: It is important to add a positive control for the activity of Y-27632 in these experiments. Please show that a ROCK-dependent effect is inhibited in the treated cells.

      G: The working model is premature as it is unknown whether ROCKi was active. While asking for ROCK1/2 KO cells would be too much, this claim is far-fetched.

      Figure 3

      B: In the legend, it is not clear what the grey and light red colours mark.

      E-F: This reviewer finds it difficult to believe that p-ERM and TBXA2R signal intensities at the cell cortex could be reliably quantified using IHC images. The representative samples would indicate that p-ERM and TBXA2R positivity are not correlated. It would be crucial to show examples of each of the TNBC subgroups whose existence is inferred from p-ERM and TBXA2R staining. The conclusion that "no TNBC samples exhibited high TBXA2R expression and low levels of p-ERMs, further supporting a role for TBXA2R signalling in ERM activation in TNBC" is an overstatement.

      Figure 4

      The authors wrote that "We focused on the Hs578T cell line, which showed a median level of TBXA2R mRNA expression among the six TNBC cell lines tested". I do not understand the rationale for this: anti-TBXA2R antibodies detecting endogenous TBXA2R are available, so why not use the median protein levels?

      Figure 5

      Effects of the knockouts are subtle, and rescue experiments would be needed to corroborate these results. The employed statistical analysis is prone to overestimating differences. The authors should use the superplots instead. The authors might also decide to use other TNBC cell lines to explore the functional relevance of this pathway in BC progression. This is particularly important because Hs578T are poorly tumorigenic, and they often do not form palpable tumours in mice.
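
      The concern about overestimated differences can be illustrated with a minimal sketch, assuming hypothetical migration speeds from three biological replicates per condition (none of these numbers come from the manuscript). It compares Welch's t statistic computed on pooled single-cell values with the same statistic computed on per-replicate means, which is the comparison a superplot-style analysis is built around. The function names are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_means(cells_by_replicate):
    """Collapse per-cell measurements to one mean per biological replicate."""
    return [mean(cells) for cells in cells_by_replicate]


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = stdev(a) ** 2 / len(a)  # squared standard error of group a
    vb = stdev(b) ** 2 / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical speeds (arbitrary units): 3 replicates x 4 cells per condition.
control = [[10.0, 12.0, 11.0, 13.0], [14.0, 15.0, 13.0, 16.0], [9.0, 10.0, 11.0, 10.0]]
knockout = [[9.0, 10.0, 11.0, 10.0], [12.0, 13.0, 12.0, 13.0], [8.0, 9.0, 9.0, 10.0]]

# Pooling treats every cell as an independent sample (n = 12 per group).
pooled_control = [x for rep in control for x in rep]
pooled_knockout = [x for rep in knockout for x in rep]
t_pooled, df_pooled = welch_t(pooled_control, pooled_knockout)

# Replicate-level analysis uses one value per biological replicate (n = 3 per group).
t_reps, df_reps = welch_t(replicate_means(control), replicate_means(knockout))

print(f"pooled cells:    t = {t_pooled:.2f}, df = {df_pooled:.1f}")
print(f"replicate means: t = {t_reps:.2f}, df = {df_reps:.1f}")
```

      In this made-up example the pooled analysis yields a larger t statistic with many more degrees of freedom than the replicate-level analysis, because cells from the same replicate are counted as independent samples. In a superplot, the per-replicate means would be plotted over the single-cell values.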

      Figure 6

      The fact that Hs578T are poorly tumorigenic in mice is likely the reason why the authors used the experimental metastasis model. However, it is puzzling that metastases were studied in the liver but not in the lungs. Furthermore, the whole approach is rather artefactual as the TBXA2R agonist was administered for the entire duration of these experiments. What is the pathological relevance of such a study? Including a spontaneous metastasis model or alternative TNBC lines that mimic human disease more closely would help strengthen the functional relevance of this pathway in BC progression and the study's translational relevance.

      Figure S2

      B-M: the pERM signal appears to be perinuclear in some of the tested cell lines. Please use a PM marker.

      Figure S3

      The authors should use the superplots to analyse the cell migration data.

      Discussion

      The claim that "our findings demonstrated that kinases of the SLK family are the only kinases needed for ERM activation by TBXA2R" should be toned down as only 2 cell lines were tested. In this section, the authors should also discuss the proposed pro-metastatic functions of TXA2 and TXA2R in more detail, including vascular permeability. The sweeping conclusion that "TBXA2R expression correlates with phosphorylation and activation of ERMs in TNBC patient samples" clashes with the authors' own results; please stick to the data.

      Concluding remarks

      This study investigates a signaling pathway whereby TBXA2R, through ERM activation, enhances the migratory and invasive potential of TNBC cells. However, several improvements are needed to support the main claims. The dependence on a single TNBC cell line, reliance on pharmacological inhibitors with potential off-target effects, and limited in vivo relevance detract from the generalizability of the findings. Additional TNBC models, adequate controls, and a broader focus on natural metastasis patterns would make the conclusions more compelling. Certain overstated claims need to be moderated to align the interpretations with the actual data.

      Cross-commenting

      I found comments in the other reviewers' reports that align with my criticisms on the mouse experiments as well as with those pertaining to the tissue culture work.

      Significance

      General comments

      The manuscript investigates the role of TBXA2R in the regulation of ERM in the context of TNBC metastasis. Much of this TBXA2R signalling axis is already known, as is the fact that SLK and LOK can phosphorylate ERM in other cell systems. Similarly, the positive role of ERM in cell migration/invasion and cancer progression has long been reported. The somewhat unexpected finding that ERM phosphorylation is independent of ROCK remains not fully convincing. The BC-related part is problematic as the continuous administration of a TBXA2R agonist is required for key tumour metrics to show some differences in vivo. This calls into question the main conclusion of the work, namely that the TBXA2R/ERM-dependent pathway is activated during BC progression in TNBC cells.

      Audience

      Specialists interested in GPCRs and signal transduction or in the cytoskeleton.

      Expertise

      Rev: cancer cell biology, signal transduction, cytoskeleton, actin biochemistry, multiplexed imaging, mouse model of human diseases.

      Co-rev: nanoparticles, cell biology.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Overall, the authors show an interesting and conclusive work on the activation of ERM proteins upon TBXA2R signaling. The use of the ebBRET biosensor to assess ERM-protein activation enables elegant investigation of activation modalities. The Thromboxane A2 analogue U46619 robustly shows activation of ERM proteins in ebBRET assays as well as an increase in ERM-protein phosphorylation status. The functional effects of this signaling pathway are shown convincingly for moesin, where moesin mediates an TBXA2R mediated increase in cell motility, invasion and metastasis of triple-negative breast cancer Hs578 cells in vitro and in vivo. Nonetheless, some points need to be clarified.

      Significance

      Comment 1: In the title the authors state, that ERM-activation via TBXA2R is controlling invasion and motility of triple-negative breast cancer cells. In the manuscript, there is only data supporting this assumption for moesin (MSN). Therefore, the authors need to change the title accordingly or support additional experiments for the other two ERM-proteins radixin and ezrin. Throughout the experiments, the p-ERM antibody is used to measure ERM-protein activation. Since the effects on invasion and motility observed in Hs578 cells are mainly mediated through moesin, it would be necessary to see, at least for one experiment per cell line (HEK293T, Hs578) the detailed phosphorylation status of ezrin, radixin and moesin separately. As there are specific, phospho-detecting antibodies for this case, this could be done rather easy. Furthermore, showing specific increase of phosphorylated moesin would support the functional data shown in Figure 5 and 6. To investigate the functional effect of TBXA2R mediated activation of ezrin and radixin on cell motility and invasion, similar experiments could be done in e.g. HMC-1-8 breast cancer cells (high ezrin expression) and HCC1187 (high radixin expression).

      Comment 2: Figure 1A, C, D: The concentration of staurosporine is with 100 nM relatively high for kinase inhibition. It would be informative to see the assay with increasing staurosporine concentrations, e.g. from 1 nM to 50 nM. In general, a concentration of 1-10 nM should be sufficient for kinase inhibition, preventing unspecific effects of the drug.

      Comment 3: The citation for the p-ERM antibody is confusing, as there is only p-Moe used in the cited paper (Roubinet, 2011). There is a p-ERM antibody commercially available (Cell Signaling, Phospho-ezrin (Thr567)/radixin (Thr564)/moesin (Thr558) Antibody #3141). Could you clarify which antibody you are using?

      Comment 4: From the inhibitor experiments using C3 transferase toxin (Figure 2), the authors conclude that RhoA plays a role in TBXA2R mediated ERM activation. As mentioned in the manufacturer's description, C3 toxin is inhibiting RhoA, RhoB and RhoC. Therefore, it would be necessary to repeat those experiments under RhoA knockdown conditions (e.g. using an siRNA-based approach) to state that specifically RhoA is involved.

      Comment 5: To assess, if the findings in Figure 5 and 6 are due to the higher moesin expression in Hs578 cells or are linked to a specific function of moesin, a re-expression experiment would be informative. To achieve this, the 2D and 3D migration experiments could be redone after re-expression of moesin, ezrin and radixin separately in moesin knockdown conditions.

      Minor comments:

      • Even though U46619 is a known Thromboxane A2 analogue, including negative and positive controls would strengthen the results. In detail, this could be done by showing a known protein which gets phosphorylated downstream of TBXA2R signaling and a protein which is not affected by this signaling pathway alongside the shown effects on ERM-proteins.
      • Figure 1 J: There are no statistics comparing the conditions of SQ-29548-treated cells in the presence/absence of U46619; these should be added.
      • Figure 1 G, H: How was the quantification for cell periphery performed? In detail, how were the thresholds set for cell periphery / not cell periphery?
      • Figure 3 H:
        • The labelling indicating presence of U46619 is missing.
        • Also, what is the rationale behind normalizing MB-453 for 3 cell lines and comparing the BT-549 to MB-157?
      • Suppl. Fig 4 D: Define the y-axis better. Absorbance at what wavelength?
      • Define FERM and ERMAD abbreviations in introduction.
    6. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The ezrin, radixin, and moesin (ERM) family of proteins orchestrates morphological changes that potentiate metastatic invasion in cancer cells. In this study, Leguay et al. identify the GPCR TBXA2R as a key activator of the ERM proteins, which promotes motility and invasion in triple-negative breast cancer (TNBC) cells. Using BRET-based sensors that they previously developed for monitoring the activation of ERM proteins, and building upon their previous findings on the role of the small GTPase RhoA in the activation of ERM proteins, the authors carefully dissect the molecular pathway leading to the activation of ERM proteins upon stimulation of TBXA2R. The authors also establish the pathological relevance of the pathway in TNBC using in vitro and in vivo models, opening up possibilities for targeting this pathway in cancer cells. Overall, the study is well-conceived and executed, and the results are clearly described and presented in the manuscript. However, the following comments must be addressed before publication.

      Major comments

      Fig 1C - Why was p-ERM normalized to ezrin and not to total ERM? It would be more appropriate and consistent to normalize against the ERM signal, as done in other experiments in the manuscript.

      Fig 1E and S3C - The levels of total ERM also seem to change with increasing treatment times. This must be clarified and discussed in the manuscript.

      Fig 1F - Why is the mean of all three independent experiments not presented here as in S3C?

      Fig 2E - Though SLK seems to play a dominant role in the phosphorylation of ERM in HEK293T cells, the depletion of LOK also substantially reduces the phosphorylation of ERM in the representative figure (Fig 2E), which is not reflected in the quantification (Fig 2F). Indeed, both SLK and LOK seem to be equally crucial in Hs578T cells (Fig 4I), unlike the conclusion here. The authors must check if the quantifications were affected by any white spots in the blot for total ERM as seen in the representative figure. If necessary, the authors must include additional replicates, and the model in Fig 2G should be updated accordingly. If the contributions of LOK are indeed quite minimal in HEK293T cells, then the difference in Hs578T cells must be adequately highlighted and discussed rather than broadly mentioning similar results were observed in both cell lines. The discussion mentions that SLK kinases are the only kinases needed for ERM activation, which conflicts with findings from Hs578T cells, where both SLK and LOK contribute to ERM phosphorylation (Fig 4I). The authors should revise this to reflect their data accurately.

      Minor comments

      FigS3B should cite the source dataset and not just the database. Also, details of how the extracted data was processed (if any) should be described clearly.

      When multiple treatments are involved (e.g., U46619 and staurosporine), the exact sequence of treatments and any overlap in their timing must be clearly stated, e.g. for Fig 1A and 1C. There are a few grammatical errors that need to be fixed, e.g. in paragraph 2 of the second section of the results: We next aimed to identify (not identifying) which kinase(s) acts downstream of TBXA2R

      Significance

      Triple-negative breast cancer, which is characterized by a lack of estrogen, progesterone or HER2 receptors, is a highly metastatic and aggressive form of breast cancer with poor prognosis. Currently, there are fewer treatment options than for other types of invasive breast cancer. The current study opens up the possibility of targeting TBXA2R or the downstream signalling components, which are still expressed in TNBC cells. However, certain TNBC subtypes express low levels of p-ERM and TBXA2R (Fig 3E, 3F), indicating a minor role for the TBXA2R pathway, so targeting this pathway in these subtypes may be inefficient. In addition, certain subtypes express high p-ERM and low TBXA2R, indicating alternative pathways for ERM activation. Currently, it is not clear which other GPCRs can contribute to ERM activation by engaging similar downstream effectors. A comprehensive screening of different GPCR antagonists could identify alternative strategies to target ERM-mediated metastasis in TNBC cells that show low expression of TBXA2R.

      Audience The manuscript is relevant to a broad audience, especially to cell biologists, cancer biologists and clinical scientists.

      The reviewer's field of expertise includes cell signaling, gene expression, and RNA biology in mammalian systems. Moderate expertise in cancer biology. Limited knowledge of histopathological analysis.

    7. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Leguay et al present an interesting and logical series of studies that investigate the activity and signaling of the GPCR TBXA2R in TNBC cells. The premise of the overall study is that metastasis is often associated with a more invasive/motile cancer cell phenotype. The investigators have an interest in ERM (Ezrin, Radixin, Moesin) proteins, which have been implicated in cell motility. The authors link stimulation of TBXA2R, a GPCR, to activation of ERM proteins and also show that TBXA2R is associated with worse outcome in TNBC patients. Through the use of genetic and pharmacologic tools, the authors provide convincing biochemical and cell-based data to support their model that stimulation of TBXA2R activates Gα11 & Gα12/13, which subsequently stimulate RhoA and SLK/LOK, which then phosphorylate ERMs. The authors show relevant biologic consequences of the pathway. Data include orthogonal assays with similar results, the manuscript is written clearly, and the data are displayed well. Overall it is a solid story that is largely well done. There are a few comments that should be addressed.

      Comments:

      1. All the biochemical/cell-based in vitro data exploit the use of small molecule agonists of TBXA2R, not the natural ligand. A comment on this and on why use of TXA2 is not feasible would be helpful to the reader.
      2. The data in figures 1-5 are solid and clear. However, I suggest adding a higher magnification inset for the IHC images shown in Fig 3E. It would be useful to be able to distinguish cells in the IHC; a higher mag shot should suffice.
      3. A) The use of Hs578T cells for the in vivo modeling is unfortunate. Additionally, the use of iv injection in a study focused on cell invasion is also unfortunate. The metastatic propensity of Hs578T is not clear; in fact, a recent report comparing metastasis in breast cancer cell lines shows that Hs578T cells perform poorly in terms of metastasis after orthotopic injection (see PMID 38468326). I searched the literature briefly for other examples of iv injection of Hs578T cells and found one (PMID: 27654855; I did not search exhaustively); this paper shows significant lung metastasis and does not mention liver metastases. Were other breast cancer cells investigated for the in vivo studies?

      B) I ask because the organ typically seeded after iv injection is the lungs (as seen in the above reference); liver metastases after iv injection are not common, especially with breast cancer cells. What did the lungs look like in your experiments?

      C) Further, while the data presented in figure 6 are supportive of the overall conclusions, they are modest at best in terms of metastatic burden. Repetition of the experiment using a breast cancer cell line injected orthotopically would likely be more useful in highlighting the importance of the pathway to metastasis. I understand that performing an orthotopic assay may be outside the scope of the study, but it would provide greater impact given the focus of the paper on cell invasion.

      Cross-commenting

      I think the reviewer comments are generally aligned. I was least critical but appreciate the concerns of the other reviewers, especially rev #1, who requested additional validation and controls. In my opinion, the in vivo studies are not robust; I expect that is due to the cell line choice. Repetition of the in vivo study with a breast cancer cell line that is capable of metastasis (from a primary tumor) would be more effective.

      Significance

      The manuscript presents a solid, logical flow and the biochemical/cell-based in vitro data are clean. There are clear differences between groups, the controls are appropriate, and the data are displayed effectively.

      The challenge is the in vivo study. IV injection of cancer cells is a valid model for seeding and growing in a target organ BUT it does not reflect cell invasion, which is typically thought of as a step that occurs earlier in the metastatic cascade. That said, the data are supportive of the conclusions but not necessarily consistent with expected results based on iv injection of this cell line. A caveat is that the cell line used is characterized as having metastatic characteristics in vitro but is not a consistent metastatic line in vivo. The recommendation is to perform a new in vivo experiment. An orthotopic injection of a strongly metastatic cell line, such as MDA MB 231 or another (see paper ref above), would be a more stringent and accurate test of the importance of the pathway to cell invasion in vivo.

    8. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This manuscript investigates the role of the thromboxane A2 receptor (TBXA2R) in activating ERM (ezrin, radixin, and moesin) proteins to promote cell motility and invasion in triple-negative breast cancer (TNBC) cells. Using TBXA2R stimulation and a series of in vitro and in vivo experiments, the authors report that ERM activation is mediated through a TBXA2R signaling pathway involving Gαq/11 and Gα12/13 subunits, RhoA, and SLK/LOK kinases. They propose that this pathway enhances cell migration, invasion, and metastatic potential in TNBC.

      General criticisms

      Experimental design and analyses are adequate, even though certain experiments lack appropriate controls or employ the wrong statistical tests. However, the study primarily relies on a single TNBC cell line and heavy use of overexpression systems and/or small molecule inhibitors, raising concerns about the generalizability and specificity of the findings. Furthermore, several conclusions appear premature and unsupported by the current data. Critical controls and additional validation experiments are necessary to support the claims about the role of TBXA2R in metastasis and to justify the strong mechanistic conclusions drawn.

      Specific criticisms

      Figure 1

      TBXA2R expression should be shown to understand whether different ebBRET signals are dependent on the overexpression levels of TBXA2R.

      E-F: As ERM levels change over time, one would like to understand whether this is due to misloading or whether there is an underlying biological event going on in the stimulated cells. Are total ERM levels really changing over time? Please add a blot for 1-2 housekeeping proteins as loading controls. This is also crucial to clarify the kinetics of ERM activation; such notable intensity variations make quantifications of non-linear WB signals not fully reliable. In F, mean and SD should be plotted.

      G: The authors need to use a PM marker if they want to claim that pERM increases at the cell cortex. TBXA2R localization should also be shown.

      Figure 2

      A: This reviewer cannot see the purported partial inhibition in Gα12/13 KO cells. Are differences between the two KOs significant? Furthermore, there are reports indicating that YM-254890 may not be specific for Gαq. Experiments on double KO cells are needed to assess the possible redundancy between the two Gα subfamilies.

      C-D: It is important to add a positive control for the activity of Y-27632 in these experiments. Please show that a ROCK-dependent effect is inhibited in the treated cells.

      G: The working model is premature as it is unknown whether ROCKi was active. While asking for ROCK1/2 KO cells would be too much, this claim is far-fetched.

      Figure 3

      B: In the legend, it is not clear what the grey and light red colours mark.

      E-F: This reviewer finds it difficult to believe that p-ERM and TBXA2R signal intensities at the cell cortex could be reliably quantified using IHC images. The representative samples would indicate that p-ERM and TBXA2R positivity are not correlated. It would be crucial to show examples for each of the TNBC subgroups whose existence is inferred from p-ERM and TBXA2R staining. The conclusion that "no TNBC samples exhibited high TBXA2R expression and low levels of p-ERMs, further supporting a role for TBXA2R signalling in ERM activation in TNBC" is an overstatement.

      Figure 4

      The authors wrote that "We focused on the Hs578T cell line, which showed a median level of TBXA2R mRNA expression among the six TNBC cell lines tested". I do not understand the rationale for it as anti-TBXA2R antibodies detecting endogenous TBXA2R are available and thus why not use the median protein levels?

      Figure 5

      Effects of the knockouts are subtle, and rescue experiments would be needed to corroborate these results. The employed statistical analysis is prone to overestimating differences. The authors should use superplots instead. The authors might also decide to use other TNBC cell lines to explore the functional relevance of this pathway in BC progression. This is particularly important because Hs578T cells are poorly tumorigenic, and they often do not form palpable tumours in mice.
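One way to read the superplot suggestion: pool the cell-level measurements within each biological replicate, then run the statistics on the replicate means, so that n is the number of independent experiments rather than the number of cells. The sketch below uses made-up numbers; the data layout, function names and the choice of Welch's t statistic are illustrative assumptions, not the authors' analysis.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_means(cells_by_replicate):
    """Collapse per-cell values to one mean per biological replicate."""
    return [mean(cells) for cells in cells_by_replicate]


def welch_t(a, b):
    """Welch's t statistic for two independent samples of replicate means."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / sqrt(var_a + var_b)


# Hypothetical migration speeds: 3 replicates per condition, 3 cells each.
control = [[10.0, 12.0, 11.0], [14.0, 13.0, 15.0], [9.0, 10.0, 11.0]]
knockout = [[8.0, 9.0, 10.0], [12.0, 11.0, 13.0], [7.0, 8.0, 9.0]]

ctrl_means = replicate_means(control)  # n = 3 replicates, not 9 cells
ko_means = replicate_means(knockout)
t_stat = welch_t(ctrl_means, ko_means)
```

Treating every cell as an independent observation would inflate n and shrink the apparent error, which is the overestimation the reviewer warns about. If replicates are matched (same day, same passage), a paired test on the replicate means may be the more appropriate choice.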

      Figure 6

      The fact that Hs578T cells are poorly tumorigenic in mice is likely the reason why the authors used the experimental metastasis model. However, it is puzzling that metastases were studied in the liver but not in the lungs. Furthermore, the whole approach is rather artefactual as the TBXA2R agonist was administered for the entire duration of these experiments. What is the pathological relevance of such a study? Including a spontaneous metastasis model or alternative TNBC lines that mimic human disease more closely would help strengthen the functional relevance of this pathway in BC progression and the study's translational relevance.

      Figure S2

      B-M: the pERM signal appears to be perinuclear in some of the tested cell lines. Please use a PM marker.

      Figure S3

      The authors should use superplots to analyse the cell migration data.

      Discussion

      The claim that "our findings demonstrated that kinases of the SLK family are the only kinases needed for ERM activation by TBXA2R" should be toned down as only 2 cell lines were tested. In this section, the authors should also discuss the proposed pro-metastatic functions of TXA2 and TXA2R in more detail, including vascular permeability. The sweeping conclusion that "TBXA2R expression correlates with phosphorylation and activation of ERMs in TNBC patient samples" clashes with the authors' own results; please stick to the data.

      Concluding remarks

      This study investigates a signaling pathway whereby TBXA2R, through ERM activation, enhances the migratory and invasive potential of TNBC cells. However, several improvements are needed to support the main claims. The dependence on a single TNBC cell line, reliance on pharmacological inhibitors with potential off-target effects, and limited in vivo relevance detract from the generalizability of the findings. Additional TNBC models, adequate controls, and a broader focus on natural metastasis patterns would make the conclusions more compelling. Certain overstated claims need to be moderated to align the interpretations with the actual data.

      Cross-commenting

      I found comments in the other reviewers' reports that align with my criticisms on the mouse experiments as well as with those pertaining to the tissue culture work.

      Significance

      General comments

      The manuscript investigates the role of TBXA2R in the regulation of ERM in the context of TNBC metastasis. Much of this TBXA2R signalling axis is already known, as is the fact that SLK and LOK can phosphorylate ERM in other cell systems. Similarly, the positive role of ERM in cell migration/invasion and cancer progression has long been reported. The somewhat unexpected finding that ERM phosphorylation is independent of ROCK remains not fully convincing. The BC-related part is problematic as the continuous administration of a TBXA2R agonist is required for key tumour metrics to show some differences in vivo. This calls into question the main conclusion of the work, namely that the TBXA2R/ERM-dependent pathway is activated during BC progression in TNBC cells.

      Audience

      Specialists interested in GPCRs and signal transduction or in the cytoskeleton.

      Expertise

      Rev: cancer cell biology, signal transduction, cytoskeleton, actin biochemistry, multiplexed imaging, mouse model of human diseases.

      Co-rev: nanoparticles, cell biology.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The article titled 'Non-Invasive Mechanical-Functional Analysis of Individual Liver Mitochondria by Atomic Force Microscopy' discusses how the mechanical properties of mitochondria change in response to various drugs, such as CCCP and ADP or rotenone and antimycin A.

      Key findings:

      The Authors correlated the thermal noise power spectrum (PSD) measured in contact on the top of mitochondria using atomic force microscope (AFM) with membrane activity of the organelles measured using fluorescence markers.

      They identified correlation trends between PSD, height, elasticity and fluorescence marker intensities for various cases in which the organelle activity was modified using drugs or genetic changes. The work is a very interesting approach, an excellent application of mechanobiology to gain further understanding of the properties of the energy-producing organelles of eukaryotes. However, the overall results the Authors present have some serious flaws.

      I would recommend for publication after significant changes were made.

      Major comments:

      Upon measuring the power spectrum density (PSD) of thermal fluctuations in contact of an organelle, there are several factors influencing the measurements, such as: spatial inhomogeneity of the mitochondrion, the loading force applied, the feedback system of the AFM, hydrodynamic drag of the media on the cantilever.

      None of the above points are addressed in the manuscript. That is:

      • what was the spatial variability of the signal on the top of the organelle? (Using a tip with 30 nm apex radius has a relatively high variability even in microscopically homogeneous systems)
      • what was the loading force applied, and how did the PSD vary with the loading force?
      • according to the text on the bottom of page 5 the feedback was ON. How did this influence the recorded PSD? Significance of differences between organelles can be only properly estimated in relation to the spatial and load dependence of the same information.

      Minor comments:

      • Numerical Fourier transform generating the PSD is very noise prone, thus many curves need to be averaged for a good result. Please provide statistical information on this aspect of the obtained curves.
      • In the text it is mentioned that characteristic changes of the PSD were observed. What are the characteristic changes between unperturbed and drug-affected mitochondria? Please highlight them on the graphs of PSD.
      • How are the results distributed, e.g. in Figure 1.E? Histograms and box-plots are more informative than bar plots.
      • How many curves were recorded for the individual mitochondria? (30 mitochondria were measured)
      • Figure 2.A and Figure S1.C indicate nicely how heterogeneous the mitochondria are. How did you eliminate the corresponding error from the PSD measurements?
      • To highlight correlations, simple plots of the parameters as the function of each-other can be very informative.
      • On Figure 1, the correlation between the fluorescence intensities and the PSD integrals are only qualitative.
      • In Figure 3, the inverse correlation between the height and Young's modulus is not clear. Can it be plotted in such a way that the intended information becomes clear?
      • While the Authors claim that the PSD is characteristic of the mechanical properties of the organelles, its direct connection remains elusive and is not discussed in the paper. Again, loading force dependence is expected to be present and to influence whether the probe detects changes in membrane properties or senses something deeper, such as structures under the membrane.
      • While the Authors correlate various measures derived from AFM data, these are only ensemble comparisons, since imaging and PSD measurements were done using different AFMs and thus different sample points. This should be clearly stated in the text.
      • QI mode is very robust for imaging, but its Young's moduli are difficult to compare to any real situation, since the measurement is typically performed in the 500 - 2000 Hz frequency range. In addition, the individual force curves are usually rather noisy for biological samples.
      • In Figure S1.B, nothing is visible for the CCCP sample.
      • In Figure S2, what does the value of 300 mean for alpha in the first sentence?
      • While the frequency dependence of the PSD makes sense, the data in Figure S2 also show very high noise, making the fits unreliable. What would be the exponent value in the 5% - 95% confidence interval?
      • It may also be informative to see a common plot of individual PSDs for the various cases and, in the representative plot, mean +/- SE plots for each frequency point.
      • The experiment description states: 'Bruker Multimode AFM was used for overall imaging and power spectra in tapping mode.' This is misleading, because in tapping mode the end of the cantilever is driven at a constant frequency, which would interfere with the thermal PSD measurement. If it was done this way, this is a driven state, which should be discussed and which also depends on the driving frequency.
      • When preparing the PLL surfaces, how were the mica substrates washed before adding the organelles?
      • The topography images are most probably measured Z-piezo sensor outputs. However, this is not mentioned.
      • Imaging conditions of QI mode are incomplete: the point measurement frequency, a key parameter for the apparent Young's modulus, is not mentioned.
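The comment above on the noise of numerically computed PSDs can be illustrated with a small sketch. It averages periodograms of non-overlapping segments (Bartlett's method, chosen here only as a simple stand-in; the manuscript's actual processing is not described) and compares the bin-to-bin scatter with that of a single periodogram. The white-noise test signal, sampling rate, segment length and all function names are assumptions for illustration.

```python
import cmath
import random
from statistics import mean, pstdev


def periodogram(x, fs):
    """One-sided periodogram (units^2/Hz) of a real segment via a direct DFT.

    The DC and Nyquist bins are skipped so the factor of 2 applies to every bin.
    """
    n = len(x)
    offset = mean(x)
    centred = [v - offset for v in x]
    psd = []
    for k in range(1, n // 2):
        s = sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, v in enumerate(centred))
        psd.append(2.0 * abs(s) ** 2 / (fs * n))
    return psd


def averaged_psd(signal, fs, seg_len):
    """Bartlett estimate: mean of periodograms of non-overlapping segments."""
    segments = [signal[i:i + seg_len]
                for i in range(0, len(signal) - seg_len + 1, seg_len)]
    spectra = [periodogram(seg, fs) for seg in segments]
    return [mean(column) for column in zip(*spectra)]


def relative_scatter(psd):
    """Bin-to-bin standard deviation divided by the mean level."""
    return pstdev(psd) / mean(psd)


# Synthetic white noise standing in for a deflection trace (assumption).
random.seed(0)
fs = 1.0
signal = [random.gauss(0.0, 1.0) for _ in range(64 * 16)]

single = periodogram(signal[:64], fs)
averaged = averaged_psd(signal, fs, 64)
rel_single = relative_scatter(single)
rel_averaged = relative_scatter(averaged)
```

For a flat spectrum, the scatter of a single periodogram is comparable to its mean, and averaging K segments reduces it by roughly the square root of K. Reporting the number of averaged curves per mitochondrion, as requested, lets readers judge how much of the displayed structure is estimator noise.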

      Referee cross-commenting

      Reading the review of Reviewer 1 highlights flaws in the organelle biology part of the work that I was not aware of. (I am an expert in mechanical characterization at the molecular to cellular level.) Putting the reviews together highlights that this study is at a very early stage of investigation. It would be really interesting to see its results, but claiming it to be a novel diagnostic tool may be far-fetched. (I agree with Referee 1.)

      Significance

      In general, estimating the mechanical properties of mitochondria and correlating them with the activity of the organelles is a very interesting idea in the field of mechanobiology. The Authors have done a relatively large number of experiments to identify correlations between activity, followed by more traditional fluorescence labels, and the AFM data they generated. Their work spans many experiments, including three AFM devices and other experimental methods.

      Limitations:

      I believe, however, that they missed some key points influencing their results, most importantly the dependence of the data on the:

      • normal loading force
      • spatial inhomogeneity (their own images prove the presence of this)

      I am afraid some of the effects they detect are not only qualitative but also biased, although I cannot substantiate this with the current figures and data.

      Audience: specific to microbiology, especially the audience interested in mechanobiology

      I believe this is an interesting work, and contributes to our understanding of micromechanics at the organelle level. Thus I would really like to see it published in a more complete form.

      Advance: Mitochondria are known to respond to environmental cues and can remodel their internal structure in response to stresses. However, it is difficult to find studies on the individual mechanical properties of these organelles, even in ex-situ environments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the article "Non-Invasive Mechanical-Functional Analysis of Individual Liver Mitochondria by Atomic Force Microscopy", O. Zorikova and colleagues propose the use of Atomic Force Microscopy (AFM) as a tool for characterizing the biophysical properties of individual mitochondria. By analyzing parameters such as height, membrane fluctuation power spectra, and Young's modulus under various drug treatments and genetic mutations, the authors aim to provide a novel, label-free method for assessing mitochondrial functionality.

      While the manuscript presents an interesting approach, the introduction would benefit from a clearer and more cohesive narrative. The authors highlight the need to monitor the function of individual mitochondria, which is indeed an important challenge, but the rationale for doing so should be more explicitly stated. A stronger emphasis on the biological importance of mitochondrial biophysical parameters and the added value of using AFM would enhance the motivation for the study. Additionally, the symbol Δψ, referring to mitochondrial membrane potential, should be defined and briefly explained in the introduction for clarity.

      In the results section, a schematic diagram of the experiment would aid comprehension, especially for readers less familiar with this technique. In general, in the figures it would be good to find the individual data points. The integration of the results into the main text could also be improved. Currently, several findings are presented in a descriptive manner, but the biological interpretation or relevance is not always clear. For example, the sentence "Figure 2 presents a comprehensive analysis of the height and elastic properties of mitochondria" could be expanded to explain what those findings actually mean and how they help support the main goal of the study. Similarly, the statement that "the integrated power of mitochondrial membrane fluctuations decreased significantly upon valinomycin treatment" is presented without explanation of what this metric represents or why valinomycin was chosen. When discussing MTH2, the authors refer to "mechanical alterations in mitochondria lacking this protein" without explaining what MTH2 is, where it is localized, or why it is biologically relevant.

      Finally, in the discussion, the interpretation of results could be expanded. For example, the statement "MKO/MLM exhibited increased integrated power/potential, increased modulus/stiffness, and decreased height" would benefit from more biological context - what do these changes imply about mitochondrial function or physiology? Adding this kind of interpretation would help the reader better understand the broader significance of the findings.

      Methods: The authors say they record the piezo movement, but it is not clear to the reviewer whether the authors perform a closed-loop force-feedback experiment. If so, this will introduce noise into the measurement, which can be avoided by performing an open-loop measurement. Why did the authors not record the cantilever fluctuation at a constant piezo height? This gives enough bandwidth and low enough noise to record Angstrom deflections. Likewise, it is unclear to this reviewer why the power spectrum is given in V and not in nm, as is typical in AFM measurements. I assume the authors calibrated the deflection sensitivity and spring constant of the cantilever; hence, if possible, the authors should convert the PSD into nm/Hz.

      During the elasticity measurements, did the authors correct for the finite thickness of the mitochondria? What was the contact force and indentation depth, and how thick were the mitochondria to begin with? If the indentation is larger than 20%, I suggest performing a correction to account for the infinite stiffness of the substrate. Given that the mitochondrial stiffness is in the tens of kPa, this seems to be important (perhaps not for relative values but for absolute stiffness measurements).

      Figures. The figures are well constructed and guide the reader through the important messages of the paper. The authors, however, should not overuse bar charts without explicitly stating the number of measurements for each condition. In essence, I strongly recommend plotting individual data points to show the distribution and replacing the stars with actual p-values.
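To give a feel for the size of the finite-thickness effect raised above, here is a minimal sketch using the bonded thin-layer correction commonly attributed to Dimitriadis et al. (2002) for a spherical indenter on an incompressible sample (Poisson ratio 0.5). The coefficients are quoted from that widely used form and should be checked against the original before any quantitative use; the indentation depth and sample height are hypothetical, and the 30 nm tip radius is the value mentioned in the Referee #3 report.

```python
from math import sqrt


def hertz_force(E, R, delta):
    """Hertz force for a sphere indenting an incompressible half-space (nu = 0.5)."""
    return (16.0 / 9.0) * E * sqrt(R) * delta ** 1.5


def thin_layer_factor(R, delta, h):
    """Multiplicative correction for a layer of thickness h bonded to a rigid substrate.

    Assumed form (Dimitriadis-type, nu = 0.5); tends to 1 as h becomes large.
    """
    chi = sqrt(R * delta) / h
    return 1.0 + 1.133 * chi + 1.283 * chi ** 2 + 0.769 * chi ** 3 + 0.0975 * chi ** 4


# Hypothetical geometry in metres: 30 nm tip, 50 nm indentation, 250 nm tall organelle.
R, delta, h = 30e-9, 50e-9, 250e-9
factor = thin_layer_factor(R, delta, h)

# The same force fitted with the plain Hertz model overestimates E by this factor.
E_true = 20e3  # Pa, illustrative value only
force = hertz_force(E_true, R, delta) * factor
E_apparent = force / hertz_force(1.0, R, delta)
```

With these illustrative numbers the uncorrected Hertz fit would overestimate the modulus by about a fifth, which is why the reviewer notes that the correction matters for absolute stiffness values more than for relative comparisons.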

      Significance

      The premise of the study is compelling and could have important clinical implications for distinguishing dysfunctional mitochondria in pathological contexts. However, the manuscript in its current form should be improved. First of all, non-invasive is more than a euphemism, as the mitochondria need to be taken out of the cell, which is highly invasive. The authors should delete non-invasive from the title.

      As the work presents an orthogonal and non-standard approach, the authors introduced a novel assay that can guide future investigations into the biophysics of mitochondrial physiology. Thus the paper is of high interest, timely and cutting edge.

      In summary, the study presents a promising approach with potentially high relevance for mitochondrial research.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors use atomic force microscopy (AFM) to study mitochondria isolated from primary mouse livers, and they attempt to correlate these measurements with mitochondrial membrane potential and oxygen consumption under different bioenergetic conditions. They argue that AFM could be used diagnostically to assess mitochondrial function. While there is some novelty in potentially using AFM to assess mitochondrial function in the clinic, it is not clear how this would be more efficient or meaningful than assessing mitochondrial parameters by more standard methods, such as respirometry, confocal microscopy, etc. Considerably more work would need to be performed, particularly on relevant patient samples, to show that AFM holds potential as a diagnostic tool. It is important to note that the authors of this study have not taken sufficient care to quantify the mitochondrial membrane potential in a manner that could be considered reliable, which casts further doubt upon the merits of this method for diagnosing mitochondrial function. These concerns, laid out in detail below, should be thoroughly addressed before publication.

      Major comments:

      The authors used azide to inhibit complex V, but azide is also a potent inhibitor of complex IV (Bowler et al., 2006). Why did the authors not use oligomycin, which is more specific, to inhibit complex V?

      In Fig. 1 H-K, the y axes are labelled in a confusing or ambiguous way. The legend says that all data represent the mean ± SEM; however, panels D, F, H, and K have no error bars. For example, the data in H and K are shown as violin plots. Typically, the y axis would give the name of the quantity (e.g., mean TMRM fluorescence intensity) followed by the units (e.g., a.u.) in parentheses. However, the authors write, for example, in panel K "Mean pixel (TMRM)." The authors seem to follow the correct convention in panels D-G, so it is not clear why H-K are written incorrectly. In any event, the authors need to specify how these data were obtained, as there are virtually no details as to the methods of how these measurements of mitochondrial membrane potential were acquired.

      For example, JC-1 is a ratiometric probe. In its monomeric form, it emits a green signal, but, as the dye aggregates into so-called J-aggregates, the emission is red. The correct way of analyzing JC-1 signal is to compute the ratio of red over green fluorescence intensity. However, in the authors' quantifications, they simply say "Fluorescence (JC-1)." The units of the y axes go from zero to 20,000, which means that the authors likely did not assess the ratio of these emissions, so the data are not informative as to the actual mitochondrial membrane potential. Moreover, the authors indicate that they use 5 µM JC-1. This seems quite a high concentration, particularly for staining isolated mitochondria, which means that the dye has direct access to the organelle without having to cross the plasma membrane. There is no information about how long the dye was allowed to load and whether it was washed off prior to obtaining the measurements with the plate reader.
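
      The ratiometric analysis described above can be sketched as follows. This is a minimal illustration, assuming one red (J-aggregate) and one green (monomer) plate-reader intensity per well plus a blank reading per channel; the function name, well labels, and all numbers are hypothetical and are not taken from the manuscript under review.

```python
def jc1_ratio(red, green, red_blank=0.0, green_blank=0.0):
    """Return the background-subtracted red/green ratio for one well."""
    red_signal = red - red_blank
    green_signal = green - green_blank
    if green_signal <= 0:
        raise ValueError("green signal must exceed its blank")
    return red_signal / green_signal


# Hypothetical raw intensities (a.u.) as (red, green) per well.
wells = {
    "energized": (18000.0, 6000.0),
    "depolarized": (5000.0, 10000.0),
}

# Hypothetical blank readings of 1000 a.u. in each channel.
ratios = {
    name: jc1_ratio(red, green, red_blank=1000.0, green_blank=1000.0)
    for name, (red, green) in wells.items()
}

for name, ratio in ratios.items():
    print(f"{name}: red/green = {ratio:.2f}")
```

      A ratio of this kind is dimensionless and does not scale with the amount of dye or the number of mitochondria per well, which is why a y axis running from zero to 20,000 suggests that raw single-channel intensities were plotted instead.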
      Likewise, the authors also used TMRM to try to assess the mitochondrial membrane potential. In this case, they used 0.5 µM, but they did not indicate for what duration the mitochondria were exposed to the dye before going through the FACS. It should be noted, too, that TMRM is a Nernstian probe, which effectively stains mitochondria at concentrations as low as 1 nM. Accordingly, it is known that TMRM (and other mitochondrial dyes) can be toxic at higher concentrations, inhibiting essential processes such as OXPHOS. The very low dynamic range of the TMRM signal in panels H and K suggests that the signal was saturated, because there was too much dye loaded into the mitochondria. Moreover, the values, ranging merely from zero to 80, suggest a very insensitive method for quantifying the mitochondrial membrane potential.

      In Fig. S1 A-B, the authors used confocal microscopy to assess the isolated mitochondria. It would be wise to continue to use this technique for the other experiments, as plate readers and FACS offer no direct visual cues to validate that the numbers reflect bona fide biological measurements. Especially in the case of FACS, where there is an exceedingly large number of events, the statistics become essentially meaningless, as it is possible to show that almost anything is statistically significantly different if there is a sufficiently high number of samples or events.

      The authors should bear in mind that measuring the mitochondrial membrane potential is not trivial. One needs to understand the properties of the probes that are being employed as well as the instruments that are used to make the measurements. Care must be taken to ascertain that the quantifications reflect true biological processes.

      The authors claim, for Fig. 1, that there is an "excellent correlation" between height fluctuations and mitochondrial membrane potential.
      Given that the mitochondrial membrane potential measurements were associated with various errors (see above), it is premature to assert that there is any correlation at all. Furthermore, if the authors want to argue that there is indeed a correlation between these variables, then they should perform an appropriate statistical analysis, e.g., a Pearson correlation coefficient test.
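
      As an illustration of the kind of analysis requested here, the sketch below computes a Pearson correlation coefficient and, because only a handful of bioenergetic conditions are likely to be available, an exact two-sided permutation p-value rather than a t-based approximation. The permutation test is a choice made for this example, not something prescribed above, and the paired values are hypothetical placeholders rather than data from the manuscript.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def exact_permutation_p(x, y):
    """Two-sided p-value from all orderings of y (feasible for small n only)."""
    observed = abs(pearson_r(x, y))
    total = 0
    hits = 0
    for shuffled in permutations(y):
        total += 1
        # Small tolerance so floating-point noise does not drop exact ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical per-condition means: membrane potential readout vs. AFM
# height-fluctuation power (arbitrary units).
potential = [1.0, 2.0, 3.0, 4.0, 5.0]
fluctuation = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(potential, fluctuation)
p = exact_permutation_p(potential, fluctuation)
print(f"r = {r:.3f}, exact permutation p = {p:.4f}, n = {len(potential)}")
```

      Reporting r together with n and the p-value would let readers judge whether the correlation is actually supported, provided the underlying membrane potential measurements are themselves reliable.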

      For the reasons explained above, the JC-1 and TMRM measurements in Figs. 3 and 4 are not convincing. The authors must demonstrate, unambiguously, that they understand the use of these probes and that they are making accurate measurements.

      Given that MTCH2 was recently reported to function as an insertase of the OMM (Guna et al., 2022), understanding the KO phenotype is extremely challenging, since it implicates the downstream loss of function of numerous other proteins. It would be valuable to examine other KO models with more specific mitochondrial defects, which can simplify the interpretation of the data. For example, suppression of any of the large Dynamin GTPases that control mitochondrial shape, i.e., MFN1/2, OPA1, or DRP1. Conversely, modulation of mitochondrial membrane composition by suppression of specific phospholipid biosynthetic enzymes would be valuable. It is important to note that the authors are attempting to highlight AFM as a novel way to assess patient samples, but they do not provide any data as to whether mitochondria, derived from a patient with a known mitochondrial defect, could be meaningfully assessed by this method. It is worth pointing out, too, that isolating mitochondria from primary tissues involves a significant amount of stress to the organelle. To understand mitochondrial function in a manner that reflects an in vivo state as much as possible, it would be essential to show that the isolated mitochondria from the liver are largely the same as those in intact liver cells. The authors should be aware that isolating live hepatocytes is far from a trivial thing to do (Charni-Natan & Goldstein, 2020). Simply mincing the liver and subjecting it to mechanical and enzymatic dissociation likely involves significant mitochondrial stress, which implies that the values derived from isolated mitochondria represent a highly non-physiological, even dysfunctional, condition. These are fundamental concerns which should be considered and discussed in any report that is lauding the potential diagnostic benefits of quantifying isolated mitochondria from primary tissues.

      The authors say, in the discussion, "Accordingly, the AFM method employed here measured several characteristics such as morphology and elastic modulus of the structures, as well as fully exploiting the rich information available from the noise spectra." There was no measurement of "morphology" in this study. Differences in height are not what is generally considered in discussions of mitochondrial morphology, which reflects the dynamic changes in organelle shape and connectivity, typically in the x-y (rather than z) axes.

      The authors performed experiments on fixed and dried mitochondria; however, there is no systematic comparison of the integrated power and other parameters compared to the live mitochondria isolates. This is a key comparison that should have been performed, as it would offer a basic frame of reference for the values of the live organelles. Another key experiment that is lacking in this study is measurement of the same organelle over time to understand the variance in individual organelles from moment to moment.

      Minor comments:

      Generally, the authors should moderate their claims that AFM could be used diagnostically until the above concerns are addressed.

      There needs to be considerably more detail as to the methods that were used here. This is essential insofar as the authors wish to convince potential readers that the experiments were carefully conducted and that the data is reliable. Putting numbers on the margin of the manuscript would be helpful for the referee to specifically address certain points.

      References:

      Bowler MW, Montgomery MG, Leslie AG, Walker JE. How azide inhibits ATP hydrolysis by the F-ATPases. Proc Natl Acad Sci U S A. 2006 Jun 6;103(23):8646-9. doi: 10.1073/pnas.0602915103. Epub 2006 May 25. PMID: 16728506; PMCID: PMC1469772.

      Guna A, Stevens TA, Inglis AJ, Replogle JM, Esantsi TK, Muthukumar G, Shaffer KCL, Wang ML, Pogson AN, Jones JJ, Lomenick B, Chou TF, Weissman JS, Voorhees RM. MTCH2 is a mitochondrial outer membrane protein insertase. Science. 2022 Oct 21;378(6617):317-322. doi: 10.1126/science.add1856. Epub 2022 Oct 20. PMID: 36264797; PMCID: PMC9674023.

      Charni-Natan M, Goldstein I. Protocol for Primary Mouse Hepatocyte Isolation. STAR Protoc. 2020 Aug 13;1(2):100086. doi: 10.1016/j.xpro.2020.100086. PMID: 33111119; PMCID: PMC7580103.

      Significance

      I am an expert in imaging of mitochondria, with considerable direct knowledge of various super-resolution and advanced imaging systems. I have also studied mitochondrial function, using standard biochemical and molecular approaches. I have great familiarity with mitochondrial behavior and dynamics, as understood from live-cell imaging approaches and morphological analysis.

      This study is potentially interesting due to its relatively novel use of AFM to examine mitochondria. However, there is a lot of uncertainty in the measurements due to technical oversights and lack of relevant controls. Whether AFM could be useful in the clinic remains an open question. If the authors could address the comments above, it would go a long way to finding out one way or the other.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      (...) The study describes meticulously conducted and controlled experiments, showing the impressive biochemistry work consistently produced by this group. The statistical analysis and data presentation are appropriate, with the following major comments noted:

      Response: We thank the reviewer for their thoughtful and constructive review of our manuscript. We appreciate the positive comments on our experimentation.

      Major comments

      1. Please clarify why K8ac/K12ac, K5ac/K16ac, K5ac/K12ac are not quantified (Figure 3). If undetected, state explicitly and annotate figures with "n.d." rather than leaving gaps. If detected but excluded, justify the exclusion.

      Response: We restricted ourselves to mapping those diacetylated motifs that can be readily identified by MS2. The characteristic ions of the d3-labeled and endogenous acetylated peptides in the MS2 spectra could not differentiate the diacetylated forms mentioned by the reviewer. Rather than expanding the figure with non-informative rows, we amended the legend of Figure 3 accordingly: "Diacetylated forms K8-K12, K5-K16, K5-K12 could not be distinguished from each other by MS2 and were thus not included in the analysis".

      The statement "Nevertheless, combinations of di- and triacetylation were much more frequent if K12ac was included, suggesting that K12 is the primary target." is under-supported because only two non-K12ac combinations are shown, and only one is lower than K12ac-containing combinations. Either soften the claim ("trend toward ... in our dataset") or expand the analysis to all observed di/tri combinations with effect sizes, n, and statistical tests.

      Response: The reviewer is right; our statement does not properly reflect the data. It rather seems that combinations lacking K12ac are considerably less frequent (K5K8K16 tri-ac, K5K8 di-ac). We have now modified the sentence as follows: "Peptides lacking K12ac were less frequent, suggesting that K12 is a primary target".

      Please provide a more detailed discussion about the known nature of NU9056 inhibition and how it fits or doesn't fit with your data. Are there any structural studies on this?

      Response: Unfortunately, NU9056 is very poorly described: neither the mode of interaction with Tip60 nor the mechanism of inhibition is known. The specificity of the chemical has not really been shown, but it is nevertheless used as a selective Tip60 inhibitor in several papers, which is why we picked it in the first place. Our conclusions on the inhibitor are in the last paragraph of the discussion: "The fact that acetylation of individual lysines is inhibited with different kinetics argues against a mechanism involving competition with acetyl-CoA, but for an allosteric distortion of the catalytic center." We think that any further interpretation would likely be considered an overstatement.

      Why was the inhibitor MS experiment performed only for H2A.V and not H2A? Given the clear H2A vs H2A.V differences reported in Fig. 2, it would be useful to have the matched data for H2A.

      Response: In these costly mass spec experiments we strive to balance limited resources and most informative output. Because H2A.V and H4 are the major functional targets of Tip60, we considered that documenting the effect of the inhibitor on these substrates would be most appropriate. In hindsight, including H2A would have been nice to have, but would not change our conclusions about the inhibitor.

      The inhibitor observations are very interesting as they can highlight systems to study the loss of specific acetyl residues. Can the authors perform WB/IF validation in treated cells? I understand it will not be possible with the H2A antibodies, but the difference in H4K5ac vs H4K12ac should be possible to validate in cells.

      Response: We attempted to monitor changes of histone modifications upon treatment of cells with NU9056 by immunoblotting. Probing H4K5 and K12, the results were variable. We also observed occasionally that acetylation of H4K5 and H4K12 was slightly diminished in whole cell extracts, but not in nuclear extracts. This reminded us that diacetylation of H4 at K5 and K12 is a feature of cytoplasmic H4 in complex with chaperones, a mark that is placed by HAT1 (Aguldo Garcia et al., DOI: 10.1021/acs.jproteome.9b00843; Varga et al., DOI: 10.1038/s41598-019-54497-0). The observed proliferation arrest by NU9056 may thus affect chromatin assembly and indirectly K5K12 acetylation. H4K12 is also acetylated by chameau (Chm).

      We observed a reduction of acetylated H4K16 and H2A.V. H4K16 is not a preferred target of Tip60, but Tip60 acetylates MSL1 and MBDR2, two subunits of the NSL1 complex (Apostolou et al. DOI: 10.1101/2025.07.15.664872). We therefore consider that the effects on H4 acetylation upon NU9056 treatment may at least partially be indirect. Because we are not confident about the data and because our manuscript emphasizes the direct, intrinsic specificity of Tip60, we refrain from showing the corresponding Western blots.

      You highlight that H2AK10 (a major TIP60 site here) is not conserved in human canonical H2A. Please expand the discussion of the potential function and physiological relevance. Maybe in relation to H2A.V being a fusion of different human variants?

      Response: The reviewer noted an interesting aspect of the evolution of the histone H2A variants. It turns out that H2A.Z is the more ancient variant, from which H2A derived by mutation. H2A.Z/H2A.V sequences are more conserved than H2A sequences. We summarized these evolutionary notions in Baldi and Becker (DOI: 10.1007/s00412-013-0409-x). In the context of the question, this means that mammalian H2A.Z, Drosophila H2A.V and mammalian H2A still contain the ancient sequence (lacking K10), and Drosophila H2A acquired K10 by mutation. The evolutionary advantage associated with this mutation is unclear. We have now added a small paragraph summarizing these ideas on page 13 of the manuscript (changes tracked in red).

      To enable direct comparisons between variants and residues, please match y-axis scales where the biology invites comparison (e.g., H2A vs H2A.V; Figs. 2-3).

      Response: We adjusted the y-axes in Figures 2 and 3 to facilitate direct comparisons, where such comparison is informative.

      Minor comments

      1. Add 1-2 sentences in the abstract on the gap in the field being addressed by the study.

      Response: We are grateful for this suggestion and have expanded the abstract accordingly (changes tracked in red).

      Either in the introduction or discussion, comment on your prior Tip60 three-subunit data (Kiss et al.). The three-subunit complex was significantly less active on H4, as indicated in that publication, which is likely due to the absence of Eaf6.

      Response: We thank the reviewer for the opportunity to emphasize this point. Motivated by findings in the yeast and mammalian systems that Eaf6 was important for acetylation, we added this subunit to our previously reconstituted 3-subunit 'piccolo' complex. As can be seen by the comparison of the older data (Kiss et al.) and the new data, the 4-subunit TIP60 core complex is a much more potent HAT. We amended the introduction (see marked text) accordingly. We also added a paragraph on what is known about the properties and function of Eaf6 to the discussion.

      3a. Text references Fig.1E before Fig.1C, please reorder

      Response: We deleted the premature mention of Figure 1E and added the following explanation to the relevant panels in Figure 1: "The blot was reprobed with an antibody detecting H3 as an internal standard for nucleosome input."

      3b. Fig.1B/C legend labels appear swapped.

      Response: We thank the reviewer for spotting the swap. We corrected the figure legend.

      3c. Fig.1E, 4A, 4B: add quantification

      Response: We quantified each acetylation level and added the following phrase to the relevant panels of Figures 1 and 4: "The quantified levels of each acetylation mark over H3 are shown below each plot." Notably, the difference in acetylation signal strength between the two antibodies highlights the inherent variability of antibody-based detection.

      3d. Fig.2A: Note explicitly that K5-K10 and K8-K10 are unresolvable pairs to explain the shading scheme used.

      Response: The legend of Figure 2A now includes the following sentences: "Peptides that are diacetylated at either K5/K10 or K8/K10 cannot be resolved by MS2. The last row reminds of this fact by the patterning of boxes and displays the combined values."

      Ensure consistent KAT5/TIP60 naming.

      Response: Our naming follows this logic: We use 'Tip60' for the Drosophila protein and 'TIP60' for the Drosophila 'piccolo' or 'core' complexes. The mammalian protein is referred to by the capital acronym TIP60, as is established in the literature. We use KAT5/TIP60 according to the unified nomenclature in the introduction and parts of the discussion, when we refer to the enzymes in more general terms, independent of species. We scrutinized the manuscript again and made a few changes to adhere to the above scheme.

      Consider moving the first two Discussion paragraphs (field context and challenges in antibody-based detection) into the Introduction to better frame the significance.

      Response: We thank the reviewer for this suggestion that improved the manuscript a lot. We incorporated the first two paragraphs of the discussion into the introduction.

      Significance

      This is a valuable and timely study for the histone acetylation field. The substrate specificity of many individual HATs remains incompletely understood owing to (i) cross-reactivity and limited selectivity of many anti-acetyl-lysine antibodies, (ii) functional redundancy among KATs, (iii) variability across in-vitro assays (HAT domain vs full-length/complex; free histones vs oligonucleosomes), and (iv) incomplete translation of in-vitro specificity to in-vivo settings. These factors have produced conflicting reports in the literature. By combining quantitative mass spectrometry with carefully engineered oligonucleosomal arrays, the authors make a principal step toward deconvoluting TIP60 biology in a controlled yet close-to-physiologically relevant system. Conceptually, the work delineates intrinsic, site-specific preferences of the TIP60 core on variant versus canonical nucleosomes, consistent with largely distributive behaviour and site-dependent inhibitor sensitivity. The inhibitor-dependent shifts in acetylation patterns are particularly intriguing and could enable dissection of residue-specific functions, with potential translational implications for preclinical cancer research and biomarker development. Overall, this manuscript will be of interest to the chromatin community, and I am supportive of publication pending satisfactory resolution of the points raised above.

      Response: Once more, we thank the reviewer for the time and effort devoted to helping us improve the manuscript.


      Reviewer #2

      Major comments

      (...) A central limitation of the study, noted by the authors, is the uncertainty regarding the biological relevance of the findings. While the in vitro system provides a controlled framework for analyzing residue specificity and kinetics, it does not address the functional significance of these results in a cellular or organismal context. This limitation is outside the scope of the current work but indicates potential directions for follow-up studies. Within its defined objectives, the study presents a methodological framework and dataset that contribute to understanding TIP60 activity in a biochemical setting.

      Response: We agree with the referee.

      Minor comments

      While the manuscript is clearly presented overall, there are two minor issues that could be addressed:

      1. In Figure 1, the panels are not ordered according to their appearance in the Results section. In addition, the legends for Figures 1B and 1C appear to be swapped.

      Response: We thank the reviewer for spotting these oversights. We deleted the premature mention of Figure 1E and added the following explanation to the relevant panels in Figure 1: "The blot was reprobed with an antibody detecting H3 as an internal standard for nucleosome input." We also swapped the legends.

      For the quantitative MS data (N = 2 biological replicates), the phrasing "Error bars represent the two replicate values" could be refined. With N = 2, showing individual data points or the range may convey the information more transparently than conventional error bars, which are typically associated with statistical measures (e.g., SEM) from larger sample sizes. Alternatively, a brief note explaining the choice to use two replicates and represent them with error bars could be added.

      Response: We appreciate the reviewer's comment and have revised the figure to display individual data points for the two biological replicates instead of error bars, providing a clearer representation of the data distribution. We changed the phrasing 'Error bars represent...' to "Bars represent the mean of two biological replicates (each consisting of two TIP60 core complexes and two nucleosome arrays - each analyzed with two technical replicates), with individual replicate values shown as open circles." and hope that this describes the data better.
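
      The point about N = 2 can be made concrete with a short worked example. For two values a and b, the sample standard deviation is |a - b| / √2, so the SEM is |a - b| / 2 and mean ± SEM lands exactly on the two data points; an SEM bar therefore carries no information beyond the points themselves. The sketch below checks this numerically; the function name and the two replicate values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_pair(a, b):
    """Summarize two replicate values: mean, SEM, and the observed range."""
    values = [a, b]
    m = mean(values)
    sem = stdev(values) / sqrt(len(values))  # sample SD divided by sqrt(n)
    return {"mean": m, "sem": sem, "low": min(values), "high": max(values)}


# Hypothetical acetylation levels (%) from two biological replicates.
summary = summarize_pair(12.0, 18.0)

print(summary)
# With n = 2, mean - SEM and mean + SEM coincide with the two replicates.
print(summary["mean"] - summary["sem"], summary["mean"] + summary["sem"])
```

      Plotting the bar as the mean with both replicates overlaid as individual points, as the revised legend describes, conveys the same information without implying a statistical estimate.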

      Significance

      Krause and colleagues, using a clean in vitro system, define the substrate specificity of the Drosophila TIP60 core complex. They identify the main acetylation sites and their kinetic dynamics on H2A, H2A.V, and H4 tails, and further characterize the inhibitory activity of NU9056. This work addresses a longstanding question in the field and provides compelling evidence to support its conclusions. Future studies will be needed to establish the biological relevance of these findings.

      Response: We thank the reviewer for a thoughtful and constructive review of our manuscript. We appreciate the suggestions that helped to improve the manuscript.


      Reviewer #3

      (...) However, the authors should revisit some additional points:

      Major comments:

      1. The Tip60 core complex is usually described as containing three subunits: Tip60, Ing3 and E(Pc). The authors also included Eaf6 in their analysis, however, their motivation to include Eaf6 specifically remains unclear. They should explain in the manuscript why Eaf6 was included and how this could affect the observed acetylation pattern.

      Response: We thank the reviewer for the opportunity to emphasize this point. Motivated by findings in the yeast and mammalian systems that Eaf6 was important for acetylation, we added this subunit to our previously reconstituted 3-subunit piccolo complex. As can be seen by the comparison of the older data (ref Kiss) and the new data, the 4-subunit Tip60 core complex is a much more potent HAT. We amended the introduction accordingly. We also added a paragraph on what is known about the properties and function of Eaf6 to the discussion. Please see the amended text marked in red.

      The authors investigated the effectiveness of two Tip60 inhibitors by testing their effects on H4K12ac using an antibody. They state that "TH1834 had no detectable effect on either complex [Tip60 or Msl], even at very high concentrations." However, the initial publication describing TH1834 also stated that this inhibitor particularly affected H2AX with no direct effect on H4 acetylation. The authors should revisit TH1834 and specifically investigate its effect on H2A and, in particular, on H2Av as H2Av is the corresponding ortholog of H2AX.

      Response: The case of TH1834 is not very strong in the literature, which is why we discontinued the line of experimentation when we did not see any effect of TH1834 (2 different batches) on the preferred substrate. The reviewer's suggestion is very good, but given our limited resources we decided to remove the data and discussion of TH1834 from the manuscript (old Figure 4A). The deletion of these very minor data does not diminish the overall conclusion and significance of the manuscript.

      The authors performed a detailed analysis of NU9056 effects. However, they did not include effects on H2A. H2A is distinct from H4 and H2Av as it is the only one containing K10 and this lysine also showed high levels of acetylation by Tip60. Therefore, a comprehensive analysis of NU9056 effects should include analyzing its effects on H2A acetylation.

      Response: In these costly mass spec experiments, we strive to balance limited resources and most informative output. Because H2A.V and H4 are the major functional targets of Tip60, we considered that documenting the effect of the inhibitor on these substrates would be most appropriate. In hindsight, including H2A would have been nice to have, but would not change our conclusions about the inhibitor.

      The authors have previously reported non-histone substrates of Tip60. It would be interesting to test whether the two investigated Tip60 inhibitors affect acetylation of non-histone substrates of Tip60. This analysis would greatly increase the understanding of how selective these inhibitors are. (OPTIONAL)

      Response: We agree with the reviewer that the proposed experiments may be an interesting extension of our current work. However, the Becker lab will be closed down by the end of this year due to retirement, precluding major follow-up studies at this point.

      Minor comments:

      1. Fig. 1 a: instead of "blue residues", would it be more accurate to refer to "blue arrows"?

      Response: Yes of course - the text has been revised accordingly.

      Fig.1 b-c: it would be helpful to include which staining (silver/Ponceau?) was performed here.

      Response: The legends now contain the relevant information.

      Fig. 2a: I did not understand the shading for the K5/K8-K10ac panel from the figure legend. The explanation is present in the main text but would be helpful in the figure legend to allow easy access for readers.

      Response: We agree and revised the text accordingly.

      Fig. 4 c: bar graphs on the top: the X-values are missing.

      Response: The figure has been revised accordingly.

      This sentence in the discussion seems to require revision: "Whereas the replication-dependent H2A resides in most nucleosomes in the genome, H2A.V, the only H2A variant histone in Drosophila, is incorporated by exchange of H2A, independent of replication."

      Response: We revised the sentence as follows to improve clarity. "While the replication-dependent H2A is present in most nucleosomes across the genome, H2A.V, the only H2A variant in Drosophila, is incorporated through replication-independent exchange of H2A."

      In this sentence: "A comparison with the TIP60 core complex is instructive since both enzymes are MYST acetyltransferases and bear significant similarity in their catalytic center." do the authors mean "informative" rather than "instructive"?

      Response: We replaced 'instructive' with 'informative'.

      Significance

      The findings are novel and expand our knowledge of Tip60 histone tail acetylation dynamics and specificity. The manuscript does not address the biological relevance of distinct acetylation marks, which is clearly beyond the scope of the study, but discusses their relevance where possible. The analysis of NU9056 is informative and relevant in a broad context. Optionally, the authors could expand their analysis of NU9056 to its effects on non-histone Tip60 targets to increase impact further. Their analysis of TH1834, however, is currently insufficient as they focused on H4 acetylation alone, which has already been reported to not be affected by TH1834. The authors should include an analysis of TH1834 effects on H2A and H2A.V acetylation. The manuscript is well written, easy to follow and of appropriate length. The methods are elegant and the findings of the study are novel. The manuscript targets researchers specifically interested in chromatin remodeling as well as a broader audience using the Tip60 inhibitor NU9056.

      Response: We thank the reviewer for their profound assessment and the general appreciation of our work. We agree that the analysis of TH1834 is not satisfactory at this point and have removed the corresponding data and description from Figure 4. The deletion of these very minor data does not diminish the overall conclusion and significance of the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      In their manuscript, Krause et al. investigate Tip60 selectivity on histone tail acetylation. They use elegant mass spectrometry analysis to analyze lysine acetylation marks and combinations of acetylation marks on the histone tails of the Tip60 targets H2A, H2A.V and H4. They further consider distinct dynamics by performing a time course experiment and compare Tip60 to MOF. Using these methods, the authors describe interesting and previously undescribed selectivity, dynamics and di-acetylation patterns of Tip60 that will be the starting point of follow-up studies diving into the biological relevance of these findings. Lastly, they investigate the effects of two Tip60 inhibitors and characterize the effects of NU9056 on Tip60 histone tail acetylation in detail. These studies showed that NU9056 has selective effects, impacting some lysine acetylations with greater efficiency than others. As antibodies available to investigate histone acetylations affected by NU9056 are not selective enough, these findings are relevant for any user of NU9056.

      However, the authors should revisit some additional points:

      Major comments:

      1. The Tip60 core complex is usually described as containing three subunits: Tip60, Ing3 and E(Pc). The authors also included Eaf6 in their analysis, however, their motivation to include Eaf6 specifically remains unclear. They should explain in the manuscript why Eaf6 was included and how this could affect the observed acetylation pattern.
      2. The authors investigated the effectiveness of two Tip60 inhibitors by testing their effects on H4K12ac using an antibody. They state that "TH1834 had no detectable effect on either complex [Tip60 or Msl], even at very high concentrations." However, the initial publication describing TH1834 also stated that this inhibitor particularly affected H2AX with no direct effect on H4 acetylation. The authors should revisit TH1834 and specifically investigate its effect on H2A and, in particular, on H2Av as H2Av is the corresponding ortholog of H2AX.
      3. The authors performed a detailed analysis of NU9056 effects. However, they did not include effects on H2A. H2A is distinct from H4 and H2Av as it is the only one containing K10 and this lysine also showed high levels of acetylation by Tip60. Therefore, a comprehensive analysis of NU9056 effects should include analyzing its effects on H2A acetylation.
      4. The authors have previously reported non-histone substrates of Tip60. It would be interesting to test whether the two investigated Tip60 inhibitors affect acetylation of non-histone substrates of Tip60. This analysis would greatly increase the understanding of how selective these inhibitors are. (OPTIONAL)

      Minor comments:

      1. Fig. 1 a): instead of "blue residues", would it be more accurate to refer to "blue arrows"?
      2. Fig.1 b-c): it would be helpful to include which staining (silver/Ponceau?) was performed here
      3. Fig. 2a): I did not understand the shading for the K5/K8-K10ac panel from the figure legend. The explanation is present in the main text but would be helpful in the figure legend to allow easy access for readers.
      4. Fig. 4 c) bar graphs on the top: the X-values are missing.
      5. This sentence in the discussion seems to require revision: "Whereas the replication-dependent H2A resides in most nucleosomes in the genome, H2A.V, the only H2A variant histone in Drosophila, is incorporated by exchange of H2A, independent of replication."
      6. In this sentence: "A comparison with the TIP60 core complex is instructive since both enzymes are MYST acetyltransferases and bear significant similarity in their catalytic center." do the authors mean "informative" rather than "instructive"?

      Significance

      The findings are novel and expand our knowledge of Tip60 histone tail acetylation dynamics and specificity. The manuscript does not address the biological relevance of distinct acetylation marks, which is clearly beyond the scope of the study, but discusses their relevance where possible. The analysis of NU9056 is informative and relevant in a broad context. Optionally, the authors could expand their analysis of NU9056 to its effects on non-histone Tip60 targets to increase impact further. Their analysis of TH1834, however, is currently insufficient as they focused on H4 acetylation alone, which has already been reported not to be affected by TH1834. The authors should include an analysis of TH1834 effects on H2A and H2A.V acetylation.

      The manuscript is well written, easy to follow and of appropriate length. The methods are elegant and the findings of the study are novel. The manuscript targets researchers specifically interested in chromatin remodeling as well as a broader audience using the Tip60 inhibitor NU9056.

      My expertise: I am a researcher working with Drosophila melanogaster and have published on the functions of the Tip60-p400 complex. I do not have extensive expertise in nucleosome arrays, the major method applied in this manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, Krause and colleagues investigate the intrinsic substrate selectivity of the four-subunit TIP60 core module from Drosophila melanogaster using synthetic nucleosome arrays. To quantitatively assess acetylation at individual lysines on histones H2A, the variant H2A.V, and H4, the authors employ targeted mass spectrometry, thereby overcoming the limitations of antibody-based approaches. Contrary to earlier reports, their results reveal that the TIP60 core complex displays a selective lysine acetylation pattern, with distinct kinetics toward specific residues on each histone tail. For example, H2A lysines K5, K8, and K10 were acetylated, with K10 exhibiting the highest modification levels. On H2A.V, K4 and K7 were modified, with K7 showing greater initial efficiency. For H4, K12 was identified as the primary target, and its acetylation was further enhanced in the presence of H2A.V. The study also examined the activity of the KAT5 inhibitor NU9056, uncovering variable inhibition across different acetylation sites. Overall, the authors conclude that intrinsic substrate selectivity is central to understanding the mechanism of Tip60 activity and that the presence of H2A variants can modulate both the efficiency and specificity of acetylation.

      Major comments:

      The study by Krause et al. examines the in vitro substrate selectivity of the Drosophila TIP60 core complex and the lysine-specific effects of the inhibitor NU9056. The authors use a defined in vitro system with recombinant proteins and nucleosome arrays, together with targeted mass spectrometry, to assess intrinsic enzyme activity while avoiding potential issues of antibody specificity and avidity. Heatmaps and bar plots derived from the MS data show site-specific acetylation patterns and the effects of the inhibitor. A comparative analysis with the MSL core complex, which has a well-characterized selectivity, is used as a reference point for interpreting the specificity of TIP60. The observation that NU9056 exhibits different levels of effectiveness on individual lysines, including residues within the same histone tail, is supported by the quantitative MS measurements. A central limitation of the study, noted by the authors, is the uncertainty regarding the biological relevance of the findings. While the in vitro system provides a controlled framework for analyzing residue specificity and kinetics, it does not address the functional significance of these results in a cellular or organismal context. This limitation is outside the scope of the current work but indicates potential directions for follow-up studies. Within its defined objectives, the study presents a methodological framework and dataset that contribute to understanding TIP60 activity in a biochemical setting.

      Minor comments:

      While the manuscript is clearly presented overall, there are two minor issues that could be addressed:

      • In Figure 1, the panels are not ordered according to their appearance in the Results section. In addition, the legends for Figures 1B and 1C appear to be swapped.
      • For the quantitative MS data (N = 2 biological replicates), the phrasing "Error bars represent the two replicate values" could be refined. With N = 2, showing individual data points or the range may convey the information more transparently than conventional error bars, which are typically associated with statistical measures (e.g., SEM) from larger sample sizes. Alternatively, a brief note explaining the choice to use two replicates and represent them with error bars could be added.
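The point about N = 2 can be illustrated with a minimal Python sketch. The values below are invented for illustration and are not taken from the manuscript, and the helper name `summarize_replicates` is hypothetical. The sketch reports the individual values and their range, and checks that with two replicates the sample standard deviation is simply the difference between the two values divided by √2, so a conventional error bar carries no information beyond the two data points themselves.

```python
import math
import statistics


def summarize_replicates(values):
    """Summarize a small set of replicate measurements by mean, min, max
    and n, without implying a statistical error estimate."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical acetylation levels (%) from two replicates (assumed values).
replicates = [10.0, 14.0]
summary = summarize_replicates(replicates)

# With n = 2 the sample SD is fully determined by the two values:
# SD = |a - b| / sqrt(2).
sd = statistics.stdev(replicates)
sd_from_range = abs(replicates[0] - replicates[1]) / math.sqrt(2)
```

Plotting `summary["min"]` and `summary["max"]` as a range, or the two points directly, would show the same information as an SD bar without suggesting a larger sample.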

      Significance

      Krause and colleagues, using a clean in vitro system, define the substrate specificity of the Drosophila TIP60 core complex. They identify the main acetylation sites and their kinetic dynamics on H2A, H2A.V, and H4 tails, and further characterize the inhibitory activity of NU9056. This work addresses a longstanding question in the field and provides compelling evidence to support its conclusions. Future studies will be needed to establish the biological relevance of these findings.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This study uses defined, reconstituted nucleosome arrays (H2A- or H2A.V-containing) and the four-subunit Drosophila TIP60 core complex to map intrinsic substrate selectivity across time courses and in the presence of reported TIP60 inhibitors (NU9056, TH1834). Key findings are: (i) selective H2A-tail acetylation (K10 > K8 > K5) with negligible K12/K14; (ii) preferential H2A.V K4 and K7 acetylation with distinct kinetics and low co-occurrence on a single tail; (iii) H4K12 is strongly favoured over other H4 sites; (iv) acetylation patterns are consistent with a more distributive (non-processive) mechanism relative to MOF/MSL; (v) NU9056 inhibits TIP60 activity with site-specific differences suggestive of a non-competitive/allosteric component, whereas TH1834 shows no effect in this Drosophila system.

      Major comments

      The study describes meticulously conducted and controlled experiments, showing the impressive biochemistry work consistently produced by this group. The statistical analysis and data presentation are appropriate, with the following major comments noted:

      1. Please clarify why K8ac/K12ac, K5ac/K16ac, K5ac/K12ac are not quantified (Figure 3). If undetected, state explicitly and annotate figures with "n.d." rather than leaving gaps. If detected but excluded, justify the exclusion.
      2. The statement "Nevertheless, combinations of di- and triacetylation were much more frequent if K12ac was included, suggesting that K12 is the primary target." is under-supported because only two non-K12ac combinations are shown, and only one is lower than K12ac-containing combinations. Either soften the claim ("trend toward ... in our dataset") or expand the analysis to all observed di/tri combinations with effect sizes, n, and statistical tests.
      3. Please provide a more detailed discussion about the known nature of NU9056 inhibition and how it fits or doesn't fit with your data. Are there any structural studies on this?
      4. Why was the inhibitor experiment MS only performed for H2A.V and not H2A? Given the clear H2A vs H2A.V differences reported in Figure 2, it would be useful to have the matched data for H2A.
      5. The inhibitor observations are very interesting as they can highlight systems to study the loss of specific acetyl residues: can the authors perform WB/IF validation in treated cells? I understand it will not be possible with the H2A antibodies, but the difference in H4K5ac vs H4K12ac should be possible to validate in cells.
      6. You highlight that H2A K10 (a major TIP60 site here) is not conserved in human canonical H2A. Please expand the discussion of the potential function and physiological relevance. Maybe in relation to H2A.V being a fusion of different human variants?
      7. To enable direct comparisons between variants and residues, please match y-axis scales where the biology invites comparison (e.g., H2A vs H2A.V; Figs. 2-3).

      Minor comments

      1. Add 1-2 sentences in the abstract on the gap in the field being addressed by the study.
      2. Either in the introduction or discussion, comment on your prior Tip60 three-subunit data (Kiss et al.). The three-subunit complex was significantly less active on H4, as indicated in that publication, which is likely due to the absence of Eaf6.
      3. Figure order/legends:

      a. Text references Fig.1E before Fig.1C, please reorder

      b. Fig.1B/C legend labels appear swapped.

      c. Fig.1E, 4A, 4B: add quantification

      d. Fig.2A: Note explicitly that K5-K10 and K8-K10 are unresolvable pairs to explain the shading scheme used.

      4. Ensure consistent KAT5/TIP60 naming.
      5. Consider moving the first two Discussion paragraphs (field context and challenges in antibody-based detection) into the Introduction to better frame the significance.

      Significance

      This is a valuable and timely study for the histone acetylation field. The substrate specificity of many individual HATs remains incompletely understood owing to (i) cross-reactivity and limited selectivity of many anti-acetyl-lysine antibodies, (ii) functional redundancy among KATs, (iii) variability across in-vitro assays (HAT domain vs full-length/complex; free histones vs oligonucleosomes), and (iv) incomplete translation of in-vitro specificity to in-vivo settings. These factors have produced conflicting reports in the literature. By combining quantitative mass spectrometry with carefully engineered oligonucleosomal arrays, the authors make a principal step toward deconvoluting TIP60 biology in a controlled yet close-to-physiologically relevant system. Conceptually, the work delineates intrinsic, site-specific preferences of the TIP60 core on variant versus canonical nucleosomes, consistent with largely distributive behaviour and site-dependent inhibitor sensitivity. The inhibitor-dependent shifts in acetylation patterns are particularly intriguing and could enable dissection of residue-specific functions, with potential translational implications for preclinical cancer research and biomarker development. Overall, this manuscript will be of interest to the chromatin community, and I am supportive of publication pending satisfactory resolution of the points raised above.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Xiong and colleagues investigate the mechanisms operating downstream of TRIM32 and controlling myogenic progression from proliferation to differentiation. Overall, the bulk of the data presented is robust. Although further investigation of specific aspects would make the conclusions more definitive (see below), it is an interesting contribution for scientists studying the molecular basis of muscle diseases.

      We thank the Reviewer for appreciating our work and for their valuable suggestions to improve our manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require more experimental efforts, will be addressed as detailed in the Revision Plan.

      In my opinion, a few aspects would improve the manuscript. Firstly, the conclusion that Trim32 regulates c-Myc mRNA stability could be expanded and corroborated by further mechanistic studies:

      1. Studies investigating whether Trim32 binds directly to c-Myc RNA. Moreover, although possibly beyond the scope of this study, an unbiased screening of RNA species binding to Trim32 would be informative.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      If possible, studies in which different mutants with specifically altered functional domains (the NHL domain, known to bind RNAs, and the RING domain, reportedly involved in protein ubiquitination) are overexpressed, to test whether they can rescue the reported alterations in c-Myc expression and muscle maturation in Trim32 KO cell lines.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      An optional aspect that might be interesting to explore is whether the alterations in c-Myc expression observed in C2C12 might be replicated with primary myoblasts or satellite cells devoid of Trim32.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      I also have a few minor points to highlight:

        • It is unclear if the differences highlighted in graphs 5G, EV5D, and EV5E are statistically significant.

      Authors’ response. We thank the Reviewer for raising this point. We have now indicated the statistical analyses performed on the data presented in the mentioned figures (in line also with a point raised by Reviewer #3). Consistent with the conclusion that Trim32 is necessary for proper regulation of c-Myc transcript stability, 2-way ANOVA of the data now reported as Figure 5G shows a statistically significant effect of the genotype at 6h (right-hand graph) but not at D0 (left-hand graph). In the graphs of Fig. EV5 D and E, no significant changes are observed at D0, whereas at 6h the data show a significant difference at the 40 min time point. We included this information in the graphs and in the corresponding legends.

      - On page 10, it is stated that c-Myc down-regulation cannot rescue KO myotube morphology fully nor increase the differentiation index significantly, but the corresponding data is not shown. Could the authors include those quantifications in the manuscript?

      Authors’ response. As suggested, we included the graph showing the differentiation index upon c-Myc silencing in the Trim32 KO clones and in the WT clones, as a novel panel in Figure 6 (Fig. 6D). As already reported in the text, a partial recovery of differentiation index is observed but the increase is not statistically significant. In contrast, no changes are observed applying the same silencing in the WT cells. Legend and text were modified accordingly.

      Reviewer #1 (Significance (Required)):

      The manuscript offers several strengths. It provides novel mechanistic insight by identifying a previously unrecognized role for Trim32 in regulating c-Myc mRNA stability during the onset of myogenic differentiation. The study is supported by a robust methodology that integrates CRISPR/Cas9 gene editing, transcriptomic profiling, flow cytometry, biochemical assays, and rescue experiments using siRNA knockdown. Furthermore, the work has disease relevance, as it uncovers a mechanistic link between Trim32 deficiency and impaired myogenesis, with implications for the pathogenesis of LGMDR8.

      At the same time, the study has some limitations. The findings rely exclusively on the C2C12 myoblast cell line, which may not fully represent primary satellite cell or in vivo biology. The functional rescue achieved through c-Myc knockdown is only partial, restoring Myogenin expression but not the full differentiation index or morphology, indicating that additional mechanisms are likely involved. Although evidence supports a role for Trim32 in mRNA destabilization, the precise molecular partners (such as RNA-binding activity, microRNA involvement, or ligase function) remain undefined. Some discrepancies with previous studies, including Trim32-mediated protein degradation of c-Myc, are acknowledged but not experimentally resolved. Moreover, functional validation in animal models or patient-derived cells is currently lacking. Despite these limitations, the study represents an advancement for the field. It shifts the conceptual framework from Trim32's canonical role in protein ubiquitination to a novel function in RNA regulation during myogenesis. It also raises potential clinical implications by suggesting that targeting the Trim32-c-Myc axis, or modulating c-Myc stability, may represent a therapeutic strategy for LGMDR8.

      This work will be of particular interest to muscle biology researchers studying myogenesis and the molecular basis of muscle disease, RNA biology specialists investigating post-transcriptional regulation and mRNA stability, and neuromuscular disease researchers and clinicians seeking to identify new molecular targets for therapeutic intervention in LGMDR8.

      The Reviewer expressing this opinion is an expert in muscle stem cells, muscle regeneration, and muscle development.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, the authors sought to investigate the molecular role of Trim32, a tripartite motif-containing E3 ubiquitin ligase often associated with its dysregulation in Limb-Girdle Muscular Dystrophy Recessive 8 (LGMDR8), and its role in the dynamics of skeletal muscle differentiation. Using a CRISPR-Cas9 model of Trim32 knockout in C2C12 murine myoblasts, the authors demonstrate that loss of Trim32 alters the myogenic process, particularly by impairing the transition from proliferation to differentiation. The authors provide evidence in the way of transcriptomic profiling that displays an alteration of myogenic signaling in the Trim32 KO cells, leading to a disruption of myotube formation in vitro. Interestingly, while previous studies have focused on Trim32's role in protein ubiquitination and degradation of c-Myc, the authors provide evidence that Trim32-regulation of c-Myc occurs at the level of mRNA stability. The authors show that the sustained c-Myc expression in Trim32 knockout cells disrupts the timely expression of key myogenic factors and interferes with critical withdrawal of myoblasts from the cell cycle required for myotube formation. Overall, the study offers a new insight into how Trim32 regulates early myogenic progression and highlights a potential therapeutic target for addressing the defects in muscular regeneration observed in LGMDR8.

      We thank the Reviewer for valuing our work and for their appreciated suggestions to improve our manuscript. We have carefully addressed some of the concerns raised as detailed here, while others, which require more laborious experimental efforts, will be addressed as reported in the Revision Plan.

      Major Comments:

      The work is a bit incremental based on this:

      https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030445

      And this:

      https://www.nature.com/articles/s41418-018-0129-0

      To their credit, the authors do cite the above papers.

      Authors’ response. We thank the Reviewer for this careful evaluation of our work against the current literature and for recognising the contribution of our findings to the understanding of the complex picture of myogenesis, in which the involvement of Trim32 and c-Myc, and of the Trim32-c-Myc axis, can occur at several stages and likely in narrow time windows along the process, thus possibly explaining some inconsistencies among reports.

      The authors do provide compelling evidence that Trim32 deficiency disrupts C2C12 myogenic differentiation and sustained c-Myc expression contributes to this defective process. However, while knockdown of c-Myc does restore Myogenin levels, it was not sufficient to normalize myotube morphology or differentiation index, suggesting an incomplete picture of the Trim32-dependent pathways involved. The authors should qualify their claim by emphasizing that c-Myc regulation is a major, but not exclusive, mechanism underlying the observed defects. This will prevent an overgeneralization and better align the conclusions with the authors' data.

      Authors’ response. We agree with the Reviewer and we modified our phrasing, which implied the Trim32-c-Myc axis as the exclusive mechanism, by explicitly indicating in the Abstract and in the Discussion that other pathways contribute to guaranteeing proper myogenesis.

      The Abstract now reads: “… suggesting that the Trim32–c-Myc axis may represent an essential hub, although likely not the exclusive molecular mechanism, in muscle regeneration within LGMDR8 pathogenesis.”

      The Discussion now reads: “Functionally, we demonstrated that c-Myc contributes to the impaired myogenesis observed in Trim32 KO clones, although this is clearly not the only factor involved in the Trim32-mediated myogenic network; realistically other molecular mechanisms can participate in this process as also suggested by our transcriptomic results.”

      The authors provide a thorough and well-executed interrogation of cell cycle dynamics in Trim32 KO clones, combining phospho-histone H3, flow cytometry of DNA content, and CFSE proliferation assays. These complementary approaches convincingly show that, while proliferation states remain similar in WT and KO cells, Trim32-deficient myoblasts fail in their normal withdrawal from the cell cycle during exposure to differentiation-inducing conditions. This work adds clarity to a previously inconsistent literature and greatly strengthens the study.

      Authors’ response. We thank the Reviewer for appreciating our thorough analyses on cell cycle dynamics in proliferation conditions and at the onset of the differentiation process.

      The transcriptomic analysis (detailed in the "Transcriptomic analysis of Trim32 WT and KO clones along early differentiation" section of Results) is central to the manuscript and provides strong evidence that Trim32 deficiency disrupts normal differentiation processes. However, the description of the pathway enrichment results is highly detailed and somewhat compressed, which may make it challenging for readers to follow the key biological 'take-homes'. The narrative quickly moves across their multiple analyses like MDS, clustering, heatmaps, and bubble plots without pausing to guide the reader through what each analysis contributes to the overall biological interpretation. As a result, the key findings (reduced muscle development pathways in KO cells and enrichment of cell cycle-related pathways) can feel somewhat muted. The authors may consider reorganizing this section, so the primary biological insights are highlighted and supported by each of their analyses. This would allow the biological implications to be more accessible to a broader readership.

      Authors’ response. We thank the Reviewer for raising this point and apologise for being too brief in describing the data, leaving indeed some points excessively implicit. As suggested, we have now reorganised this section and added the lists of enriched canonical pathways relative to WT vs KO comparisons at D0 and D3 (Fig. EV3B) as well as those relative to the comparison between D0 and D3 for both WT and Trim32 KO samples (Fig. EV3C), with their relative scores. We changed the Results section “Transcriptomic analysis of Trim32 WT and Trim32 KO clones along early differentiation” as reported here below and modified the legends accordingly.

      The paragraph now reads: Based on our initial observations, the absence of Trim32 already exerts a significant impact by day 3 (D3) of C2C12 myogenic differentiation. To investigate how Trim32 influences early global transcriptional changes during the proliferative phase (D0) and early differentiation (D3), we performed an unbiased transcriptomic profiling of WT and Trim32 KO clones (Fig. 2A). Multidimensional Scaling (MDS) analysis revealed clear segregation of gene expression profiles based on both time of differentiation (Dim1, 44% variance) and Trim32 genotype (Dim2, 16% variance) (Fig. 2A). Likewise, hierarchical clustering grouped WT and Trim32 KO clones into distinct clusters at both timepoints, indicating consistent genotype-specific transcriptional differences (Fig. EV3A). Differentially Expressed Genes (DEGs) were detected in the Trim32 KO transcriptome relative to WT, at both D0 and D3. In proliferating conditions, 72 genes were upregulated and 189 were downregulated whereas at D3 of differentiation, 72 genes were upregulated and 212 were downregulated. Ingenuity Pathway Analysis of the DEGs revealed the top 10 Canonical Pathways displayed in Fig. EV3B as enriched at either D0 or D3 (Fig. EV3B). Several of these pathways can underscore relevant Trim32-mediated functions though most of them represent generic functions not immediately attributable to the observed myogenesis defects.

      Notably, the transcriptional divergence between WT and Trim32 KO cells is more pronounced at D3, as evidenced by a greater separation along the MDS Dim2 axis, suggesting that Trim32-dependent transcriptional regulation intensifies during early differentiation (Fig. 2A). Given our interest in the differentiation process, we therefore focused our analyses comparing the changes occurring from D0 to D3 in WT (WT D3 vs. D0) and in Trim32 KO (KO D3 vs. D0) RNAseq data.

      Pathway enrichment analysis of D3 vs. D0 DEGs allowed the selection of the top-scored pathways for both WT and Trim32 KO data. We obtained 18 top-scored pathways enriched in each genotype (-log(p-value) ≥ 9 cut-off): 14 are shared while 4 are top-ranked only in WT and 4 only in Trim32 KO (Fig. EV3C). For the following analyses, we employed thus a total of 22 distinct pathways and to better mine those relevant in the passage from the proliferation stage to the early differentiation one and that are affected by the lack of Trim32, we built a bubble plot comparing side-by-side the scores and enrichment of the 22 selected top-scored pathways above in WT and Trim32 KO (Fig. 2B). A heatmap of DEGs included within these selected pathways confirms the clustering of the samples considering both the genotypes and the timepoints highlighting gene expression differences (Fig. 2C). These pathways are mainly related to muscle development, cell cycle regulation, genome stability maintenance and few other metabolic cascades.

      As expected given the results related to Figure 1, moving from D0 to D3 WT clones showed robust upregulation of key transcripts associated with the Inactive Sarcomere Protein Complex, a category encompassing most genes in the “Striated Muscle Contraction” pathway, while in Trim32 KO clones this pathway was not among those enriched in the transition from D0 to D3 (Fig. EV3C). Detailed analyses of transcripts enclosed within this pathway revealed that on the transition from proliferation to differentiation, WT clones show upregulation of several Myosin Heavy Chain isoforms (e.g., MYH3, MYH6, MYH8), α-Actin 1 (ACTA1), α-Actinin 2 (ACTN2), Desmin (DES), Tropomodulin 1 (TMOD1), and Titin (TTN), a pattern consistent with previous reports, while these same transcripts were either non-detected or only modestly upregulated in Trim32 KO clones at D3 (Fig. 2D). This genotype-specific disparity was further confirmed by gene set enrichment barcode plots, which demonstrated significant enrichment of these muscle-related transcripts in WT cells (FDR_UP = 0.0062), but not in Trim32 KO cells (FDR_UP = 0.24) (Fig. EV3D). These findings support an early transcriptional basis for the impaired myogenesis previously observed in Trim32 KO cells.

      In addition to differences in muscle-specific gene expression, we also observed that several pathways related to cell proliferation and cell cycle regulation were more enriched in Trim32 KO cells compared to WT. This suggests that altered cell proliferation may contribute to the distinct differentiation behavior observed in Trim32 KO versus WT (Fig. 2B). Given that cell cycle exit is a critical prerequisite for the onset of myogenic differentiation and considering that previous studies on Trim32's role in cell cycle regulation have reported inconsistent findings, we further examined cell cycle dynamics under our experimental conditions to clarify Trim32's contribution to this process.

      The work would be greatly strengthened by the inclusion of LGMDR8 primary cells, and rescue experiments of TRIM32 to explore myogenesis.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      Also, EU (5-ethynyl uridine) pulse-chase experiments to label nascent and stable RNA coupled with MYC pulldowns and qPCR (or RNA-sequencing of both pools) would further enhance the claim that MYC stability is being affected.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      "On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025)." Also address and discuss the following, as what is currently written is not entirely accurate: https://www.embopress.org/doi/full/10.1038/s44319-024-00299-z and https://journals.physiology.org/doi/prev/20250724-aop/abs/10.1152/ajpcell.00528.2025

      Authors’ response. We thank the Reviewer for bringing these two publications to our attention; they indeed add important data that capture the in vivo complexity of the role of c-Myc in myogenesis. We have included this point in our Discussion.

      The Discussion now reads: “On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025). Other reports, instead, demonstrated that periodic pulses of c-Myc, mimicking resistance exercise, contribute to muscle growth, a role that cannot, however, be observed in our experimental model (Edman et al., 2024; Jones et al., 2025).”

      Minor Comments:

      Z-score scale used in the pathway bubble plot (Figure 2C) could benefit from alternative color choices. Current gradient is a bit muddy and clarity for the reader could be improved by more distinct color options, particularly in the transition from positive to negative Z-score.

      Authors’ response. As suggested, we modified the colors representing the z-score, using a more distinct gradient, especially in the positive-to-negative transition, in Figure 2B.

      Clarification on the rationale for selecting the "top 18" pathways would be helpful, as it is not clear if this cutoff was chosen arbitrarily or reflects a specific statistical or biological threshold.

      Authors’ response. As now better explained (see comment regarding Major point: Transcriptomics), we used a cut-off of -log(p-value) above or equal to 9 for pathways enriched in DEGs of the D0 vs D3 comparison for both WT and Trim32 KO. The threshold is now included in the Results section and the pathways (shared between WT and Trim32 KO and unique) are listed as Fig. EV3C.

      The authors alternate between using "Trim 32 KO clones" and "KO clones" throughout the manuscript. Consistent terminology across figures and text would improve readability.

      Authors’ response. We thank the Reviewer for this remark, and we apologise for having overlooked it. We amended this throughout the manuscript by always using for clarity “Trim32 KO clones/cells”.

      Cell culture methodology does not specify passage number or culture duration (only "At confluence") before differentiation. This is important, as C2C12 differentiation potential can drift with extended passaging.

      Authors’ response. We agree with the Reviewer that C2C12 passaging can reduce the differentiation potential of this myoblast cell line; this is indeed the main reason why we decided to employ WT clones, which underwent the same editing process as those carrying mutations in the Trim32 gene, as reference controls throughout our study. We apologise for not indicating the passages in the first version of the manuscript; the Methods section has now been amended as follows:

      The C2C12 parental cells used in this study were maintained within passages 3–8. All clonal cell lines (see below) were utilized within 10 passages following gene editing. In all experiments, WT and Trim32 KO clones of comparable passage numbers were used to ensure consistency and minimize passage-related variability.

      Reviewer #2 (Significance (Required)):

      General Assessment:

      This study provides a thorough investigation of Trim32's role in the processes related to skeletal muscle differentiation using a CRISPR-Cas9 knockout C2C12 model. The strengths of this study lie in the multi-layered experimental approach, as the authors incorporated transcriptomics, cell cycle profiling, and stability assays, which collectively build a strong case for their hypothesis that Trim32 is a key factor in the normal regulation of myogenesis. The work is also strengthened by the use of multiple biological and technical replicates, particularly the independent KO clones, which help address potential clonal variation issues. The largest limitation of this study is that, while the c-Myc mechanism is well explored, the other Trim32-dependent pathways associated with the disruption (implicated by the incomplete rescue by c-Myc knockdown) are not as well addressed. Overall, however, the study convincingly identifies a critical function for Trim32 during skeletal muscle differentiation.

      Advance:

      To my knowledge, this is the first study to demonstrate regulation of c-Myc by Trim32 at the level of mRNA stability, rather than through ubiquitin-mediated protein degradation. This work will advance the current understanding and provide a more complete picture of Trim32's role in c-Myc regulation. Beyond c-Myc, this work highlights the idea that TRIM family proteins can influence RNA stability, which could implicate a broader role in RNA biology and has potential for future therapeutic targeting.

      Audience:

      This research will be of interest to an audience that focuses on broad skeletal muscle biology but primarily to readers with more focused research such as myogenesis and neuromuscular disease (LGMDR8 in particular) where the defined Trim32 governance over early differentiation checkpoints will be of interest. It will also provide mechanistic insights to those outside of skeletal muscle that study TRIM family proteins, ubiquitin biology, and RNA regulation. For translational/clinical researchers, it identifies the Trim32/c-Myc axis as a potential therapeutic target for LGMDR8 and related muscular dystrophies.

      Expertise:

      My expertise lies in skeletal muscle biology, gene editing, transgenic mouse models, and bioinformatics. I feel confident evaluating the data and conclusions as presented.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this paper, the authors examine the role of TRIM32, implicated in limb girdle muscular dystrophy recessive 8 (LGMDR8), in the differentiation of C2C12 mouse myoblasts. Using CRISPR, they generate mutant and wild-type clones and compare their differentiation capacity in vitro. They report that Trim32-deficient clones exhibit delayed and defective myogenic differentiation. RNA-seq analysis reveals widespread changes in gene expression, although few are validated by independent methods. Notably, Trim32 mutant cells maintain residual proliferation under differentiation conditions, apparently due to a failure to downregulate c-Myc. Translation inhibition experiments suggest that TRIM32 promotes c-Myc mRNA destabilization, but this conclusion is insufficiently substantiated. The authors also perform rescue experiments, showing that c-Myc knockdown in Trim32-deficient cells alleviates some differentiation defects. However, this rescue is not quantified, was conducted in only two of the three knockout lines, and is supported by inappropriate statistical analysis of gene expression. Overall, the manuscript in its current form has substantial weaknesses that preclude publication. Beyond statistical issues, the major concerns are: (1) exclusive reliance on the immortalized C2C12 line, with no validation in primary/satellite cells or in vivo, (2) insufficient mechanistic evidence that TRIM32 acts directly on c-Myc mRNA, and (3) overinterpretation of disease relevance in the absence of supporting patient or in vivo data. Please find more details below:

      We thank the Reviewer for the in-depth assessment of our work and the valuable suggestions to improve the manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require more experimental effort, will be addressed as detailed in the Revision Plan.

      - TRIM32 complementation / rescue experiments to exclude clonal or off-target CRISPR effects and show specificity are lacking.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      - The authors link their in vitro findings to LGMDR8 pathogenesis and propose that the Trim32-c-Myc axis may serve as a central regulator of muscle regeneration in the disease. However, LGMDR8 is a complex disorder, and connecting muscle wasting in patients to differentiation assays in C2C12 cells is difficult to justify. No direct evidence is provided that the proposed mRNA mechanism operates in patient-derived samples or in mouse satellite cells. Moreover, the partial rescue achieved by c-Myc knockdown (which does not fully restore myotube morphology or differentiation index) further suggests that the disease connection is not straightforward. Validation of the TRIM32-c-Myc axis in a physiologically relevant system, such as LGMD patient myoblasts or Trim32 mutant mouse cells, would greatly strengthen the claim.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      - Some gene expression changes from the RNA-seq study in Figure 2 should be validated by qPCR.

      Authors’ response. We thank the reviewer for this suggestion. This point will be addressed as detailed in the Revision Plan. We have selected several transcripts that will be evaluated in independent samples in order to validate the RNAseq results.

      - The paper shows siRNA knockdown of c-Myc in KO restores Myogenin RNA/protein but does not fully rescue myotube morphology or differentiation index. This suggests that Trim32 controls additional effectors beyond c-Myc; yet the authors do not pursue other candidate mediators identified in the RNA-seq. The manuscript would be strengthened by systematically testing whether other deregulated transcripts contribute to the phenotype.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      - There are concerns with experimental/statistical issues and insufficient replicate reporting. The authors use unpaired two-tailed Student's t-test across many comparisons; multiple testing corrections or ANOVA where appropriate should be used. In Figure EV5B and Figure 6B, the authors perform statistical analyses with control values set to 1. This method masks the inherent variability between experiments and artificially augments p values. Control sample values need to be normalized to one another to have reliable statistical analysis. Myotube morphology and differentiation index quantifications need clear description of fields counted, blind analysis, and number of biological replicates.

      Authors’ response. We thank the Reviewer for raising this point.

      Regarding the replicates, we clarified in the Methods and Legends that the Trim32 KO experiments were performed on 3 biological replicates (independent clones), as were the reference controls (3 independent WT clones), except for the Fig. 6 experiments, which were performed on 2 Trim32 KO and 2 WT clones. All Western blot, immunofluorescence, and qPCR data are representative of the results of at least 3 independent experiments unless otherwise stated. We reported the number and type of replicates as well as the microscope fields analyzed.

      We repeated the statistical analyses of the data in Figure 5G, EV5D, and EV5E using the more appropriate two-way ANOVA test, as suggested, and we now report this information in the graphs and legends.

      We thank the Reviewer for raising this point. We agree, and we have replaced the graphs in Fig. EV5B and 6B with versions showing the control values normalised as suggested. The statistical analyses now reflect this change.
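
      The difference between the two normalisation schemes discussed above can be illustrated with a minimal sketch. The values below are invented purely for illustration and are not data from the study; the function names are likewise hypothetical. Dividing each experiment by its own control forces every control to exactly 1 and removes its spread, whereas dividing by the mean of the controls sets the control mean to 1 while keeping the between-experiment variability visible to the statistical test.

```python
from statistics import mean, stdev

# Hypothetical raw expression values from three independent experiments
# (invented for illustration; not data from the study).
control = [0.80, 1.00, 1.30]
treated = [1.50, 2.10, 2.40]


def per_experiment_ratio(ctrl, trt):
    """Divide each experiment by its own control: every control becomes exactly 1."""
    return [c / c for c in ctrl], [t / c for c, t in zip(ctrl, trt)]


def normalise_to_control_mean(ctrl, trt):
    """Divide all values by the mean control: control mean is 1, spread is kept."""
    m = mean(ctrl)
    return [c / m for c in ctrl], [t / m for t in trt]


ctrl_fixed, trt_fixed = per_experiment_ratio(control, treated)
ctrl_norm, trt_norm = normalise_to_control_mean(control, treated)

# With controls fixed at 1, the control group has no variance at all,
# so a two-sample test sees an artificially "perfect" reference group.
print("controls set to 1:         SD =", stdev(ctrl_fixed))
print("controls normalised to mean: SD =", round(stdev(ctrl_norm), 3))
```

      In the second scheme the control group retains a non-zero standard deviation, so any subsequent test accounts for the experiment-to-experiment variability of the reference condition.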

      - Some English mistakes require additional read-throughs. For example: "Indeed, Trim32 has no effect on the stability of c-Myc mRNA in proliferating conditions, but upon induction of differentiation the stability of c-Myc mRNA resulted enhanced in Trim32 KO clones (Fig. 5G, Fig. EV5D and 5E)."

      Authors’ response. We re-edited this revised version of the manuscript as suggested.

      - Results in Figure 5A should be quantified.

      Authors’ response. We addressed this point by quantifying the results shown in Fig. 5A and adding to the Figure a graph of the quantification of 3 experimental replicates. Quantification confirms that no statistically significant difference is observed. The Figure and the corresponding legend have been modified accordingly.

      - Based on the nuclear marker p84, the separation of cytoplasmic and nuclear fractions is not ideal in Figure 5D.

      Authors’ response. We agree with the Reviewer that the presence of p84 also in the cytoplasmic fraction is not ideal. Regrettably, we observed this faint p84 band in all the experiments performed. We think, however, that this does not affect the result, which clearly shows that c-Myc and Trim32 are never detected in the same compartment.

      - In Figure 6, it is not appropriate to perform statistical analyses on only two data points per condition.

      Authors’ response. We agree with the Reviewer and we now show the graph of the results of the 3 technical replicates for 2 biological replicates and do not indicate any statistics (Fig. 6B). The graph was also modified according to a previous point raised.

      - The nuclear MYOG phenotype is very interesting; could this be related to requirements of TRIM32 in fusion?

      Authors’ response. We agree with the Reviewer that Trim32 might also be necessary for myoblast fusion. This point is however beyond the scope of the present study and will be addressed in future work.

      - The hypothesis that TRIM32 destabilizes c-Myc mRNA is intriguing but requires stronger mechanistic support. This would be more convincing with RNA immunoprecipitation to test direct association with c-Myc mRNA, and/or co-immunoprecipitation to identify interactions between TRIM32 and proteins involved in mRNA stability. The study would also be strengthened by reporter assays, such as c-Myc 3′UTR luciferase constructs in WT and KO cells, to directly demonstrate 3′UTR-dependent regulation of mRNA stability.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      Reviewer #3 (Significance (Required)):

      The manuscript presents a minor conceptual advance in understanding TRIM32 function in myogenic differentiation. Its main limitation is that all experiments were performed in C2C12 cells. While C2C12 are a classical system to study muscle differentiation, they are an immortalized, long-cultured, and genetically unstable line that represents a committed myoblast stage rather than bona fide satellite cells. They therefore do not fully model the biology of early regenerative responses. Several TRIM32 phenotypes reported in the literature differ between primary satellite cells and cell lines, and the authors themselves note such discrepancies. Extrapolating these findings to LGMDR8 pathogenesis without validation in primary human myoblasts, satellite cell assays, or in vivo regeneration models is therefore not justified. Previous work has already established clear roles for TRIM32 in mouse satellite cells in vivo and in patient myoblasts in vitro, whereas this study introduces a novel link to c-Myc regulation during differentiation. In addition, without mechanistic evidence, the central claim that TRIM32 regulates c-Myc mRNA stability remains descriptive and incomplete. Nevertheless, the results will be of interest to researchers studying LGMD and to those exploring TRIM32 biology in broader contexts. I review this manuscript as a muscle biologist with expertise in satellite cell biology and transcriptional regulation.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this paper, the authors examine the role of TRIM32, implicated in limb girdle muscular dystrophy recessive 8 (LGMDR8), in the differentiation of C2C12 mouse myoblasts. Using CRISPR, they generate mutant and wild-type clones and compare their differentiation capacity in vitro. They report that Trim32-deficient clones exhibit delayed and defective myogenic differentiation. RNA-seq analysis reveals widespread changes in gene expression, although few are validated by independent methods. Notably, Trim32 mutant cells maintain residual proliferation under differentiation conditions, apparently due to a failure to downregulate c-Myc. Translation inhibition experiments suggest that TRIM32 promotes c-Myc mRNA destabilization, but this conclusion is insufficiently substantiated. The authors also perform rescue experiments, showing that c-Myc knockdown in Trim32-deficient cells alleviates some differentiation defects. However, this rescue is not quantified, was conducted in only two of the three knockout lines, and is supported by inappropriate statistical analysis of gene expression. Overall, the manuscript in its current form has substantial weaknesses that preclude publication. Beyond statistical issues, the major concerns are: (1) exclusive reliance on the immortalized C2C12 line, with no validation in primary/satellite cells or in vivo, (2) insufficient mechanistic evidence that TRIM32 acts directly on c-Myc mRNA, and (3) overinterpretation of disease relevance in the absence of supporting patient or in vivo data. Please find more details below:

      • TRIM32 complementation / rescue experiments to exclude clonal or off-target CRISPR effects and show specificity are lacking.
      • The authors link their in vitro findings to LGMDR8 pathogenesis and propose that the Trim32-c-Myc axis may serve as a central regulator of muscle regeneration in the disease. However, LGMDR8 is a complex disorder, and connecting muscle wasting in patients to differentiation assays in C2C12 cells is difficult to justify. No direct evidence is provided that the proposed mRNA mechanism operates in patient-derived samples or in mouse satellite cells. Moreover, the partial rescue achieved by c-Myc knockdown (which does not fully restore myotube morphology or differentiation index) further suggests that the disease connection is not straightforward. Validation of the TRIM32-c-Myc axis in a physiologically relevant system, such as LGMD patient myoblasts or Trim32 mutant mouse cells, would greatly strengthen the claim.
      • Some gene expression changes from the RNA-seq study in Figure 2 should be validated by qPCR.
      • The paper shows siRNA knockdown of c-Myc in KO restores Myogenin RNA/protein but does not fully rescue myotube morphology or differentiation index. This suggests that Trim32 controls additional effectors beyond c-Myc; yet the authors do not pursue other candidate mediators identified in the RNA-seq. The manuscript would be strengthened by systematically testing whether other deregulated transcripts contribute to the phenotype.
      • There are concerns with experimental/statistical issues and insufficient replicate reporting. The authors use unpaired two-tailed Student's t-test across many comparisons; multiple testing corrections or ANOVA where appropriate should be used. In Figure EV5B and Figure 6B, the authors perform statistical analyses with control values set to 1. This method masks the inherent variability between experiments and artificially augments p values. Control sample values need to be normalized to one another to have reliable statistical analysis. Myotube morphology and differentiation index quantifications need clear description of fields counted, blind analysis, and number of biological replicates.
      • Some English mistakes require additional read-throughs. For example: "Indeed, Trim32 has no effect on the stability of c-Myc mRNA in proliferating conditions, but upon induction of differentiation the stability of c-Myc mRNA resulted enhanced in Trim32 KO clones (Fig. 5G, Fig. EV5D and 5E)."
      • Results in Figure 5A should be quantified.
      • Based on the nuclear marker p84, the separation of cytoplasmic and nuclear fractions is not ideal in Figure 5D.
      • In Figure 6, it is not appropriate to perform statistical analyses on only two data points per condition.
      • The nuclear MYOG phenotype is very interesting; could this be related to requirements of TRIM32 in fusion?
      • The hypothesis that TRIM32 destabilizes c-Myc mRNA is intriguing but requires stronger mechanistic support. This would be more convincing with RNA immunoprecipitation to test direct association with c-Myc mRNA, and/or co-immunoprecipitation to identify interactions between TRIM32 and proteins involved in mRNA stability. The study would also be strengthened by reporter assays, such as c-Myc 3′UTR luciferase constructs in WT and KO cells, to directly demonstrate 3′UTR-dependent regulation of mRNA stability.

      Significance

      The manuscript presents a minor conceptual advance in understanding TRIM32 function in myogenic differentiation. Its main limitation is that all experiments were performed in C2C12 cells. While C2C12 are a classical system to study muscle differentiation, they are an immortalized, long-cultured, and genetically unstable line that represents a committed myoblast stage rather than bona fide satellite cells. They therefore do not fully model the biology of early regenerative responses. Several TRIM32 phenotypes reported in the literature differ between primary satellite cells and cell lines, and the authors themselves note such discrepancies. Extrapolating these findings to LGMDR8 pathogenesis without validation in primary human myoblasts, satellite cell assays, or in vivo regeneration models is therefore not justified. Previous work has already established clear roles for TRIM32 in mouse satellite cells in vivo and in patient myoblasts in vitro, whereas this study introduces a novel link to c-Myc regulation during differentiation. In addition, without mechanistic evidence, the central claim that TRIM32 regulates c-Myc mRNA stability remains descriptive and incomplete. Nevertheless, the results will be of interest to researchers studying LGMD and to those exploring TRIM32 biology in broader contexts. I review this manuscript as a muscle biologist with expertise in satellite cell biology and transcriptional regulation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, the authors sought to investigate the molecular role of Trim32, a tripartite motif-containing E3 ubiquitin ligase whose dysregulation is associated with Limb-Girdle Muscular Dystrophy Recessive 8 (LGMDR8), and its role in the dynamics of skeletal muscle differentiation. Using a CRISPR-Cas9 model of Trim32 knockout in C2C12 murine myoblasts, the authors demonstrate that loss of Trim32 alters the myogenic process, particularly by impairing the transition from proliferation to differentiation. The authors provide evidence in the form of transcriptomic profiling that shows an alteration of myogenic signaling in the Trim32 KO cells, leading to a disruption of myotube formation in vitro. Interestingly, while previous studies have focused on Trim32's role in protein ubiquitination and degradation of c-Myc, the authors provide evidence that Trim32 regulation of c-Myc occurs at the level of mRNA stability. The authors show that the sustained c-Myc expression in Trim32 knockout cells disrupts the timely expression of key myogenic factors and interferes with the critical withdrawal of myoblasts from the cell cycle required for myotube formation. Overall, the study offers a new insight into how Trim32 regulates early myogenic progression and highlights a potential therapeutic target for addressing the defects in muscular regeneration observed in LGMDR8.

      Major Comments:

      The work is a bit incremental based on this: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030445 And this: https://www.nature.com/articles/s41418-018-0129-0 To their credit, the authors do cite the above papers.

      The authors do provide compelling evidence that Trim32 deficiency disrupts C2C12 myogenic differentiation and that sustained c-Myc expression contributes to this defective process. However, while knockdown of c-Myc does restore Myogenin levels, it was not sufficient to normalize myotube morphology or differentiation index, suggesting an incomplete picture of the Trim32-dependent pathways involved. The authors should qualify their claim by emphasizing that c-Myc regulation is a major, but not exclusive, mechanism underlying the observed defects. This will prevent overgeneralization and better align the conclusions with the authors' data. The authors provide a thorough and well-executed interrogation of cell cycle dynamics in Trim32 KO clones, combining phospho-histone H3 staining, flow cytometry of DNA content, and CFSE proliferation assays. These complementary approaches convincingly show that, while proliferation states remain similar in WT and KO cells, Trim32-deficient myoblasts fail to withdraw normally from the cell cycle during exposure to differentiation-inducing conditions. This work adds clarity to a previously inconsistent literature and greatly strengthens the study.

      The transcriptomic analysis (detailed in the "Transcriptomic analysis of Trim32 WT and KO clones along early differentiation" section of Results) is central to the manuscript and provides strong evidence that Trim32 deficiency disrupts normal differentiation processes. However, the description of the pathway enrichment results is highly detailed and somewhat compressed, which may make it challenging for readers to follow the key biological 'take-homes'. The narrative quickly moves across multiple analyses, such as MDS, clustering, heatmaps, and bubble plots, without pausing to guide the reader through what each analysis contributes to the overall biological interpretation. As a result, the key findings (reduced muscle development pathways in KO cells and enrichment of cell cycle-related pathways) can feel somewhat muted. The authors may consider reorganizing this section so that the primary biological insights are highlighted and supported by each of their analyses. This would make the biological implications more accessible to a broader readership.

      The work would be greatly strengthened by the inclusion of LGMDR8 primary cells, and rescue experiments of TRIM32 to explore myogenesis. Also, EU (5-ethynyl uridine) pulse-chase experiments to label nascent and stable RNA coupled with MYC pulldowns and qPCR (or RNA-sequencing of both pools) would further enhance the claim that MYC stability is being affected.

      "On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025)." Also address and discuss the following, as what is currently written is not entirely accurate: https://www.embopress.org/doi/full/10.1038/s44319-024-00299-z and https://journals.physiology.org/doi/prev/20250724-aop/abs/10.1152/ajpcell.00528.2025

      Minor Comments:

      Z-score scale used in the pathway bubble plot (Figure 2C) could benefit from alternative color choices. Current gradient is a bit muddy and clarity for the reader could be improved by more distinct color options, particularly in the transition from positive to negative Z-score.

      Clarification on the rationale for selecting the "top 18" pathways would be helpful, as it is not clear if this cutoff was chosen arbitrarily or reflects a specific statistical or biological threshold.

      The authors alternate between using "Trim 32 KO clones" and "KO clones" throughout the manuscript. Consistent terminology across figures and text would improve readability.

      Cell culture methodology does not specify passage number or culture duration (only "At confluence") before differentiation. This is important, as C2C12 differentiation potential can drift with extended passaging.

      Significance

      General Assessment:

      This study provides a thorough investigation of Trim32's role in the processes related to skeletal muscle differentiation using a CRISPR-Cas9 knockout C2C12 model. The strengths of this study lie in the multi-layered experimental approach, as the authors incorporated transcriptomics, cell cycle profiling, and stability assays, which collectively build a strong case for their hypothesis that Trim32 is a key factor in the normal regulation of myogenesis. The work is also strengthened by the use of multiple biological and technical replicates, particularly the independent KO clones, which help address potential clonal variation issues. The largest limitation of this study is that, while the c-Myc mechanism is well explored, the other Trim32-dependent pathways associated with the disruption (implicated by the incomplete rescue by c-Myc knockdown) are not as well addressed. Overall, however, the study convincingly identifies a critical function for Trim32 during skeletal muscle differentiation.

      Advance:

      To my knowledge, this is the first study to demonstrate regulation of c-Myc by Trim32 at the level of mRNA stability, rather than through ubiquitin-mediated protein degradation. This work will advance the current understanding and provide a more complete picture of Trim32's role in c-Myc regulation. Beyond c-Myc, this work highlights the idea that TRIM family proteins can influence RNA stability, which could implicate a broader role in RNA biology and has potential for future therapeutic targeting.

      Audience:

      This research will be of interest to an audience that focuses on broad skeletal muscle biology but primarily to readers with more focused research such as myogenesis and neuromuscular disease (LGMDR8 in particular) where the defined Trim32 governance over early differentiation checkpoints will be of interest. It will also provide mechanistic insights to those outside of skeletal muscle that study TRIM family proteins, ubiquitin biology, and RNA regulation. For translational/clinical researchers, it identifies the Trim32/c-Myc axis as a potential therapeutic target for LGMDR8 and related muscular dystrophies.

      Expertise:

      My expertise lies in skeletal muscle biology, gene editing, transgenic mouse models, and bioinformatics. I feel confident evaluating the data and conclusions as presented.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Xiong and colleagues investigate the mechanisms operating downstream of TRIM32 and controlling myogenic progression from proliferation to differentiation. Overall, the bulk of the data presented is robust. Although further investigation of specific aspects would make the conclusions more definitive (see below), it is an interesting contribution for scientists studying the molecular basis of muscle diseases. In my opinion, a few aspects would improve the manuscript.

      Firstly, the conclusion that Trim32 regulates c-Myc mRNA stability could be expanded and corroborated by further mechanistic studies:

      1. Studies investigating whether Trim32 binds directly to c-Myc RNA. Moreover, although possibly beyond the scope of this study, an unbiased screening of RNA species binding to Trim32 would be informative.
      2. If possible, studies in which the overexpression of different mutants presenting specific altered functional domains (NHL domain known to bind RNAs and Ring domain reportedly involved in protein ubiquitination) would be used to test if they are capable or incapable of rescuing the reported alteration of Trim32 KO cell lines in c-Myc expression and muscle maturation. An optional aspect that might be interesting to explore is whether the alterations in c-Myc expression observed in C2C12 might be replicated with primary myoblasts or satellite cells devoid of Trim32.

      I also have a few minor points to highlight:

      • It is unclear if the differences highlighted in graphs 5G, EV5D, and EV5E are statistically significant.
      • On page 10, it is stated that c-Myc down-regulation cannot rescue KO myotube morphology fully nor increase the differentiation index significantly, but the corresponding data is not shown. Could the authors include those quantifications in the manuscript?

      Significance

      The manuscript offers several strengths. It provides novel mechanistic insight by identifying a previously unrecognized role for Trim32 in regulating c-Myc mRNA stability during the onset of myogenic differentiation. The study is supported by a robust methodology that integrates CRISPR/Cas9 gene editing, transcriptomic profiling, flow cytometry, biochemical assays, and rescue experiments using siRNA knockdown. Furthermore, the work has disease relevance, as it uncovers a mechanistic link between Trim32 deficiency and impaired myogenesis, with implications for the pathogenesis of LGMDR8. At the same time, the study has some limitations. The findings rely exclusively on the C2C12 myoblast cell line, which may not fully represent primary satellite cell or in vivo biology. The functional rescue achieved through c-Myc knockdown is only partial, restoring Myogenin expression but not the full differentiation index or morphology, indicating that additional mechanisms are likely involved. Although evidence supports a role for Trim32 in mRNA destabilization, the precise molecular partners (such as RNA-binding activity, microRNA involvement, or ligase function) remain undefined. Some discrepancies with previous studies, including Trim32-mediated protein degradation of c-Myc, are acknowledged but not experimentally resolved. Moreover, functional validation in animal models or patient-derived cells is currently lacking.

      Despite these limitations, the study represents an advancement for the field. It shifts the conceptual framework from Trim32's canonical role in protein ubiquitination to a novel function in RNA regulation during myogenesis. It also raises potential clinical implications by suggesting that targeting the Trim32-c-Myc axis, or modulating c-Myc stability, may represent a therapeutic strategy for LGMDR8. This work will be of particular interest to muscle biology researchers studying myogenesis and the molecular basis of muscle disease, RNA biology specialists investigating post-transcriptional regulation and mRNA stability, and neuromuscular disease researchers and clinicians seeking to identify new molecular targets for therapeutic intervention in LGMDR8.

      The Reviewer expressing this opinion is an expert in muscle stem cells, muscle regeneration, and muscle development.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Shukla et al. describes the "chromatin states" in the bryophyte Marchantia polymorpha and compares them with those in Arabidopsis thaliana. They describe the generally common features of chromatin states between these evolutionarily distant plant species, but they also find some differences. The authors also studied the connection between chromatin states and TF binding, mostly in Arabidopsis due to the scarcity of TF binding data in Marchantia. Their analyses lead to the interesting finding that specific transcription factor families tend to associate with specific chromatin states, which tend to associate with specific genomic regions such as promoter, TSS, gene body, and facultative heterochromatin. Overall, the authors provide a novel piece of information regarding the evolutionary conservation of chromatin states and the relationship between chromatin states and TFs.

      Major comments:

      1. At the end of the abstract they state "The association with the +1 nucleosome defines a list of candidate pioneer factors we know little about in plants", which is one of their major points. This is based on the results in Fig. 4F and 4G, described in P27 L16-17. The question is: are cluster 1 TFs really associated with the +1 nucleosome? From Fig. 1C, the +1 nucleosome is characterized mostly by the E1 state and also by E2, F3, F4. However, from Fig. 4F, cluster 1 TFs are not associated with E1/E2, and the association is not particularly strong for F3/F4. Indeed, association with E1/E2 is much more conspicuous for cluster 4 TFs. Therefore, the authors should reconsider this point and consider rephrasing or showing further results of analyses.

      2. P17 last line to P18, they state "The facultative heterochromatin states were primarily associated with the intergenic states I1 to I3, based on their enrichment in H3K27me3 and H2AK121ub, low accessibility, and low gene expression". I'm not sure about this statement. How can they say "primarily associated" from the data they cite? As far as the PTM and variant patterns are concerned, I1 to I3 and facultative heterochromatin look different. The authors should explain more or rephrase.

      3. P20 L15, the authors state "Contrary to Arabidopsis, the promoters of Marchantia defined by the region just upstream of the TSS showed enrichment of H2AUb and the elongation mark H3K36me3, along with other euchromatic marks. " I have a concern that the TSS annotation could be inaccurate in Marchantia compared to more rigorously tested annotation of Arabidopsis thaliana, so that the relationship between TSS and histone PTMs could be different between species. The authors should make sure this is not the case.

      4. P21 last line to P22, they analyzed only H3K27me3 and H2Aub in the mutants of E(z) (Fig. 2E) and state that "we analyzed chromatin landscape in the Marchantia...". Is analyzing two histone marks enough to say "chromatin landscape"? In addition, they state "These findings suggest a strong independence of the two Polycomb repressive pathways in Marchantia." However, they did not analyze the effect of loss of PRC1 on H3K27me3, i.e., the opposite direction. Actually, in Arabidopsis loss of PRC1 causes loss of H2Aub AND H3K27me3 (Zhou et al (2017) Genome Biol: DOI 10.1186/s13059-017-1197-z).

      5. Related to the above comments, they state "To further compare the regulation by PRC2 in both species,". However, they did not describe what is known about regulation by PRC2 in Arabidopsis. They should consider describing it.

      6. P25 L14: "With this method to estimate TF activity, the scores of TF occupancy and activity converged. To look at different patterns of chromatin preferences among TFs, we kept ChIP-seq and DAP-seq data for ~300 TFs in Arabidopsis (after filtering out TFs with low scores of occupancy and activity)." This part is a little hard to follow. Perhaps better to explain in more detail.

      7. In the discussion section, P30 L19-21: "This could be due to open chromatin, which is associated with highly expressed genes and permissive for TF binding, generating highly occupied target regions (HOT) with redundant or passive activity (19)." This part needs further explanation; especially for the latter part, it's not clear what the authors claim.

      Minor comments:

      1. P17 L21: H2bUb should be H2Bub.

      2. Legend of Fig. 4D: later should be latter.

      3. Legend of Fig. 4G and H: "clusters defined in figure-H" should be "defined in Fig. 4F"?

      Referee cross-commenting

      Reviewer #1 raises thorough and important points that should be addressed before the manuscript is published. Particularly regarding the comparison of chromatin states between Arabidopsis and Marchantia, as this paper will lay a foundation for further research in the future and serve as a resource for the community, the authors should thoroughly look into the points raised by reviewer #1, including the annotation of transcriptional units.

      Significance

      Strength and limitation: The strength of this paper is the insight into chromatin-based transcriptional regulation gained by defining chromatin states using a combination of many epigenome data sets and comparing them with TF binding data. A limitation is the lack of experimental support for their interesting claims, for example by perturbing histone PTMs. Another limitation is that comparing only two species can yield only subjective judgments of "similar" or "different" between species.

      Advance comparing past literature: One clear advance is studying chromatin states in a plant other than Arabidopsis thaliana. Another one is revealing that TFs can be classified into a number of groups according to the relationships with chromatin-based transcription regulation. However, experimental tests for these are awaited.

      Audience: Epigenetics, chromatin, and transcription researchers, plant biologists interested in transcriptional regulation.

      My expertise: Epigenome, genetics, histone PTMs, plants

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors characterize chromatin states in the flowering plant Arabidopsis thaliana and the bryophyte Marchantia polymorpha. Here, they draw from ChIP-seq data that was previously published, and from data generated as part of this study, in particular for Marchantia H2A variants (H2A.X.1, H2A.X.1, H2A.Z, H2A.M.2). The authors compute chromatin states, which enables a comparison over more than 450 million years of land plant evolution. While comparisons of plant chromatin to other species highlighted conservation as well as differences, this study targets a knowledge gap of evaluating chromatin conservation during land plant evolution. The authors investigate a connection between transcription factor binding sites and chromatin states. They propose a list of candidate pioneer factors associating with the +1 nucleosome.

      Major comments:

      • For the association of chromatin states with expression, the authors use the TAIR10 annotation for extracting TSSs and promoter sequences. When investigated, a comparison of data resolving TSS with this annotation (or Araport11) shows a pretty poor overlap between the TSS based on TAIR10/Araport11 and experimentally derived TSSs. This information was captured in Arabidopsis genome annotation files where the experimental TSS matches the genome annotation. What is the advantage of using an annotation with the inaccurate TSSs in TAIR10? It seems to confound the study.

      • The TSS annotation in Marchantia polymorpha (Tak1 v7.1) may also match poorly to the experimentally derived TSS. I suggest that the authors generate data to detect TSS in their tissue of choice and compare the positions to the genome annotation they use (e.g. PMID: 38831668).

      • I am not convinced that it is a wise choice to utilize fewer ChIP-seq data in Marchantia than Arabidopsis. Can the missing Marchantia ChIP-seq experiments not be performed and included to complete the comparison?

      • P. 26 onwards, the authors investigate different TF clusters and their association with chromatin states. They state "cluster 1 TFs primarily associated with the first nucleosome downstream of the TSS". However, if the gene is not really expressed in these "leaf" tissues, then how can the authors be sure that the same TSS position would be used in "flower" tissue? It could be an artifact of a genome annotation file that misses flower-tissue TSS data. It is not an obvious conclusion to name these factors "pioneer TFs". Experiments testing this are missing as far as I can gather.

      Minor comments:

      • Can the authors add files (e.g. .bed) with their segmented chromatin states as part of their GEO submission? That could improve the impact and make the findings more accessible.

      • Can the authors rule out issues with the Marchantia annotation, for example missing read-through transcription or alternative isoforms, that would essentially have the effect that the genomic segmentation they use contains elongating upstream transcripts in front of the promoter TSS? This could be an alternative explanation for the enrichment of H2AUb/H3K36me3 just upstream of the TSSs as they describe on p.21. If it can't be ruled out, the limitations from genome annotations, and examples offering improvements, could be highlighted in the discussion. This may also be supported by the long persistence of E4 after the TTS p.23.

      • P.23 - This further suggests that in Marchantia, the orientation of genes defines distinct chromatin environment in their vicinity, through mechanisms yet to be uncovered. Does this correlate with the distance of the closest (annotated) transcript pairs?

      • The E1 state highlighted on p.24 and in Fig.3A/D is not annotated in Fig.3A/D. It is also not clear in the legends which number it is.

      • P.30 - The marks H3K4me1 and H3K36me3 reflecting transcriptional elongation and confined to the gene bodies in Arabidopsis, extend beyond the TTS in Marchantia, suggesting that signals for transcriptional termination differ between flowering plants and bryophytes. There are multiple alternative explanations. Likely a combination of missing transcripts in their genome annotation (e.g. lncRNAs), annotation errors (e.g. wrong ends) and the segmentation of these regions (e.g. the transcripts are closer than in Arabidopsis). The discussion could be extended significantly to address these issues and include the efforts to improve the genome annotations.

      Referee cross-commenting

      Reviewer #2 raises fair and valuable questions.

      Significance

      Significance: The authors corroborate prior chromatin state analyses in Arabidopsis and provide a chromatin state analysis for Marchantia. These data represent a resource that will be used and appreciated by the plant and ChromEvoDevo communities. The quality of the analyses is high and the description is transparent. I am not aware of a similar study comparing bryophytes and a land plant, so this study addresses a gap in knowledge.

      General assessment: The quality of the manuscript is high. The analyses are described well, and in sufficient detail to be understood. The effort going into documentation is high, I rate the study as reproducible. The linked github deposition looks good. The data generated as part of this study is available in the linked GEO deposition. An experimental design of 2 biological repeats is used, which is OK, but the lower limit. The GEO-deposited .bw files should be of interest to the ChromEvoDevo community, and researchers interested in Marchantia epigenetics and gene expression. The manuscript is written clearly and to the point. The figures condense a lot of data and match the text. The figures are rather complex and not easily accessible to someone browsing through a journal issue. However, that is fine for these types of papers. The manuscript is strong on data analysis. Other approaches, for example mutants to validate their hypothesis, are not utilized. The calculation of chromatin states offers a way to condense complex information into simpler terms. Nevertheless, it re-organizes information that largely existed before. To me, the biggest value of this study appears to be to regard it as a resource that calculated the chromatin states in a comparable fashion between organisms.

      Advance: The manuscript provides several advances. It provides new ChIP-seq data for Marchantia, it generates a chromatin state map for Marchantia, it compares chromatin state maps across a large evolutionary distance, and it generates a new hypothesis regarding pioneer TFs in plants. Some of the points described in the article hold true for even larger evolutionary distances, for example comparing plants to yeast and metazoans. The manuscript fills a knowledge gap and offers a comparison via the computation of comparable chromatin states.

      Audience: The audience will be colleagues interested in chromatin and epigenetics, the Marchantia and plant communities as well as researchers interested in EvoDevo of chromatin organization. Even though the study uses plant models, it is highly relevant for non-plant models.

  2. Oct 2025
    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In the manuscript "Nucleosome positioning shapes cryptic antisense transcription", Kok and colleagues perform a characterization of nucleosome remodeling factors in S. pombe by assaying the impact of their deletion on antisense transcription and nucleosome organization. They find that deletion of Hrp3 leads to up-regulation of antisense RNA transcripts as well as disruption of phased nucleosomes in gene bodies. The authors then establish a catalogue of antisense transcripts in S. pombe using long read RNA sequencing, which they use to analyze the relationship between nucleosome positioning and antisense transcription. Through this analysis, they associate nucleosome positioning with the initiation of antisense transcription and conclude that nucleosome positioning within gene bodies represses cryptic antisense transcription. They further support this observation by showing that the up-regulated genes in the Hrp3 knock-out are enriched for genes usually expressed in meiosis, which in S. pombe often occur as nested transcripts in reverse orientation. Using growth assays under various stress conditions, the authors narrow down the domain responsible for the phenotype to the C-terminal CHCT domain. To address how Hrp3 gains specificity, they perform an in-silico interaction prediction screen to identify Prf1 as a putative interactor of the CHCT domain. Using recombinant expression in bacteria followed by pulldowns from lysates, they confirm the interaction and introduce point mutants that abolish the interaction. The authors then link the interaction with Prf1 to transcriptional elongation, where they observe a correlation between Hrp3 presence and chromatin marks of transcription elongation, especially H2BK119ub, which is also reduced in the Hrp3 knockout. They further demonstrate that both gene body nucleosome phasing and antisense transcription are similarly affected in the prf1 knockout as well as the hrp1-hrp3-prf1 triple knock-out cells, which indicates that they affect the same pathway.

      Major comments:

      The manuscript is well-written and the claims are generally supported by the data. The authors demonstrate scientific rigor through comprehensive experiments using single and double knockouts. I have three main comments that can be addressed through additional analysis and limited experimentation:

      1. The authors use the terms "Prf1" and "Paf1 complex" interchangeably multiple times in the manuscript (eg. Line 296). However, the experimental data presented only demonstrate a connection between Prf1 and Hrp3. Furthermore, published literature establishes that Prf1 and Paf1 represent distinct entities in S. pombe (Mbogning et al., 2013, PLoS Genetics 9(3): e1004029). The authors should clarify this distinction and use consistent, accurate terminology throughout the text. Reference: Mbogning, J., et al. (2013). The PAF Complex and Prf1/Rtf1 Delineate Distinct Cdk9-Dependent Pathways Regulating Transcription Elongation in Fission Yeast. PLoS Genetics, 9(3), e1004029. https://doi.org/10.1371/journal.pgen.1004029

      2. The authors demonstrate that Hrp3 limits antisense promoter usage; however, the analysis lacks characterization of sequence composition, promoter classes (TATA-box versus TATA-less), or identification of enriched transcription factor motifs near these sites. A more thorough bioinformatic analysis would strengthen the paper and potentially reveal interesting biology, as the effect may be specific to certain transcription factors or promoter architectures.

      3. The Hrp3-Prf1 interaction is demonstrated solely through recombinant overexpression and pulldown assays, which carries the risk of detecting non-physiological interactions. While the authors use mutations to verify pulldown specificity, in vivo evidence for this interaction is absent. Given that the authors cite a recent preprint demonstrating sophisticated techniques to show S. cerevisiae Chd1-Prf1 interactions, I presume standard approaches such as co-immunoprecipitation followed by mass spectrometry or Western blot were attempted. Even negative results from such experiments should be reported, as readers will likely question the physiological relevance of the interaction. Additionally, establishing the hierarchy between Hrp3, Prf1, and H2BK119Ub is crucial. While the authors show that Hrp3 ChIP-seq signal correlates with gene expression levels, the proposed Prf1-Hrp3 interaction raises questions about recruitment specificity and hierarchy. The authors mention in lines 344-345: "...the CHCT domain of Hrp3 is critical for its association with transcription elongation along the gene body..." which requires support from experimental data. Testing Hrp3 ChIP-seq in Prf1-depleted conditions would clarify how specificity is achieved and substantiate the functional importance of this interaction. As the authors have all the required strains I would estimate around 1.5-2 months for data generation and analysis.

      4. [Optional] Based on structure predictions, the authors suggest that the interaction of CHD1 and RTF1 is conserved in Arabidopsis and mouse. This should be further supported by pulldown assays, and the pre-print (Reference nr. 99) should also be cited, as they show similar results using yeast two-hybrid assays.

      Minor comments:

      1. Figure 1B: Grouping individual panels according to different paralog groups would make the figure more accessible.

      2. Figure 1D: The display of antisense transcription is not accessible. Perhaps boxplots, like those in Figures 2B and 5D, would be easier to read.

      3. Line 335: The transition is abrupt and would benefit from additional explanation. Why do the authors use Rtf1 instead of Prf1 here? Consistent nomenclature would improve clarity.

      4. Line 352: For the phrase "significant loss," please provide a statistical test or omit the word "significant."

      5. Figure 7F: The model presented in panel F suggests that there are two parallel routes that lead to nucleosome phasing; however, the authors state in the text (lines 363-364): "further supporting the idea that Hrp3 and Prf1 act together in the same pathway to control antisense transcription." The model and the text should align better.

      Significance

      • In the study, the authors establish Hrp3, one of the fission yeast CHD1 remodelers, as a crucial regulator of antisense transcription within gene bodies, which they link to both fitness penalties and the regulation of genes typically expressed during meiosis. They further link the recruitment of Hrp3 at gene bodies to transcriptional elongation, which provides an interesting model for how antisense transcription is prevented in actively transcribed regions of the genome.

      • The study is overall very well executed and controlled and provides strong evidence for connecting Hrp3 with the repression of antisense transcription using adequate experiments and technologies. This provides novel insights into a widespread phenomenon present in many organisms. A point that needs further improvement is the suggested physical link between Hrp3 and Prf1. Despite potentially being challenging to address using molecular biology techniques, the authors can further improve the study by dissecting the genetic hierarchy of Hrp3 and Prf1 using accessible tools. This study will be of interest to a broad audience in basic research as it addresses the broad question of how antisense transcription is repressed and provides mechanistic insights into this process. Consequently, this study will be relevant for the broader field of transcriptional regulation and could provide entry points for studying the role of CHD remodelers in other organisms.

      • Field of expertise: chromatin biology, small RNA mediated heterochromatin formation

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Kok et al. report on the role of the chromatin remodelers Hrp1 and Hrp3 in maintaining nucleosome positioning and preventing antisense transcription in Schizosaccharomyces pombe. As commented below, the main criticism of the manuscript is that the first half describes results that are very similar to those already reported by several other laboratories. Therefore, the main novel aspect of the work is the interaction between Hrp3 and the Prf1 subunit of the PAF complex.

      Specific points:

      1. The articles of Hennig et al. (2012), Pointner et al. (2012) and Shim et al. (2012) are cited in the manuscript (line 119, Refs. 61-63) only as a confirmation of the minor effect of the absence of Hrp1 on nucleosome positioning and antisense expression. However, these three articles reached the same conclusion as Kok et al. that the absence of Hrp3 in S. pombe causes severe, genome-wide loss of nucleosome positioning and overexpression of antisense transcripts, whereas the absence of Hrp1 has a much weaker effect. These results were also discussed in a short review article (Touat-Todeschini et al. EMBO J. 2012. 31: 4371). Although Kok et al. analysed transcription at a higher resolution and mapped transcription initiation using Pro-Seq (Figures 1, 2 and 3), their results do not add much to what was already reported in these previous studies.

      2. Several sites in the manuscript state that Hrp3 belongs to the SWI/SNF family of chromatin remodelers (for example, line 92). However, Hrp3 is a member of the CHD family, whose members have a very different structure and function (see, for example, Clapier et al. 2017. Nat Rev Mol Cell Biol 18: 407; Paliwal et al. 2024 TIGs 41:236).

      3. The authors should indicate where the nucleosome remodelling activity of some of the proteins in Figure 1A (like Irc20, Rrp1, Rrp2 and Mot1) has been reported.

      4. The analysis of nucleosome positioning by aggregating thousands of genes, such as those shown in Figure 1B, has low resolution and can only detect gross alterations affecting many genes. Nevertheless, several mutants, such as swr1∆ and rrp1∆, also exhibit altered nucleosomal profiles in Figure 1B. In other cases, the occupancy of the first and second nucleosomes after the TSS is reduced relative to the wild type. Therefore, it cannot be concluded that "nucleosome arrays in wild type and most remodeller mutant cells were highly ordered and regular" (line 105).

      5. Although it was previously reported that hrp3∆ mutants overexpress antisense transcripts (see point 1 above), it is unclear how this finding is represented in Figure 1D. Similarly, it is also unclear why antisense transcription is undetectable in hrp1∆ relative to WT in Figure 1D, yet significantly higher than in WT in Figures 2B, 3A and 3B. Furthermore, sense transcription in the single and double mutants is comparable to WT in Figure 2A, yet much higher in Figure S3B.

      6. Figure S3C claims that antisense transcription is higher in genes with greater nucleosome disruption in the double mutant hrp1∆hrp3∆. However, without a quantitative analysis, it is difficult to discern any significant differences in the degree of disruption across the four quartiles of antisense expression.

      7. Figures 3D and S4C show that the TSS of antisense transcription colocalizes with a region resistant to MNase that is at least 300 bp wide. This size does not correspond to that occupied by a nucleosome and contrasts with the expected size of the four nucleosome peaks downstream from it.

      8. In relation to the previous point, Figure S4C (bottom) shows that the centre of the region above the TSS is slightly displaced in the three mutants. This displacement corresponds to an increase in the G+C content of approximately 1.5% (Figure S4C top), equivalent to an increase of less than 2.5 Gs and Cs every 150 bp of nucleosomal DNA. Without some cause and effect experiments, it is difficult to attribute a functional significance to such a tiny difference. How reproducible is this difference across biological replicates?

      9. The authors should also explain how the position of the dyads was estimated in the double mutant hrp1∆hrp3∆ in Figure S4B. The severe loss of nucleosomal positioning suggests that the dyads occupy different positions in different cells within the same population. While most of the remaining figures show data for the three mutants, this figure shows results for the double hrp1∆hrp3∆ mutant only.

      10. Figures 3G and 3H show the analysis of the promoter activity of some regions upstream from antisense transcripts, achieved by replacing the endogenous ura4 gene promoter with these regions. This analysis lacks negative controls showing the level of transcription in the recipient strain following the removal of the endogenous ura4 promoter and its replacement with genomic regions not associated with the initiation of antisense transcription in the mutants. Furthermore, transcription should be measured by quantitative PCR of the ura4 mRNA rather than by the more indirect method of measuring OD600 in 384-well plates (line 708).

      11. Figure F4 suggests that Hrp3 may regulate the expression of genes specific to meiosis by showing an anticorrelation between the expression levels of Hrp3 and a selection of genes that are upregulated during meiosis (MUGs) 5 hours after the onset of meiosis. While this is an interesting possibility, it will remain speculative until it is demonstrated that the level of Hrp3 protein is reduced at the same stage of meiosis, and that MUG overexpression is associated with reduced nucleosomal occupancy adjacent to their TSS at that stage.

      12. The experiments in Figures 5 and 6, which describe the interaction between the Hrp3-specific CHCT domain and the Prf1 protein, are interesting and represent the main element of novelty of the manuscript. However, the interaction shown in Figures 6D and 6E should be confirmed in vivo.

      13. Kok et al. indicate that the triple prf1∆ hrp1∆ hrp3∆ mutant exhibits stronger growth defects than the single prf1∆ mutant. However, Figure S9F shows that no growth is detectable in the single prf1∆ mutant, a phenotype that cannot be exacerbated in the triple mutant. Perhaps the use of a prf1 mutant showing a less severe phenotype might help.
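
      The arithmetic behind point 8 can be checked directly. The short Python sketch below uses only the two figures quoted in that point (a G+C increase of approximately 1.5% and 150 bp of nucleosomal DNA); the variable names are illustrative and not taken from the manuscript.

```python
# Figures quoted in point 8 of the report; names are illustrative only.
gc_increase_percent = 1.5   # approximate increase in G+C content (Figure S4C, top)
nucleosome_bp = 150         # length of nucleosomal DNA considered by the referee

# Expected number of additional G or C bases per nucleosome-length window.
extra_gc_bases = gc_increase_percent * nucleosome_bp / 100

print(f"Extra G/C bases per {nucleosome_bp} bp: {extra_gc_bases}")
```

      The result is consistent with the referee's statement that the shift amounts to fewer than 2.5 Gs and Cs per nucleosome.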

      Significance

      As indicated in point 1, the first half of the manuscript describes results that are very similar to those already reported in the literature.

      The interaction between Hrp3 and the Prf1 subunit is new and interesting, and could lead to further research and a new manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is an excellent study that leverages the chromatin biology of Schizosaccharomyces pombe to uncover the central role of CHD1-family remodelers in maintaining nucleosome organisation and suppressing cryptic transcription. The work is carefully executed. In short, the authors show that Hrp3 is the primary CHD1-family remodeler responsible for maintaining nucleosome organisation over gene bodies. This represses antisense transcription from cryptic promoters in gene bodies. They provide evidence that Hrp3 is repressed in meiosis to allow the induction of meiotic genes. They further identified that a conserved domain, the CHCT domain of Hrp3, is essential for its interaction with Prf1 (PAF complex subunit), which is critical for the chromatin organisation in gene bodies. This manuscript is of excellent quality and is an important contribution towards understanding how transcription initiation is repressed within gene bodies. I have small comments and suggestions for clarification.

      Minor comments:

      • The study demonstrates that Hrp3 represses antisense transcription at meiotic genes, showing that Hrp3 is reduced in meiosis, which could facilitate the induction of meiotic genes. Is there a phenotype in the hrp3Δ or the hrp1Δ hrp3Δ mutant in relation to meiosis? E.g. do these strains enter meiosis uncontrolled?

      • Figure 3C - ORC4 Locus TSS presentation. The presented data do not show a well-defined TSS on the sense strand. For reference, it would be useful to show that sense TSS is not altered between the different strains.

      • The study focuses on antisense cryptic transcription, which is relatively easy to measure by RNA-seq. Often, however, cryptic transcription can also occur in the sense direction in gene bodies. Do the authors also find evidence of cryptic sense transcription in gene bodies (based on TSS-seq data)? This could be useful for completeness to report, as this could lead to aberrant protein-coding isoforms.

      • The manuscript alternates between "Prf1" (S. pombe) and "RTF1" (other eukaryotes). This is at times confusing. I recommend consistent use of gene nomenclature.

      • The authors show epistatic interaction for nucleosome spacing in Figure 7D for the prf1Δ and hrp1Δ hrp3Δ prf1Δ strains. It would be informative to have the hrp1Δ hrp3Δ data also included in Figure 7D, like in the other figure panels.

      Significance

      This is an excellent study that leverages the chromatin biology of Schizosaccharomyces pombe to uncover the central role of CHD1-family remodelers in maintaining nucleosome organisation and suppressing cryptic transcription. This manuscript is of excellent quality and is an important contribution towards understanding how transcription initiation is repressed within gene bodies.

      I am an expert on transcription regulation and noncoding transcription.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Statements

      We would like to thank the referees for their time and effort in giving feedback on our work, and their overall positive attitude towards the manuscript. Most of the referees' points were of clarifying and textual nature. We have identified three points which we think require more attention in the form of additional analyses, simulations or significant textual changes:

      Within the manuscript we state that conserved non coding sequences (CNSs) are a proxy for cis regulatory elements (CREs). We proceed to use these terms interchangeably without explaining the underlying assumption, which is inaccurate. To improve on this point we ensured in the new text that we are explicit about when we mean CNS or CRE. Secondly, we added a section to the discussion (‘Limitations of CNSs as CREs’) dedicated to this topic.

      During stabilising selection (maintaining the target phenotype) DSD can occur fully neutrally, or through the evolution of either mutational or developmental robustness. We describe the evolutionary trajectories of our simulations as neutral once fitness mostly plateaued; however, as reviewer 3 points out, small gains in median fitness still occur, indicating that either development becomes more robust to noisy gene expression and tissue variation, and/or the GRNs become more robust to mutations. To discern between fully neutral evolution, where the fitness distribution of the population does not change, and the higher-order emergence of robustness, we performed an additional analysis of the existing results. Preliminary results showed that many (near-)neutral mutations affect the mutational robustness and developmental robustness, both positively and negatively. To investigate this further we will run an additional set of simulations without developmental stochasticity, which will take about a week. These simulations should allow us to more closely examine the role of stabilising selection (of developmental robustness) in DSD by removing the need to evolve developmental robustness. Additionally, we will set up simulations in which we change the total number of genes and the number of genes under selection, to investigate how this modelling choice influences DSD.

      In the section on rewiring (‘Network redundancy creates space for rewiring’) we will analyse the mechanism allowing for rewiring in more depth, especially in the light of gene duplications and redundancy. We will extend this section with an additional analysis aimed to highlight how and when rewiring is facilitated.

      We will describe the planned and incorporated revisions in detail below; we believe these have led to a greatly improved manuscript.

      Kind regards,

      Pjotr van der Jagt, Steven Oud and Renske Vroomans

      Description of the planned revisions

      Referee cross commenting (Reviewer 4)

      Reviewer 3's concern about DSD resulting from stabilising selection for robustness is something I missed -- this is important and should be addressed.

      We understand this concern, and agree that we should be more thorough in our analysis of DSD by assessing the higher-order effects of stabilising selection on mutational robustness and/or environmental (developmental) robustness (McColgan & DiFrisco 2024).

      We will 1) extend our analysis of fitness under DSD by computing the mutational and developmental robustness (similar to Figure 2F) over time for a number of ancestral lineages. By comparing these two measures over evolutionary time we will gain a much more fine-grained picture of the evolutionary dynamics and should be able to find adaptive trends through gain of either type of robustness. Preliminary results suggest that during the plateaued fitness phase both mutational robustness and developmental robustness undergo weak gains and losses, likely due to the pleiotropic nature of our GPM. Collectively, these weak gains and losses result in the gain observed in Figure S3. So, rather than fully neutral evolution, we should discern (near-)neutral regimes in which clear adaptive steps are absent, but in which the sum of the steps is a net gain. These are interesting findings that we initially missed; they give insights into how this high-dimensional fitness landscape is traversed, and will be included in a future revised version of the manuscript.

      2) We will run extra simulations without stochasticity to investigate DSD in the absence of adaptation through developmental robustness, and include the comparison between these and our original simulations in a future revised version.

      Finally 3) we will address stabilising selection more prominently in the introduction and discussion to accommodate these additional simulations.

      Reviewer 3 suggests that the model construction may favor DSD because there are many genes (14) of which only two determine fitness. I agree that some discussion on this point is warranted, though I am not sure enough is known about "the possible difference in constraints between the model and real development" for such a discussion to be on firm biological footing. A genetic architecture commonly found in quantitative genetic studies is that a small number of genes have large effects on the phenotype/fitness, whereas a very large number of genes have effects that are individually small but collectively large (see, e.g. literature surrounding the "omnigenic model" of complex traits). Implementing such an architecture is probably beyond the scope of the study here. More generally, it would be natural to assume that the larger the number of genes, and the smaller the number of fitness-determining genes, the more likely DSD / re-wiring is to occur. That being said, I think the authors' choice of a 14-gene network is biologically defensible. It could be argued that the restriction of many modeling studies to small networks (often including just 3 genes) on the ground of convenience artificially ensures that DSD will not occur in these networks.

      The choice of 14 genes does indeed stem from a compromise between constraining the number of available genes, but at the same time allowing for sufficient degrees of freedom and redundancy. We have added a ‘modelling choices’ section in the discussion in which we address this point. Additionally, it is important to note that, while the fitness criterion only measures the pattern of 2 genes, throughout the evolutionary lineage additional genes become highly important for the fitness of an individual, because these genes evolved to help generate the target pattern (see for example Figure 4); the other genes indeed reflect reviewer 4’s point that most genes have a small effect. Crucially, we observe that even the genes and interactions that are important for fitness undergo DSD.

      Nevertheless, we think it is interesting to investigate this point of the influence of this particular modelling choice on the potential for DSD, and have set up an extra set of simulations with fewer gene types, and one with additional fitness genes.

      Furthermore, we discuss the choice of our network architecture more in depth in a discussion section on our modelling choices: ‘Modelling assumptions and choices’.

      Reviewer 1

      The observation of DSD in the computational models remains rather high-level in the sense that no motifs, mechanisms, subgraphs, mutations or specific dynamics are reported to be associated to it ---with the exception of gene expression domains overlapping. Perhaps the authors feel it is beyond this study, but a Results section with a more in-depth "mechanistic" analysis on what enables DSD would (a) make a better case for the extensive and expensive computational models and (b) would push this paper to a next level. As a starting point, it could be nice to check Ohno's intuition that gene duplications are a creative "force" in evolution. Are they drivers of DSD? Or are TFBS mutations responsible for the majority of cases?

      We agree that some mechanistic analysis would strengthen the manuscript, and will therefore extend the section ‘Network redundancy creates space for rewiring’ to address how this redundancy is facilitated. For instance, in the rewiring examples given in Figure 4 we can highlight how this new interaction emerges, if this is through a gene mutation followed by rewiring and loss of a redundant gene, or if the gain, redundancy and loss are all on the level of TFBS mutations. Effectively we will investigate which route of the three in the following schematic is most prominent:

      Additionally, we will analyse the different effects of the transcription dynamics for each of these routes. (Note that this schematic is not exhaustive, and combinations of routes are possible.)

      l171. You discuss an example here, would it be possible to generalize this analysis and quantify the amount of DSD amongst all cloned populations? And related question: of the many conserved interactions in Fig 4A, how many do the two clonal lineages share? None? All?

      We agree that this is a good idea. In a new supplementary figure, we will show the number of times a conserved interaction gets lost, and a new interaction is gained as a metric for DSD in every cloned population.

      The populations in Fig 4A are cloned at generation 50,000; any interaction that started before then and is still present at a given point in time is shared. Any interactions starting after generation 50,000 are unique (or at least independently gained).
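
      The counting described above (conserved interactions lost, new interactions gained, and interactions shared between clonal lineages) can be sketched in Python as follows. This is a minimal illustration, assuming that each GRN is represented as a set of (regulator, target) pairs; the gene labels, the example networks and the function names are hypothetical and do not come from the manuscript.

```python
# Minimal sketch: interactions are (regulator, target) pairs. All gene labels
# and example networks below are hypothetical, chosen only for illustration.

def turnover(at_cloning, later):
    """Count interactions retained, lost and gained since the cloning point."""
    return {
        "retained": len(at_cloning & later),
        "lost": len(at_cloning - later),
        "gained": len(later - at_cloning),
    }

def shared_inherited(at_cloning, lineage_a, lineage_b):
    """Interactions present at cloning and still present in both lineages."""
    return at_cloning & lineage_a & lineage_b

# Hypothetical network at the cloning generation and in two clonal lineages.
at_cloning = {("g1", "g12"), ("g3", "g12"), ("g5", "g13")}
clone_a = {("g1", "g12"), ("g5", "g13"), ("g7", "g12")}
clone_b = {("g1", "g12"), ("g3", "g12"), ("g9", "g13")}

turnover_a = turnover(at_cloning, clone_a)
turnover_b = turnover(at_cloning, clone_b)
inherited = shared_inherited(at_cloning, clone_a, clone_b)

print(turnover_a, turnover_b, inherited)
```

      Under this representation, interactions that arose after the cloning point appear in the "gained" count of one lineage only, unless they were gained independently in both.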

      - l269. What about phenotypic plasticity due to stochastic gene expression? Does it play a role in DSD in your model? I am thinking about https://pubmed.ncbi.nlm.nih.gov/24884746/ and https://pubmed.ncbi.nlm.nih.gov/21211007/

      We agree that this is an interesting point which should be included in the discussion. Following the comments of reviewer 3 we have set up extra simulations to investigate this in more detail, and we will include these citations in the revised discussion once we have the results of those simulations.

      Reviewer 3

      Issue One: Interpretation of fitness gains under stabilising selection

      A central issue concerns how the manuscript defines and interprets developmental systems drift (DSD) in relation to evolution on the fitness landscape. The authors define DSD as the conservation of a trait despite changes in its underlying genetic basis, which is consistent with the literature. However, the manuscript would benefit from clarifying the relationship between DSD, genotype-to-phenotype maps, and fitness landscapes. Very simply, we can say that (i) DSD can operate along neutral paths in the fitness landscape, (ii) DSD can operate along adaptive paths in the fitness landscape. During DSD, these neutral or adaptive paths along the fitness landscape are traversed by mutations that change the gene regulatory network (GRN) and consequent gene expression patterns whilst preserving the developmental outcome, i.e., the phenotype. While this connection between DSD and fitness landscapes is referenced in the introduction, it is not fully elaborated upon. A complete elaboration is critical because, when I read the manuscript, I got the impression that the manuscript claims that DSD is prevalent along neutral paths in the fitness landscape, not just adaptive ones. If I am wrong and this is not what the authors claim, it should be explicitly stated in the results and discussed. Nevertheless, claiming DSD operates along neutral paths is a much more interesting statement than claiming it operates along adaptive paths. However, it requires sufficient evidence, which I have an issue with.

      The issue I have is about adaptations under stabilising selection. Stabilising selection occurs when there is selection to preserve the developmental outcome. Stabilising selection is essential to the results because evolutionary change in the GRN under stabilising selection should be due to DSD, not adaptations that change the developmental outcome. To ensure that the populations are under stabilising selection, the authors perform clonal experiments for 100,000 generations for 8 already evolved populations, 5 clones for each population. They remove 10 out of 40 clones because the fitness increase is too large, indicating that the developmental outcome changes over the 100,000 generations. However, the remaining 30 clonal experiments exhibit small but continual fitness increases over 100,000 generations. The authors claim that the remaining 30 are predominantly evolving due to drift, not adaptations (in the main text, line 137: "indicating predominantly neutral evolution", and section M: "too shallow for selection to outweigh drift"). The authors' evidence for this claim is a mathematical analysis showing that the fitness gains are too small to be caused by beneficial adaptations, so evolution must be dominated by drift. I found this explanation strange, given that every clone unequivocally increases in fitness throughout the 100,000 generations, which suggests populations are adapting. Upon closer inspection of the mathematical analysis (section M), I believe it will miss many kinds of adaptations possible in their model, as I now describe.

      The mathematical analysis treats fitness as a constant, but it's a random variable in the computational model. Fitness is a random variable because gene transcription and protein translation are stochastic (Wiener terms in Eqs. (1)-(5)) and cell positions change for each individual (Methods C). So, for a genotype G, the realised fitness F is picked from a distribution with mean μ_G and higher order moments (e.g., variance) that determine the shape of the distribution. I think these assumptions lead to two problems.

      The first problem with the mathematical analysis is that F is replaced by an absolute number f_q, with beneficial mutations occurring in small increments denoted "a", representing an additive fitness advantage. The authors then take a time series of the median population fitness from their simulations and treat its slope as the individual's additive fitness advantage "a". The authors claim that drift dominates evolution because this slope is lower than a drift-selection barrier, which they derive from the mathematical analysis. This analysis ignores that the advantage "a" is a distribution, not a constant, which means that it does not pick up adaptations that change the shape of the distribution. Adaptations that change the shape of the distribution can be adaptations that increase robustness to stochasticity. Since there are multiple sources of noise in this model, I think it is highly likely that robustness to noise is selected for during these 100,000 generations.

      The second problem is that the mathematical analysis ignores traits that have higher-order effects on fitness. A trait has higher-order effects when it increases the fitness of the lineage (e.g., offspring) but not the parent. One possible trait that can evolve in this model with higher-order effects is mutational robustness, i.e., traits that lower the expected mutational load of descendants. Since many kinds of mutations occur in this model (Table 2), mutational robustness may also be evolving.

      Taken together, the analysis in Section M is set up to detect only immediate, deterministic additive gains in a single draw of fitness. It therefore cannot rule out weak but persistent adaptive evolution of robustness (to developmental noise and/or to mutations), and is thus insufficient evidence that DSD is occurring along neutral paths instead of adaptive paths. The small but monotonic fitness increases observed in all 40 clones are consistent with such adaptation (Fig. S3). The authors also acknowledge the evolution of robustness in lines 129-130 and 290-291, but the possibility of these adaptations driving DSD instead of neutral evolution is not discussed.
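
      The point that a summary based on median fitness can miss changes in the shape of the fitness distribution can be illustrated with a small Python sketch. It is not the authors' model or the analysis in Section M: it simply assumes, for illustration, that realised fitness is drawn from a normal distribution, and the means, standard deviations and the 0.8 threshold are hypothetical values.

```python
import random
import statistics

# Illustrative assumption: realised fitness F of a genotype is a normal draw
# with mean mu_G and a genotype-specific spread. All numbers are hypothetical.

def realised_fitness(mean, sd, n, rng):
    """Draw n realised fitness values for one genotype."""
    return [rng.gauss(mean, sd) for _ in range(n)]

def summarise(samples, threshold):
    """Median, spread, and fraction of individuals falling below a threshold."""
    return {
        "median": statistics.median(samples),
        "sd": statistics.stdev(samples),
        "frac_below": sum(s < threshold for s in samples) / len(samples),
    }

rng = random.Random(0)   # fixed seed so the sketch is repeatable
threshold = 0.8          # hypothetical fitness below which development "fails"

noisy = summarise(realised_fitness(1.0, 0.20, 20_000, rng), threshold)
robust = summarise(realised_fitness(1.0, 0.05, 20_000, rng), threshold)

print("noisy genotype: ", noisy)
print("robust genotype:", robust)
```

      In this toy setting the two genotypes have nearly identical median fitness, yet the more robust one produces far fewer low-fitness individuals, which is the kind of difference that a slope fitted to median fitness alone would not register.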

      To address the issue I have with adaptations during stabilising selection, the authors should, at a minimum, state clearly in their results that DSD is driven by both the evolution of robustness and drift. Moreover, a paragraph in the discussion should be dedicated to why this is the case, and why it is challenging to separate DSD through neutral evolution vs DSD through adaptations such as those that increase robustness.

      [OPTIONAL] A more thorough approach would be to make significant changes to the manuscript by giving sufficient evidence that the experimental clones are evolving by drift, or changing the model construction. One possible way to provide sufficient evidence is to improve the mathematical analysis. Another way is to show that the fitness distributions (both without and with mutations, like in Fig. 2F) do not significantly change throughout the 100,000 generations in experimental clones. It seems more likely that the model construction makes it difficult to separate the evolution of robustness from evolution by drift in the stabilising selection regime. Thus, I think the model should be constructed differently so that robustness against mutations and noise is much less likely to evolve after a "fitness plateau" is reached. This could be done by removing sources of noise from the model or reducing the kinds of possible mutations (related to issue two). In fact, I could not find justification in the manuscript for why these noise terms are included in the model, so I assume they are included for biological realism. If this is why noise is included, or if there is a separate reason why it is necessary, please write that in the model overview and/or the methods.

      We agree that we should be more precise about whether DSD operates along neutral vs adaptive paths in the fitness landscape, and have expanded our explanation of this distinction in the introduction. We also agree that it is worthwhile to distinguish between neutral evolution that does not change the fitness distribution of the population (either through changes in developmental or mutational robustness), higher-order evolutionary processes that increase developmental robustness, and drift along a neutral path in the fitness landscape towards regions of greater connectivity, resulting in mutational robustness (as described in Huynen et al., 1999). We have performed a preliminary analysis to identify changes in mutational robustness and developmental robustness over evolutionary time in the populations in which the maximum fitness has already plateaued. This analysis shows frequent weak gains and losses, in which clear adaptive steps are absent but a net gain in robustness can be seen, consistent with higher-order fitness effects.

      To investigate the role of stabilising selection more in depth we will run simulations without developmental noise in the form of gene expression noise and tissue connectivity variation, thus removing the effect of the evolution of developmental robustness. We will compare the evolutionary dynamics of the GRNs with our original set of simulations, and include both these types of analyses in a supplementary figure of the revised manuscript.

      Furthermore, we now discuss the limitations of the mathematical analysis with regard to adaptation vs neutrality in our simulations, in the supplementary section.

      Issue two: The model construction may favour DSD

      In this manuscript, fitness is determined by the expression pattern of two types of genes (genes 12 and 13 in Table 1). There are 14 types of genes in total that can all undergo many kinds of mutations, including duplications (Table 2). Thus, gene regulatory networks (GRNs) encoded by genomes in this model tend to contain large numbers of interactions. The results show that most of these interactions have minimal effect on reaching the target pattern in high fitness individuals (e.g. Fig. 2F). A consequence of this is that only a minimal number of GRN interactions are conserved through evolution (e.g. Fig. 2D). From these model constructions and results from evolutionary simulations, we can deduce that there are very few constraints on the GRN. By having very few constraints on the GRN, I think it makes it easy for a new set of pattern-producing traits to evolve and subsequently for an old set of pattern-producing traits to be lost, i.e., DSD. Thus, I believe that the model construction may favour DSD.

      I do not have an issue with the model favouring DSD because it reflects real multicellular GRNs, where it is thought that a minority of interactions are critical for fitness and the majority are not. However, it is unknown whether GRNs in the model are more or less constrained than real GRNs. Thus, it is not known whether the prevalence of DSD in this model applies generally to real development, where GRN constraints depend on so many factors. At a minimum, the possible difference in constraints between the model and real development should be discussed as a limitation of the model. A more thorough change to the manuscript would be to test the effect of changing the constraints on the GRN. I am sure there are many ways to devise such a test, but I will give my recommendation here.

      [OPTIONAL] My recommendation is that the authors should run additional simulations with simplified mutational dynamics by constraining the model to N genes (no duplications and deletions), of which M out of these N genes contribute to fitness via the specific pattern (with M=2 in the current model). The authors should then test the effect of changing N and M independently, and how this affects the prevalence of DSD. If the prevalence of DSD is robust to changes in N and M, it supports the authors' argument that DSD is highly prevalent in developmental evolution. If DSD prevalence is highly dependent on M and/or N, then the claims made in the manuscript about the prevalence of DSD must change accordingly. I acknowledge that these simulations may be computationally expensive, and I think it would be great if the authors knew (or devised) a more efficient way to test the effect of GRN constraints on DSD prevalence. Nevertheless, these additional simulations would make for a potentially very interesting manuscript.
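
      One way to lay out the recommended sweep over N and M is sketched below in Python. Only the baseline (N=14 gene types, M=2 fitness-determining genes) comes from the report; the other N and M values, the dictionary keys and the function name are hypothetical choices made for illustration, and the sketch enumerates configurations only, without running any simulation.

```python
from itertools import product

def sweep(n_values, m_values):
    """Enumerate (N, M) configurations, keeping only those with M <= N."""
    return [
        {
            "n_genes": n,                 # total number of gene types (N)
            "m_fitness_genes": m,         # genes contributing to fitness (M)
            "allow_duplications": False,  # simplified mutational dynamics
            "allow_deletions": False,
        }
        for n, m in product(n_values, m_values)
        if m <= n
    ]

# Hypothetical grid; it includes the N=14, M=2 baseline described in the report.
configs = sweep(n_values=(6, 10, 14), m_values=(2, 4, 8))

for config in configs:
    print(config)
```

      Varying one parameter while holding the other at its baseline value corresponds to reading along a single row or column of this grid.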

      We agree that these modelling choices likely influence the potential for DSD. We think that our model setup, where most transcription factors are not under direct selection for a particular pattern, more accurately reflects biological development, where the outcome of the total developmental process (a functional organism) is what is under selection, rather than each individual gene pattern. As also mentioned by the referee, in real multicellular development the majority of interactions is not crucial for fitness, similar to our model. We also observe that, as fitness increases, additional genes experience emergent selection for particular expression patterns or interaction structures in the GRN, resulting in their conservation. Nevertheless, we do agree that the effect of model construction on DSD is an unexplored avenue and this work lends itself to addressing this. We will run additional sets of simulations: one in which we reduce the size of the network (‘N’), and a second set where we double the number of fitness contributing genes (‘M’), and show the effect on the extent of DSD in a future supplementary figure.

      Description of the revisions that have already been incorporated in the transferred manuscript

      Referee cross commenting (Reviewer 4)

      Overall I agree with the comments of Reviewer 1, 2 and 3. I note that reviewers 1, 3, and 4 each pointed out the difficulties with assuming that CNSs = CREs, so this needs to be addressed. Two reviewers (3 and 4) also point out problems with equating bulk RNAseq with a conserved phenotype.

      We agree that caution is warranted with the assumption of CNSs = CREs. We have added a section to the discussion in which we discuss this more thoroughly, see ‘Limitations of CNSs as CREs’ in the revised manuscript.

      Additionally, we made textual changes to the statement of significance, abstract and results to better reflect when we talk about CNSs or CREs.

      I agree with Reviewer 1's hesitancy about the rhetorical framing of the paper potentially generalising too far from a computational model of plant meristem patterning.

      We agree that the title should reflect the scope of the manuscript, and our short title reflects that better than ‘ubiquitous’, which implies we investigated beyond plant (meristem) development. We have changed the title in the revised version to ‘System drift in the evolution of plant meristem development’.

      Reviewer 1

      It is system drift, not systems drift (see True and Haag 2001). No 's' after system.

      Thank you for catching this – we corrected this throughout.

      - I am afraid I have a problem with the manuscript title. I think "Ubiquitous" is misplaced, because it strongly suggests you have a long list of case studies across plants and animals, and some quantification of DSD in these two kingdoms. That would have been an interesting result, but it is not what you report. I suggest something along the lines of "System drift in the evolution of plant meristem development", similar to the short title used in the footer.

      - Alternatively, the authors may aim to say that DSD happens all over the place in computational models of development? In that case the title should reflect that the claim refers to modeling. (But what then about the data analysis part?)

      As remarked in the summary (point 2), we agree with this assessment and have changed the title to ‘System drift in the evolution of plant meristem development’.

      Multiple times in the Abstract and Introduction the authors make statements on "cis-regulatory elements" that are actually "conserved non-coding sequences" (CNS). Even if it is not uncommon for CNSs to harbor enhancers etc., I would be very hesitant to use the two as synonyms. As the authors state themselves, sequences, even non-coding, can be conserved for many reasons other than CREs. I would ask the authors to support better their use of "CREs" or adjust language. As roughly stated in their Discussion (lines 310-319), one way forward could be to show for a few CNS that are important in the analysis (of Fig 5), that they have experimentally-verified enhancers. Is that do-able or a bridge too far?

      We changed the text such that we use CNS instead of CRE when discussing the bioinformatic analysis. Additionally we added a section in the discussion to clarify the relationship between CNS and CRE.

      line 7. evo-devo is jargon

      We changed this to ‘…evolution of development (evo-devo) research…’

      l9. I would think "using a computational model and data analysis"

      Yes, corrected.

      l13. Strictly speaking you did not look at CREs, but at conserved non-coding sequences.

      Indeed, we changed this to CNS.

      l14. "widespread" is exaggerated here, since you show for a single organ in a handful of plant species. You may extrapolate and argue that you do not see why it should not be widespread, but you did not show it. Or tie in all the known cases that can be found in literature.

      We understand that ‘widespread’ seems to suggest that we have investigated a broader range of species and organs. To be more accurate we changed the wording to ‘prevalent’.

      l16. "simpler" than what?

      We added the example of RNA folding.

      l27. Again the tension between CREs and non-coding sequence.

      Changed to conserved non coding sequence.

      l28. I don't understand the use of "necessarily" here.

      This is indeed confusing and unnecessary; we removed it.

      l34-35. A very general biology statement is backed up by two modeling studies. I would have expected also a few based on comparative analyses (e.g., fossils, transcriptomics, etc).

      We added extra citations and a discussion of more experimental work

      l36. I was missing the work on "phenogenetic drift" by Weiss; and Pavlicev & Wagner 2012 on compensatory mutations.

      Changed the text to:

      This phenomenon is called developmental system drift (DSD) (True and Haag, 2001; McColgan and DiFrisco, 2024), or phenogenetic drift (Weiss and Fullerton, 2000), and can occur when multiple genotypes which are separated by few mutational steps encode the same phenotype, forming a neutral (Wagner, 2008a; Crombach et al., 2016) or adaptive path (Johnson and Porter, 2007; Pavlicev and Wagner, 2012).

      l38. Kimura and Wagner never had a developmental process in mind, which is much bigger than a single nucleotide or a single gene, respectively. First paper that I am aware of that explicitly connects DSD to evolution on genotype networks is my own work (Crombach 2016), since the editor of that article (True, of True and Haag 2001) highlighted that point in our communications.

      Added citation and moved Kimura to the theoretical examples of protein folding DSD.

      l40. While Huynen and Hogeweg definitely studied the GP map in many of their works, the term goes back to Pere Alberch (1991).

      Added citation.

      l54-55. I'm missing some motivation here. If one wants to look at multicellular structures that display DSD, vulva development in C. elegans and related worms is an "old" and extremely well-studied example. Also, studies on early fly development by Yogi Jaeger and his co-workers are not multicellular, but at least multi-nuclear. Obviously these are animal-based results, so to me it would make sense to make a contrast animal-plant regarding DSD research and take it from there.

      Indeed, DSD has been found in these species and we now reference some of this work; the principle is better known in animals. Nevertheless, within the theoretical literature there is a continuing debate on the importance/extent of DSD.

      Changed text:

      ‘For other GPMs, such as those resulting from multicellular development, it has been suggested that complex phenotypes are sparsely distributed in genotype space, and have low potential for DSD because the number of neutral mutations anti-correlates with phenotypic complexity (Orr, 2000; Hagolani et al., 2021). On the other hand, theoretical and experimental studies in nematodes and fruit flies have shown that DSD is present in a phenotypically complex context (Verster et al., 2014; Crombach et al., 2016; Jaeger, 2018). It therefore remains debated how much DSD actually occurs in species undergoing multicellular development. DSD in plants has received little attention. One multicellular structure which …’

      l66-86. It is a bit of a style-choice, but this is a looong summary of what is to come. I would not have done that. Instead, in the Introduction I would have expected a bit more digging into the concept of DSD, mention some of the old animal cases, perhaps summarize where in plants it should be expected. More context, basically.

      We extended the paragraph on empirical examples of DSD by adding the animal cases and condensed our summary.

      l108. Could you quantify the conserved interactions shared between the populations? Or is each simulation so different that they are pretty much unique?

      Each simulation here is independent of the other simulations, so a per-interaction comparison would be uninformative. After cloning, the populations do share ancestry, but that comes much later in the manuscript, and there the quantification of the conserved interactions would be the inverse of the divergence as shown in, for instance, Figure 3B.

      l169. "DSD driving functional divergence" needs some context, since DSD is supposed to not affect function (of the final phenotype). Or am I misunderstanding?

      This is indeed a confusing sentence. We mean to say that DSD allows for divergence to such an extent that the underlying functional pathway is changed. So instead of a mere substitution of the underlying network, in which the topology and relative functions stay conserved, a different network structure is found. We have modified the line to read “Taken together, we found that DSD can drive functional divergence in the underlying GRN resulting in novel spatial expression dynamics of the genes not directly under selection.”

      l176. Say which interaction it is. Is it 0->8, as mentioned in the next paragraph?

      It is indeed 0->8, we have clarified this in the text.

      l197. Bulk RNAseq has the problem of averaging gene expression over the population of cells. How do you think that impacts your test for rewiring? If you would do a similar "bulk RNA" style test on your computational models, would you pick up DSD?

      The rewiring is based on the CNSs, whereas the RNAseq is used as phenotype, so it does not impact the test for rewiring.

      The averaging of bulk RNAseq does, however, mean that we cannot show conservation/divergence of the phenotype within the tissues, only between the different tissues.

      The most important implication of doing this in our model would be the definition of the ‘phenotype’ which undergoes DSD. Currently the phenotype is a gene expression pattern on a cellular level, for bulk RNA this phenotype would change to tissue-level gene expression.

      This change in what we measure as phenotype affects how we interpret our results, but would not hinder us in picking up DSD; it simply has a different meaning than DSD on a cellular and single-tissue scale.

      We added clarification of the roles of the datasets at the start of the paragraph.

      ‘The Conservatory Project collects conserved non-coding sequences (CNSs) across plant genomes, which we used to investigate the extent of GRN rewiring in flowering plants. Schuster et al. measured gene expression in different homologous tissues of several species via bulk RNAseq, which we used to test for gene expression (phenotype) conservation, and how this relates to the GRN rewiring inferred from the CNSs.’

      l202. I do not understand the "within" of a non-coding sequence within an orthogroup. How are non-coding sequences inside an orthogroup of genes?

      We clarify this sentence by saying ‘A CNS is defined as a non-coding sequence conserved within the upstream/downstream region of genes within an orthogroup’, to more clearly separate the CNS from the orthogroup of genes. We also updated Figure 5A to reflect this better.

      l207-217. This paragraph is difficult to read and would benefit from rephrasing. Plant-specific jargon, numbers do not add up (line 211), statements are rather implicit (9 deeply conserved CNS are the 3+6? Where do I see them in Fig 5B? And where do I see the lineage-specific losses?).

      We added extra annotations to the figure to make the plant jargon (angiosperm, eudicot, Brassicaceae) clear, and show the loss more clearly in the figure. We also clarified the text by splitting up 9 to 3 and 6.

      l223. Looking at the shared CNS between SEP1-2, can you find a TF binding site or another property that can be interpreted as regulatory importance?

      Reliably showing an active TF binding site would require experimental data, which we don’t have. We do mention in the discussion the need for datasets which could help address this gap.

      l225. My intuition says that the continuity of the phenotype may not be necessary if its loss can be compensated for somehow by another part of the organism. I.e., DSD within DSD. It is a poorly elaborated thought, I leave it here for your information. Perhaps a Discussion point?

      Although very interesting, we think this discussion might be outside of the scope of this work, and would benefit from a standalone discussion, especially since the capacity for such compensation might differ between animals and plants (which are more “modular” organisms). This is our interpretation:

      First, let’s take a step back from ‘genotype’ and ‘phenotype’ and redefine DSD more generally: in a system with multiple organisational levels, where a hierarchical mapping between them exists, DSD consists of changes on one organisational level which do not alter the outcome of the ‘higher’ organisational level. In other words, DSD can exist in any many-to-one mapping in which members of a set of many (which map to the same one) are within a certain distance of each other in space, which we generally define as a single mutational step.

      Within this (slightly) more general definition we can extend the definition of DSD to the level of phenotype and function, in which phenotype describes the ‘many’ layer, and multiple phenotypes can fulfill the same function. When we are freed from the constraint of ‘genotype’ and ‘phenotype’, and DSD is defined at the level of this mapping, then it becomes an easy exercise to have multiple mappings (genotype→phenotype→function) and thus ‘DSD within DSD’.
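
      The generalised definition above can be sketched with a toy example. This is only an illustration under our own assumptions: the bit-string genotypes, the capped bit-count phenotype, and the names `phenotype`, `performs_function`, and `drift_reachable` are hypothetical and are not part of the manuscript's model.

```python
def phenotype(genotype):
    """Toy many-to-one map (assumption): number of 1-bits, capped at 2."""
    return min(sum(genotype), 2)


def performs_function(pheno):
    """Second many-to-one layer (assumption): several phenotypes share a function."""
    return pheno >= 1


def neighbours(genotype):
    """All genotypes exactly one mutational step (one bit flip) away."""
    for i in range(len(genotype)):
        flipped = list(genotype)
        flipped[i] = 1 - flipped[i]
        yield tuple(flipped)


def neutral_neighbours(genotype):
    """Neighbours that map to the same phenotype as `genotype`."""
    target = phenotype(genotype)
    return [n for n in neighbours(genotype) if phenotype(n) == target]


def drift_reachable(start):
    """Genotypes reachable from `start` through single neutral steps only."""
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for n in neutral_neighbours(current):
            if n not in seen:
                seen.add(n)
                frontier.append(n)
    return seen


if __name__ == "__main__":
    reachable = drift_reachable((1, 1, 0, 0))
    # (0, 0, 1, 1) shares no 1-bits with the start, yet encodes the same phenotype.
    print((0, 0, 1, 1) in reachable, len(reachable))
```

      In this toy map, a genotype can drift to one that shares none of its original 1-bits while the phenotype stays fixed, and distinct phenotypes can in turn fulfil the same function, which is the layered sense of ‘DSD within DSD’ meant above.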

      l233. "rarely"? I don't see any high Pearson distances.

      True, in the given example there are no high Pearson distances; however, some of the supplementary figures do show them, so ‘rarely’ felt like the most honest description. We changed the text to refer to these supplementary figures.

      Fig 4. Re-order of panels? I was expecting B at C and vice versa.

      Agreed, we swapped the order of the panels

      Fig 5B. Red boxes not explained. Mention that it is an UpSetplot?

      We added clarification to the figure caption.

      Fig 5D. It would be nice to quantify the minor and major diffs between orthologs and paralogs.

      We quantify the similarities (and thus differences) in Figure 5F, but we do indeed not show orthologs vs paralogs explicitly. We have extended Figure 5F to distinguish which comparisons are between orthologs vs paralogs with different tick marks, which shows their different distributions quite clearly.

      - l247. Over-generalization. In a specific organ of plants...

      Changed to vascular plant meristem.

      - l249. Where exactly is this link between diverse expression patterns and the Schuster dataset made? I suggest the authors make it more explicit in the Results.

      We are slightly overambitious in this sentence. The Schuster dataset confirms the preservation of expression where the CNS dataset shows rewiring. That this facilitates diversification of expression patterns in traits not under selection is solely an outcome of the computational model. We have changed the text to reflect this more clearly.

      - l268. Final sentence of the paragraph left me puzzled. Why talk about opposite function?

      The goal here was to highlight regulatory rewiring which, in the most extreme case, would achieve an opposite function for a given TF within development. We agree that this was formulated vaguely so we rewrote this to be more to the point.

      These examples demonstrate that whilst the function of pathways is conserved, their regulatory wiring often is not.

      - l269. What about time scales generated by the system? Looking at Fig 2C and 2D, the elbow pattern is pretty obvious. That means interactions sort themselves into either short-lived or long-lived. Worth mentioning?

      Added a sentence to highlight this.

      - l291. Evolution in a *constant* fitness landscape increases robustness.

      Changed

      - l296. My thoughts, for your info: I suspect morphogenesis as single parameters instead of as mechanisms makes for a brittle landscape, resulting in isolated parts of the same phenotype.

      We agree, and now include citations to different models in which morphogenesis evolves which seem to display a more connected landscape.

      Reviewer 2

      Every computational model necessarily makes some simplifying assumptions. It would be nice if the authors could summarise in a paragraph in the Discussion the main assumptions made by their model, and which of those are most worth revisiting in future studies. In the current draft, some assumptions are described in different places in the manuscript, which makes it hard for a non-expert to evaluate the limitations of this model.

      We added a section to the discussion: ‘Modelling assumptions and choices’

      I did not find any mention of potential energetic constraints or limitations in this model. For example, I would expect high levels of gene expression to incur significant energy costs, resulting in evolutionary trade-offs. Could the authors comment on how taking energy limitations into account might influence their results?

      This would put additional constraints on the evolution/fitness landscape. Some paths/regions of the fitness landscape which are currently accessible will not be traversable anymore. On the other hand, an energy constraint might reduce certain high fitness areas to a more even plane and thus make it more traversable. During analysis of our data there were no signs of extremely high gene expression levels.

      Figure 3C lists Gene IDs 1, 2, 8, and 11, but the caption refers to genes 1, 2, 4, and 11.

      Thank you for catching this.

      Reviewer 3

      The authors present an analysis correlating conserved non-coding sequence (CNS) composition with gene expression to investigate developmental systems drift. One flaw of this analysis is that it uses deeply conserved sequences as a proxy for the entire cis-regulatory landscape. The authors acknowledge this flaw in the discussion.

      Another potential flaw is equating the bulk RNA-seq data with a conserved phenotype. In lines 226-227 of the manuscript, it is written that "In line with our computational model, we compared gene expression patterns to measure changes in phenotype." I am not sure if there is an equivalence between the two. In the computational model, the developmental outcome determining fitness is a spatial pattern, i.e., an emergent product of gene expression and cell interactions. In contrast, the RNA-seq data shows bulk measurements in gene expression for different organs. It is conceivable that, despite having very similar bulk measurements, the developmental outcome in response to gene expression (such as a spatial pattern or morphological shape) changes across species. I think this difference should be explicitly addressed in the discussion. The authors may have intended to discuss this in lines 320-326, although it is unclear to me.

      It is correct that the CNS data and RNA-seq data have certain limitations, and the brief discussion of some of these limitations in lines 320-326 is not sufficient. We have been more explicit on this point in the discussion.

      The gene expression data used in this study represents bulk expression at the organ level, such as the vegetative meristem (Schuster et al., 2024). This limits our analysis of the phenotypic effects of rewiring to comparisons between organs, which is different to our computational simulations where we look at within organ gene expression. Additionally, the bulk RNA-seq does not allow us to discern whether the developmental outcome of similar gene expression is the same in all these species. More fine-grained approaches, such as single-cell RNA sequencing or spatial transcriptomics, will provide a more detailed understanding of how gene expression is modulated spatially and temporally within complex tissues of different organisms, allowing for a closer alignment between computational predictions and experimental observations.

      Can the authors justify using these six species in the discussion or the results? Are there any limitations with choosing four closely related and two distantly related species for this analysis, in contrast to, say, six distantly related species? If so, please elaborate in the discussion.

      The use of these six species is mainly limited by the datasets we have available. Nevertheless, the combination of four closely related species, and two more distantly related species gives a better insight into the short vs long term divergence dynamics than six distantly related species would. We have noted this when introducing the datasets:

      This set of species contains both closely (A. thaliana, A. lyrata, C. rubella, E. salsugineum) and more distantly related species (M. truncatula, B. distachyon), which should give insight into short and long term divergence.

      In Figure S7, some profiles show no conservation across the six species. Can we be sure that a stabilising selection pressure conserves any CNSs? Is it possible that the deeply conserved CNSs mentioned in the main text are conserved by chance, given the large number of total CNSs? A brief comment on these points in the results or discussion would be helpful.

      In our simulations, we find that even CREs that were under selection for a long time can disappear; however, in our neutral simulations, CREs were not conserved, suggesting that deep conservation is the result of selection. When it comes to CNSs, the assumption is that they often contain CREs that are under selection. We have added a more elaborate section on CNSs in the discussion. See ‘Limitations of CNSs as CREs’.

      Line 7-8: I thought this was a bit difficult to read. The connection between (i) evolvability of complex phenotypes, (ii) neutral/beneficial change hindered by deleterious mutations, and (iii) DSD might not be so simple for many readers, so I think it should be rewritten. The abstract was well written, though.

      We made the connection to DSD and evolvability clearer and removed the specific mutational outcomes:

      *A key open question in evolution of development (evo-devo) is the evolvability of complex phenotypes. Developmental system drift (DSD) may contribute to evolvability by exploring different genotypes with similar phenotypic outcome, but with mutational neighbourhoods that have different, potentially adaptive, phenotypes. We investigated the potential for DSD in plant development using a computational model and data analysis.*

      Line 274 vs 276: Is there a difference between regulatory dynamics and regulatory mechanisms?

      No, we should use the same terminology. We have changed this to be clearer.

      Figure S4: Do you expect the green/blue lines to approach the orange line in the long term? In some clonal experiments, it seems like it will. In others, it seems like it has plateaued. Under continual DSD, I assume they should converge. It would be interesting to see simulations run sufficiently long to see if this occurs.

      In principle, yes; however, this might take a considerable amount of time given that some conserved interactions take >75000 generations to be rewired.

      Line 27: Evolutionarily instead of evolutionary?

      Changed

      Line 67-68: References in brackets?

      Changed

      Line 144: Capitalise "fig"

      Changed

      Fig. 3C caption: correct "1, 2, 4, 11" (should be 8)

      Changed

      Line 192: Reference repeated

      Changed

      Fig. 5 caption: Capitalise "Supplementary figure"

      Changed

      Line 277: Correct "A previous model Johnson.."

      Changed

      Line 290: Brackets around reference

      Changed

      Line 299: Correct "will be therefore be"

      Changed

      Line 394: Capitalise "table"

      Changed

      Line 449: Correct "was build using"

      Changed

      Fig. 5B: explain the red dashed boxes in the caption

      Added explanation to the caption

      Some of the Figure panels might benefit from further elaboration in their respective captions, such as 3C and 5F.

      Improved the figure captions.

      Reviewer 4

      Statement of significance. The logical connection between the first two sentences is not clear. What does developmental system drift have to do with neutral/beneficial mutations?

      This is indeed an unclear jump. Changed such that the connection between evolvability of complex phenotypes and DSD is more clear:

      *A key open question in evolution of development (evo-devo) is the evolvability of complex phenotypes. Developmental system drift (DSD) contributes to evolvability by exploring different genotypes with similar phenotypic outcome, but with mutational neighbourhoods that have different, potentially adaptive, phenotypes. We investigated the potential for DSD in plant development using a computational model and data analysis.*

      l 41 - "DSD is found to ... explain the developmental hourglass." Caution is warranted here. Wotton et al 2015 claim that "quantitative system drift" explains the hourglass pattern, but it would be more accurate to say that shifting expression domains and strengths allows compensatory regulatory change to occur with the same set of genes (gap genes). It is far from clear how DSD could explain the developmental hourglass pattern. What does DSD imply about the causes of differential conservation of different developmental stages? It's not clear there is any connection here.

      We should indeed be more cautious here. DSD is indeed not in itself an explanation of the hourglass model, but only a mechanism by which the developmental divergence observed in the hourglass model could have emerged. As per Pavlicev and Wagner, 2012, compensatory changes resulting from other shifts would fall under DSD, and can explain how the patterning outcome of the gap gene network is conserved. However, this does not explain why some stages are under stronger selection than others. We changed the text to reflect this.

      ‘...be a possible evolutionary mechanism involved in the developmental hourglass model (Wotton et al., 2015; Crombach et al., 2016)...’

      ll 51-53 - "Others have found that increased complexity introduces more degrees of freedom, allowing for a greater number of genotypes to produce the same phenotype and potentially allowing for more DSD (Schiffman and Ralph, 2022; Greenbury et al., 2022)." Does this refer to increased genomic complexity or increased phenotypic complexity? It is not clear that increased phenotypic complexity allows a greater number of genotypes to produce the same phenotype. Please explain further.

      The paragraph discusses complexity in the GPM as a whole, where the first few examples in the paragraph regard phenotypic complexity, and the ones in l51-53 refer to genomic complexity. This is currently not clear so we clarified the text.

      ‘For other GPMs, such as those resulting from multicellular development, it has been suggested that complex phenotypes are sparsely distributed in genotype space, and have low potential for DSD because the number of neutral mutations anti-correlates with phenotypic complexity (Orr, 2000; Hagolani et al., 2021). Others have found that increased genomic complexity introduces more degrees of freedom, allowing for a greater number of genotypes to produce the same phenotype and potentially allowing for more DSD (Schiffman and Ralph, 2022; Greenbury et al., 2022).’

      It was not clear why some gene products in the model have the ability to form dimers. What does this contribute to the simulation results? This feature is introduced early on, but is not revisited. Is it necessary?

      *Fitness. The way in which fitness is determined in the model was not completely clear to me.*

      Dimers are not necessary, but as they have been found to play a role in actual SAM development we added them to increase the realism of the developmental simulations. In some simulations the patterning mechanism involves the dimer, in others it does not, suggesting that dimerization is not essential for DSD.

      We have made changes to the methods to clarify fitness.

      Lines 103-104 say: "Each individual is assigned a fitness score based on the protein concentration of two target genes in specific regions of the SAM: one in the central zone (CZ), and one in the organizing center (OC)." How are these regions positionally defined in the simulation?

      We have defined bounding boxes to define cells as either CZ, OC or both. We have added these bounds in the figure description and more clearly in the revised methods.
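
      As a rough illustration of the bounding-box assignment described above, a cell's zone membership could be computed as follows. This is a minimal sketch: the box coordinates, the 2D `(x, y)` position format, and the names `CZ_BOX`, `OC_BOX`, `in_box`, and `zones` are hypothetical; the actual bounds are those given in the revised methods and figure description.

```python
# Hypothetical bounding boxes as (x_min, x_max, y_min, y_max).
# These numbers are placeholders, not the values used in the model.
CZ_BOX = (-2.0, 2.0, 6.0, 10.0)
OC_BOX = (-1.5, 1.5, 4.0, 7.0)


def in_box(position, box):
    """True if the (x, y) position lies inside the box, borders included."""
    x, y = position
    x_min, x_max, y_min, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def zones(position):
    """Return the set of zone labels for a cell; boxes may overlap."""
    labels = set()
    if in_box(position, CZ_BOX):
        labels.add("CZ")
    if in_box(position, OC_BOX):
        labels.add("OC")
    return labels


if __name__ == "__main__":
    for cell in [(0.0, 8.0), (0.0, 6.5), (0.0, 5.0), (5.0, 0.0)]:
        print(cell, sorted(zones(cell)))
```

      Returning a set rather than a single label reflects that a cell can be CZ, OC, or both when the boxes overlap.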

      In Methods section F, one reads (l. 385): "Fitness depends on the correct protein concentration of the two fitness genes in each cell, pcz and poc respectively." This sounds like fitness is determined by the state of all cells rather than the state of the two specific regions of the SAM. Please clarify.

      A fitness penalty is given for incorrect expression so it is true that the fitness is determined by the state of all cells. We agree that it is phrased unclearly and have clarified this in the text.

      The authors use conserved non-coding sequences as a proxy for cis-regulatory elements. More specification of how CNSs were assigned to an orthogroup seems necessary in this section. Is assignment based on proximity to the coding region? Of course the authors will appreciate that regulatory elements can be located far from the gene they regulate. This data showed extensive gains and losses of CNS. It might be interesting to consider how much of this is down to transposons, in which case rapid rearrangement is not unexpected. A potential problem with the claim that the data supports the simulation results follows from the fact that DSD is genetic divergence despite trait conservation, but conserved traits appear to have only been defined or identified in the case of the SEP genes. It can't be ruled out that divergence in CNSs and in gene expression captured by the datasets is driven by straightforward phenotypic adaptation, thus not by DSD. Further caution on this point is needed.

      CNSs are indeed assigned based on proximity up to 50kb; the full methods are described in detail in Hendelman et al. (2021). CREs can be located further than 50kb away, but evidence suggests that this is rare for species with smaller genomes.

      In the cases where both gene expression and the CNSs diverged it can indeed not be ruled out that there has been phenotypic adaptation. We clarified in the text that the lower Pearson distances are informative for DSD as they highlight conserved phenotypes.
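
      For readers unfamiliar with the measure, the sketch below shows one common way to compute a Pearson distance between two expression profiles. It assumes the distance is defined as one minus the Pearson correlation coefficient, which may differ from the exact definition used in the manuscript; the function names and the example profiles are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def pearson_distance(xs, ys):
    """Assumed definition: 0 for identical patterns, 2 for opposite patterns."""
    return 1.0 - pearson_r(xs, ys)


if __name__ == "__main__":
    # Hypothetical expression of one gene across four tissues in two species.
    species_a = [1.0, 2.0, 3.0, 4.0]
    species_b = [2.0, 4.0, 6.0, 8.0]
    print(pearson_distance(species_a, species_b))
```

      Under this definition the distance ignores overall expression level and scale, so two orthologs with the same relative pattern across tissues score near zero even if one is expressed at twice the level of the other.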

      l. 290-291 - "However, evolution has been shown to increase mutational robustness over time, resulting in the possibility for more neutral change." It is doubtful that there is any such unrestricted trend. If mutational robustness only tended to increase, new mutations would not affect the phenotype, and phenotypes would be unable to adapt to novel environments. Consider rethinking this statement.

      We have reformulated this statement, since it is indeed not expected that this trend is indefinite. Infinite robustness would indeed result in the absence of evolvability; however, it has been shown for other genotype-phenotype maps that mutational robustness, where a proportion of mutations is neutral, aids the evolution of novel traits. The evolution of mutational robustness also depends on population size and mutation rate. This trend will, most probably, also be stronger in modelling work where the fitness function is fixed, compared to a real life scenario where ‘fitness’ is much less defined and subject to continuous change. We added ‘constant’ to the fitness landscape to highlight this disparity.

      ll. 316-317 "experimental work investigating the developmental role of CREs has shown extensive epistasis - where the effect of a mutation depends on the genetic background - supporting DSD." How does extensive epistasis support DSD? One can just as easily imagine scenarios where high interdependence between genes would prevent DSD from occurring. Please explain further.

      We should be more clear. Experimental work has shown that the effect of mutating a particular CRE strongly depends on the genetic background, also known as epistasis. Counterintuitively, this indirectly supports the presence of DSD, since it means that different species or strains have slightly different developmental mechanisms, resulting in these different mutational effects. We have shown how epistatic effects shift over evolutionary time.

      Overall I found the explanation of the Methods, especially the formal aspects, to be unclear at times and would recommend that the authors go back over the text to improve its clarity.

      We rewrote parts of the methods and some of the equations to be more clear and cohesive throughout the text.

      C. Tissue Generation. Following on the comment on fitness above, it would be advisable to provide further details on how cell positions are defined. How much do the cells move over the course of the simulation? What is the advantage of modelling the cells as "springs" rather than as a simple grid?

      The tissue generation is purely a process to generate a database of tissue templates: the random positions, springs and Voronoi method serve the purpose of having similar but different tissues to prevent unrealistic overfitting of our GRNs on a single topology. For each individual’s development, however, only one unchanging template is used. We clarified this in the methods.

      E. Development of genotype into phenotype. The diffusion term in the SDE equations is hard to understand as no variable for spatial position (x) is included in the equation. It seems this equation should rather be an SPDE with a position variable and a specified boundary condition (i.e. the parabola shape). In eq. 5 it should be noted that the Wi are independent. Also please justify the choice of how much noise/variance is being stipulated here.

      We have rewritten parts of this section for clarity and added citations.

      F. Fitness function. I must say I found formula 7 to be unclear. It looks like fi is the fitness of cell(s) but, from Section G, fitness is a property of the individual. It seems formula 7 should define fi as a sum over the cell types or should capture the fitness contribution of the cell types.

      Correct. We have rewritten this equation. We now define fi as the fitness contribution of a cell and F as the sum of the fi, i.e. the fitness of an individual, and use F in function 8.

      What is the basis for the middle terms (fractions) in the equation? After plugging in the values for pcz and poc, this yields a number, but how does that number assign a cell to one of the types? If a reviewer closely scrutinizing this section cannot make sense of it, neither will readers. Please explain further.

      The cell type is assigned based on the spatial location of the cell, and the correct fitness function for each of these cell types is described in this equation. We have clarified the text and functions.
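
      The relationship between the per-cell contribution fi and the individual fitness F can be sketched as follows. This is an illustration only: the scoring rule, the `threshold` parameter, and the names `cell_contribution` and `individual_fitness` are our own hypothetical stand-ins and do not reproduce the rewritten equation in the manuscript.

```python
def cell_contribution(p_cz, p_oc, in_cz, in_oc, threshold=0.5):
    """f_i (hypothetical rule): one point per target gene whose on/off state
    matches the zone the cell lies in, so misexpression outside a zone
    also costs fitness."""
    cz_ok = (p_cz > threshold) == in_cz
    oc_ok = (p_oc > threshold) == in_oc
    return int(cz_ok) + int(oc_ok)


def individual_fitness(cells):
    """F: the sum of the per-cell contributions f_i over all cells."""
    return sum(cell_contribution(*cell) for cell in cells)


if __name__ == "__main__":
    # Each cell: (p_cz, p_oc, in_cz, in_oc); the values are made up.
    cells = [
        (0.9, 0.1, True, False),   # CZ cell, correct expression
        (0.1, 0.8, False, True),   # OC cell, correct expression
        (0.7, 0.7, False, False),  # cell outside both zones, misexpressing both
    ]
    print(individual_fitness(cells))
```

      Because every cell contributes, cells outside the CZ and OC are scored as well, which is consistent with the statement that fitness is determined by the state of all cells.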

      A minor note: it would be best practice not to re-use variables to refer to different things within the same paper. For example p refers to protein concentration but also probability of mutation.

      Corrected

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      In "Ubiquitous system drift in the evolution of development," van der Jagt et al. report a large-scale simulation study of the evolution of gene networks controlling a developmental patterning process. The 14-gene simulation shows interesting results: continual rewiring of the network and establishment of essential genes which themselves are replaced on long time scales. The authors suggest that this result is validated by plant genome and expression data from some public datasets. Overall, this study lends support to the idea that developmental system drift may be more pervasive in the evolution of complex gene networks than is currently appreciated.

      I have a number of comments, mostly of a clarificatory nature, that the authors can consider in revision.

      1. Intro

      Statement of significance. The logical connection between the first two sentences is not clear. What does developmental system drift have to do with neutral/beneficial mutations?

      l 41 - "DSD is found to ... explain the developmental hourglass." Caution is warranted here. Wotton et al 2015 claim that "quantitative system drift" explains the hourglass pattern, but it would be more accurate to say that shifting expression domains and strengths allow compensatory regulatory change to occur with the same set of genes (gap genes). It is far from clear how DSD could explain the developmental hourglass pattern. What does DSD imply about the causes of differential conservation of different developmental stages? It's not clear there is any connection here.

      ll 51-53 - "Others have found that increased complexity introduces more degrees of freedom, allowing for a greater number of genotypes to produce the same phenotype and potentially allowing for more DSD (Schiffman and Ralph, 2022; Greenbury et al., 2022)." Does this refer to increased genomic complexity or increased phenotypic complexity? It is not clear that increased phenotypic complexity allows a greater number of genotypes to produce the same phenotype. Please explain further.

      2. Model

      It was not clear why some gene products in the model have the ability to form dimers. What does this contribute to the simulation results? This feature is introduced early on, but is not revisited. Is it necessary?

      Fitness. The way in which fitness is determined in the model was not completely clear to me. Lines 103-104 say: "Each individual is assigned a fitness score based on the protein concentration of two target genes in specific regions of the SAM: one in the central zone (CZ), and one in the organizing center (OC)." How are these regions positionally defined in the simulation? In Methods section F, one reads (l. 385): "Fitness depends on the correct protein concentration of the two fitness genes in each cell, pcz and poc respectively." This sounds like fitness is determined by the state of all cells rather than the state of the two specific regions of the SAM. Please clarify.

      3. Data

      The authors use conserved non-coding sequences as a proxy for cis-regulatory elements. More specification of how CNSs were assigned to an orthogroup seems necessary in this section. Is assignment based on proximity to the coding region? Of course the authors will appreciate that regulatory elements can be located far from the gene they regulate. This data showed extensive gains and losses of CNS. It might be interesting to consider how much of this is down to transposons, in which case rapid rearrangement is not unexpected. A potential problem with the claim that the data supports the simulation results follows from the fact that DSD is genetic divergence despite trait conservation, but conserved traits appear to have only been defined or identified in the case of the SEP genes. It can't be ruled out that divergence in CNSs and in gene expression captured by the datasets is driven by straightforward phenotypic adaptation, thus not by DSD. Further caution on this point is needed.

      4. Discussion

      ll. 290-291 - "However, evolution has been shown to increase mutational robustness over time, resulting in the possibility for more neutral change." It is doubtful that there is any such unrestricted trend. If mutational robustness only tended to increase, new mutations would not affect the phenotype, and phenotypes would be unable to adapt to novel environments. Consider rethinking this statement.

      ll. 316-317 "experimental work investigating the developmental role of CREs has shown extensive epistasis - where the effect of a mutation depends on the genetic background - supporting DSD." How does extensive epistasis support DSD? One can just as easily imagine scenarios where high interdependence between genes would prevent DSD from occurring. Please explain further.

      5. Methods

      Overall I found the explication of the Methods, especially the formal aspects, to be unclear at times and would recommend that the authors go back over the text to improve its clarity.

      C. Tissue Generation. Following on the comment on fitness above, it would be advisable to provide further details on how cell positions are defined. How much do the cells move over the course of the simulation? What is the advantage of modelling the cells as "springs" rather than as a simple grid?

      E. Development of genotype into phenotype. The diffusion term in the SDE equations is hard to understand as no variable for spatial position (x) is included in the equation. It seems this equation should rather be an SPDE with a position variable and a specified boundary condition (i.e. the parabola shape). In eq. 5 it should be noted that the Wi are independent. Also please justify the choice of how much noise/variance is being stipulated here.

      F. Fitness function. I must say I found formula 7 to be unclear. It looks like fi is the fitness of cell(s) but, from Section G, fitness is a property of the individual. It seems formula 7 should define fi as a sum over the cell types or should capture the fitness contribution of the cell types.

      What is the basis for the middle terms (fractions) in the equation? After plugging in the values for pcz and poc, this yields a number, but how does that number assign a cell to one of the types? If a reviewer closely scrutinizing this section cannot make sense of it, neither will readers. Please explain further.

      A minor note: it would be best practice not to re-use variables to refer to different things within the same paper. For example p refers to protein concentration but also probability of mutation.

      Referee cross-commenting

      Overall I agree with the comments of Reviewers 1, 2, and 3. I note that reviewers 1, 3, and 4 each pointed out the difficulties with assuming that CNSs = CREs, so this needs to be addressed. Two reviewers (3 and 4) also point out problems with equating bulk RNAseq with a conserved phenotype.

      I agree with Reviewer 1's hesitancy about the rhetorical framing of the paper potentially generalising too far from a computational model of plant meristem patterning.

      Reviewer 3's concern about DSD resulting from stabilising selection for robustness is something I missed -- this is important and should be addressed.

      Reviewer 3 suggests that the model construction may favor DSD because there are many genes (14) of which only two determine fitness. I agree that some discussion on this point is warranted, though I am not sure enough is known about "the possible difference in constraints between the model and real development" for such a discussion to be on firm biological footing. A genetic architecture commonly found in quantitative genetic studies is that a small number of genes have large effects on the phenotype/fitness, whereas a very large number of genes have effects that are individually small but collectively large (see, e.g. literature surrounding the "omnigenic model" of complex traits). Implementing such an architecture is probably beyond the scope of the study here. More generally, it would be natural to assume that the larger the number of genes, and the smaller the number of fitness-determining genes, the more likely DSD / re-wiring is to occur. That being said, I think the authors' choice of a 14-gene network is biologically defensible. It could be argued that the restriction of many modeling studies to small networks (often including just 3 genes) on grounds of convenience artificially ensures that DSD will not occur in these networks.

      I agree with the other reviewers on the overall positive assessment of the significance of the manuscript. There are many points to address and revise, but the core setup and result of this study are sound and should be published.

      Significance

      In "Ubiquitous system drift in the evolution of development," van der Jagt et al. report a large-scale simulation study of the evolution of gene networks controlling a developmental patterning process. The 14-gene simulation shows interesting results: continual rewiring of the network and establishment of essential genes which themselves are replaced on long time scales. The authors suggest that this result is validated by plant genome and expression data from some public datasets. Overall, this study lends support to the idea that developmental system drift may be more pervasive in the evolution of complex gene networks than is currently appreciated.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This manuscript uses an Evo-Devo model of the plant apical meristem to explore the potential for developmental systems drift (DSD). DSD occurs when the genetic underpinnings of development change through evolution while reaching the same developmental outcome. The mechanisms underlying DSD are theoretically intriguing and highly relevant for our understanding of how multicellular species evolve. The manuscript shows that DSD occurs extensively and continuously in their evolutionary simulations whilst populations evolve under stabilising selection. The authors examine regulatory rewiring across plant angiosperms to link their theoretical model with real data. The authors claim that, despite the conservation of genetic wiring in angiosperm species over shorter evolutionary timescales, this genetic wiring changes over long evolutionary timescales due to DSD, which is consistent with their theoretical model.

      Major comments:

      I enjoyed reading the authors' approach to understanding DSD and the link to empirical data. I think it is a very important line of investigation that deserves more theoretical and experimental attention. All the data and methods are clearly presented, and the software for the research is publicly available. Sufficient information is given to reproduce all results. However, I have two major issues relating to the theoretical part of the research.

      Issue One: Interpretation of fitness gains under stabilising selection

      A central issue concerns how the manuscript defines and interprets developmental systems drift (DSD) in relation to evolution on the fitness landscape. The authors define DSD as the conservation of a trait despite changes in its underlying genetic basis, which is consistent with the literature. However, the manuscript would benefit from clarifying the relationship between DSD, genotype-to-phenotype maps, and fitness landscapes. Very simply, we can say that (i) DSD can operate along neutral paths in the fitness landscape, (ii) DSD can operate along adaptive paths in the fitness landscape. During DSD, these neutral or adaptive paths along the fitness landscape are traversed by mutations that change the gene regulatory network (GRN) and consequent gene expression patterns whilst preserving the developmental outcome, i.e., the phenotype. While this connection between DSD and fitness landscapes is referenced in the introduction, it is not fully elaborated upon. A complete elaboration is critical because, when I read the manuscript, I got the impression that the manuscript claims that DSD is prevalent along neutral paths in the fitness landscape, not just adaptive ones. If I am wrong and this is not what the authors claim, it should be explicitly stated in the results and discussed. Nevertheless, claiming DSD operates along neutral paths is a much more interesting statement than claiming it operates along adaptive paths. However, it requires sufficient evidence, which I have an issue with. The issue I have is about adaptations under stabilising selection. Stabilising selection occurs when there is selection to preserve the developmental outcome. Stabilising selection is essential to the results because evolutionary change in the GRN under stabilising selection should be due to DSD, not adaptations that change the developmental outcome. 

      To ensure that the populations are under stabilising selection, the authors perform clonal experiments for 100,000 generations for 8 already evolved populations, 5 clones for each population. They remove 10 out of 40 clones because the fitness increase is too large, indicating that the developmental outcome changes over the 100,000 generations. However, the remaining 30 clonal experiments exhibit small but continual fitness increases over 100,000 generations. The authors claim that the remaining 30 are predominantly evolving due to drift, not adaptations (in the main text, line 137: "indicating predominantly neutral evolution", and section M: "too shallow for selection to outweigh drift"). The authors' evidence for this claim is a mathematical analysis showing that the fitness gains are too small to be caused by beneficial adaptations, so evolution must be dominated by drift. I found this explanation strange, given that every clone unequivocally increases in fitness throughout the 100,000 generations, which suggests populations are adapting. Upon closer inspection of the mathematical analysis (section M), I believe it will miss many kinds of adaptations possible in their model, as I now describe. The mathematical analysis treats fitness as a constant, but it's a random variable in the computational model. Fitness is a random variable because gene transcription and protein translation are stochastic (Wiener terms in Eqs. (1)-(5)) and cell positions change for each individual (Methods C). So, for a genotype G, the realised fitness F is picked from a distribution with mean μ_G and higher order moments (e.g., variance) that determine the shape of the distribution. I think these assumptions lead to two problems. The first problem with the mathematical analysis is that F is replaced by an absolute number f_q, with beneficial mutations occurring in small increments denoted "a", representing an additive fitness advantage.

      The authors then take a time series of the median population fitness from their simulations and treat its slope as the individual's additive fitness advantage "a". The authors claim that drift dominates evolution because this slope is lower than a drift-selection barrier, which they derive from the mathematical analysis. This analysis ignores that the advantage "a" is a distribution, not a constant, which means that it does not pick up adaptations that change the shape of the distribution. Adaptations that change the shape of the distribution can be adaptations that increase robustness to stochasticity. Since there are multiple sources of noise in this model, I think it is highly likely that robustness to noise is selected for during these 100,000 generations. The second problem is that the mathematical analysis ignores traits that have higher-order effects on fitness. A trait has higher-order effects when it increases the fitness of the lineage (e.g., offspring) but not the parent. One possible trait that can evolve in this model with higher-order effects is mutational robustness, i.e., traits that lower the expected mutational load of descendants. Since many kinds of mutations occur in this model (Table 2), mutational robustness may also be evolving. Taken together, the analysis in Section M is set up to detect only immediate, deterministic additive gains in a single draw of fitness. It therefore cannot rule out weak but persistent adaptive evolution of robustness (to developmental noise and/or to mutations), and is thus insufficient evidence that DSD is occurring along neutral paths instead of adaptive paths. The small but monotonic fitness increases observed in all 40 clones are consistent with such adaptation (Fig. S3). The authors also acknowledge the evolution of robustness in lines 129-130 and 290-291, but the possibility of these adaptations driving DSD instead of neutral evolution is not discussed.

      To address the issue I have with adaptations during stabilising selection, the authors should, at a minimum, state clearly in their results that DSD is driven by both the evolution of robustness and drift. Moreover, a paragraph in the discussion should be dedicated to why this is the case, and why it is challenging to separate DSD through neutral evolution vs DSD through adaptations such as those that increase robustness. [OPTIONAL] A more thorough approach would be to make significant changes to the manuscript by giving sufficient evidence that the experimental clones are evolving by drift, or changing the model construction. One possible way to provide sufficient evidence is to improve the mathematical analysis. Another way is to show that the fitness distributions (both without and with mutations, like in Fig. 2F) do not significantly change throughout the 100,000 generations in experimental clones. It seems more likely that the model construction makes it difficult to separate the evolution of robustness from evolution by drift in the stabilising selection regime. Thus, I think the model should be constructed differently so that robustness against mutations and noise is much less likely to evolve after a "fitness plateau" is reached. This could be done by removing sources of noise from the model or reducing the kinds of possible mutations (related to issue two). In fact, I could not find justification in the manuscript for why these noise terms are included in the model, so I assume they are included for biological realism. If this is why noise is included, or if there is a separate reason why it is necessary, please write that in the model overview and/or the methods.
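
      One way the suggested comparison of fitness distributions could be set up is sketched below. It is an illustration only, not the authors' or the referee's method: it assumes that realised fitness values for a clone can be sampled at an early and a late generation, and it compares the two samples with a two-sample Kolmogorov-Smirnov statistic and a permutation test, which responds to changes in the shape of the distribution (for example, reduced variance) as well as in the median. The function names, sample sizes, and synthetic numbers are all hypothetical.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def permutation_pvalue(a, b, n_perm=500, seed=0):
    """Return (observed KS statistic, permutation p-value) for samples a and b."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic stand-in data (NOT simulation output): same centre, narrower late spread.
    rng = random.Random(42)
    early = [rng.gauss(0.90, 0.05) for _ in range(200)]
    late = [rng.gauss(0.90, 0.02) for _ in range(200)]
    d, p = permutation_pvalue(early, late)
    print(f"KS statistic = {d:.3f}, permutation p-value = {p:.4f}")
```

      A small p-value here would indicate that the fitness distribution changed between the two time points even when its centre stayed put, which is the kind of change a slope fitted to the median cannot pick up.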

      Issue two: The model construction may favour DSD

      In this manuscript, fitness is determined by the expression pattern of two types of genes (genes 12 and 13 in Table 1). There are 14 types of genes in total that can all undergo many kinds of mutations, including duplications (Table 2). Thus, gene regulatory networks (GRNs) encoded by genomes in this model tend to contain large numbers of interactions. The results show that most of these interactions have minimal effect on reaching the target pattern in high fitness individuals (e.g. Fig. 2F). A consequence of this is that only a minimal number of GRN interactions are conserved through evolution (e.g. Fig. 2D). From these model constructions and results from evolutionary simulations, we can deduce that there are very few constraints on the GRN. By having very few constraints on the GRN, I think it makes it easy for a new set of pattern-producing traits to evolve and subsequently for an old set of pattern-producing traits to be lost, i.e., DSD. Thus, I believe that the model construction may favour DSD. I do not have an issue with the model favouring DSD because it reflects real multicellular GRNs, where it is thought that a minority fraction of interactions are critical for fitness and the majority are not. However, it is unknown whether GRNs in the model are more or less constrained than real GRNs. Thus, it is not known whether the prevalence of DSD in this model applies generally to real development, where GRN constraints depend on so many factors. At a minimum, the possible difference in constraints between the model and real development should be discussed as a limitation of the model. A more thorough change to the manuscript would be to test the effect of changing the constraints on the GRN. I am sure there are many ways to devise such a test, but I will give my recommendation here.

      [OPTIONAL] My recommendation is that the authors should run additional simulations with simplified mutational dynamics by constraining the model to N genes (no duplications and deletions), of which M out of these N genes contribute to fitness via the specific pattern (with M=2 in the current model). The authors should then test the effect of changing N and M independently, and how this affects the prevalence of DSD. If the prevalence of DSD is robust to changes in N and M, it supports the authors' argument that DSD is highly prevalent in developmental evolution. If DSD prevalence is highly dependent on M and/or N, then the claims made in the manuscript about the prevalence of DSD must change accordingly. I acknowledge that these simulations may be computationally expensive, and I think it would be great if the authors knew (or devised) a more efficient way to test the effect of GRN constraints on DSD prevalence. Nevertheless, these additional simulations would make for a potentially very interesting manuscript.
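
      The independent sweep over N and M described above could be organised roughly as follows. This is a hedged sketch, not part of the authors' software: `simulate` is a hypothetical callable standing in for one evolutionary run with N genes, M fitness-determining genes, and a random seed, and is assumed to return some per-run measure of DSD prevalence. The dummy function in the demo exists only to make the example runnable and carries no biological meaning.

```python
from itertools import product


def sweep_grid(n_values, m_values):
    """All (N, M) pairs with 1 <= M <= N: M fitness genes out of N genes in total."""
    return [(n, m) for n, m in product(n_values, m_values) if 1 <= m <= n]


def run_sweep(simulate, n_values, m_values, replicates=3):
    """Call simulate(n, m, seed) for every valid (N, M) pair and replicate seed."""
    results = {}
    for n, m in sweep_grid(n_values, m_values):
        results[(n, m)] = [simulate(n, m, seed) for seed in range(replicates)]
    return results


if __name__ == "__main__":
    # Dummy stand-in for a real simulation: it just echoes its arguments.
    def dummy_simulate(n_genes, n_fitness_genes, seed):
        return {"N": n_genes, "M": n_fitness_genes, "seed": seed}

    for key, runs in run_sweep(dummy_simulate, [4, 8, 14], [2, 4, 8]).items():
        print(key, len(runs), "replicates")
```

      Varying N at fixed M and M at fixed N then corresponds to reading the resulting table along its rows and columns.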

      Minor comments:

      1. The authors present an analysis correlating conserved non-coding sequence (CNS) composition with gene expression to investigate developmental systems drift. One flaw of this analysis is that it uses deeply conserved sequences as a proxy for the entire cis-regulatory landscape. The authors acknowledge this flaw in the discussion. Another potential flaw is equating the bulk RNA-seq data with a conserved phenotype. In lines 226-227 of the manuscript, it is written that "In line with our computational model, we compared gene expression patterns to measure changes in phenotype." I am not sure if there is an equivalence between the two. In the computational model, the developmental outcome determining fitness is a spatial pattern, i.e., an emergent product of gene expression and cell interactions. In contrast, the RNA-seq data shows bulk measurements in gene expression for different organs. It is conceivable that, despite having very similar bulk measurements, the developmental outcome in response to gene expression (such as a spatial pattern or morphological shape) changes across species. I think this difference should be explicitly addressed in the discussion. The authors may have intended to discuss this in lines 320-326, although it is unclear to me.
      2. Can the authors justify using these six species in the discussion or the results? Are there any limitations with choosing four closely related and two distantly related species for this analysis, in contrast to, say, six distantly related species? If so, please elaborate in the discussion.
      3. In Figure S7, some profiles show no conservation across the six species. Can we be sure that a stabilising selection pressure conserves any CNSs? Is it possible that the deeply conserved CNSs mentioned in the main text are conserved by chance, given the large number of total CNSs? A brief comment on these points in the results or discussion would be helpful.
      4. Line 7-8: I thought this was a bit difficult to read. The connection between (i) evolvability of complex phenotypes, (ii) neutral/beneficial change hindered by deleterious mutations, and (iii) DSD might not be so simple for many readers, so I think it should be rewritten. The abstract was well written, though.
      5. Line 274 vs 276: Is there a difference between regulatory dynamics and regulatory mechanisms?
      6. Figure S4: Do you expect the green/blue lines to approach the orange line in the long term? In some clonal experiments, it seems like it will. In others, it seems like it has plateaued. Under continual DSD, I assume they should converge. It would be interesting to see simulations run sufficiently long to see if this occurs.
      7. Line 27: Evolutionarily instead of evolutionary?
      8. Line 67-68: References in brackets?
      9. Line 144: Capitalise "fig"
      10. Fig. 3C caption: correct "1, 2, 4, 11" (should be 8)
      11. Line 192: Reference repeated
      12. Fig. 5 caption: Capitalise "Supplementary figure"
      13. Line 277: Correct "A previous model Johnson.."
      14. Line 290: Brackets around reference
      15. Line 299: Correct "will be therefore be"
      16. Line 394: Capitalise "table"
      17. Line 449: Correct "was build using"
      18. Fig. 5B: explain the red dashed boxes in the caption
      19. Some of the Figure panels might benefit from further elaboration in their respective captions, such as 3C and 5F.

      Significance

      General Assessment:

      This manuscript tackles a fundamental evolutionary problem of developmental systems drift (DSD). Its primary strength lies in its integrative approach, combining a multiscale evo-devo model with a comparative genomic analysis in angiosperms. This integrative approach provides a new way of investigating how developmental mechanisms can evolve even while the resulting phenotype is conserved. The details of the theoretical model are well defined and succinctly combined across scales. The manuscript employs several techniques to analyse the conservation and divergence of the theoretical model's gene regulatory networks (GRNs), which are rigorous yet easy to grasp. This study provides a strong platform for further integrative approaches to tackle DSD and multicellular evolution.

      The study's main limitations are due to the theoretical model construction and the interpretation of the results. The central claim that DSD occurs extensively through predominantly neutral evolution is not sufficiently supported, as the analysis does not rule out an alternative: DSD is caused by adaptive evolution for increased robustness to developmental or mutational noise. Furthermore, constructing the model with a high-dimensional GRN space and a low-dimensional phenotypic target may create particularly permissive conditions for DSD, raising questions about the generality of the theoretical conclusions. However, these limitations could be resolved by changes to the model and further simulations, although these require extensive research. The genomic analysis uses cis-regulatory elements as a proxy for the entire regulatory landscape, a limitation the authors are aware of and discuss. The genomic analysis uses bulk RNA-seq as a proxy for the developmental outcome, which may not accurately reflect differences in plant phenotypes.

      Advance:

      The concept of DSD is well-established, but mechanistic explorations of its dynamics in complex multicellular models are still relatively rare. This study represents a mechanistic advance by providing a concrete example of how DSD can operate continuously under stabilising selection. I found the evolutionary simulations and subsequent analysis of mechanisms underlying DSD in the theoretical model interesting, and these simulations and analyses open new pathways for studying DSD in theoretical models. To my knowledge, the attempt to directly link the dynamics from such a complex evo-devo model to patterns of regulatory element conservation across a real phylogeny (angiosperms) is novel. However, I think that the manuscript does not have sufficient evidence to show a high prevalence of DSD through neutral evolution in their theoretical model, which would be a highly significant conceptual result. The manuscript does have sufficient evidence to show a high prevalence of DSD through adaptive evolution under stabilising selection, which is a conceptually interesting, albeit somewhat expected, result.

      Audience:

      This work will be of moderate interest to a specialised audience in the fields of evolutionary developmental biology (evo-devo), systems biology, and theoretical/computational biology. Researchers in these areas will be interested in the model and the dynamics of GRN conservation and divergence. The results may interest a broader audience across the fields of evolutionary biology and molecular evolution.

      Expertise:

      My expertise is primarily in theoretical and computational models of biology and biophysics. While I have sufficient background knowledge in bioinformatics to assess the logic of the authors' genomic analysis and its connection to their theoretical model, I do not have sufficient expertise to critically evaluate the technicalities of the bioinformatic methods used for the identification of conserved non-coding sequences (CNSs) or analysis of RNA-seq data. A reviewer with expertise in plant comparative genomics would be better suited to judge the soundness of these specific methods.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, van der Jagt and co-workers present a computational model of the evolution of gene regulatory networks that underpin the development of shoot apical meristems in plants. They find evidence for conservation of a subset of regulatory interactions over many thousands of generations. They also show that after reaching a fitness plateau, the topology of regulatory interactions continues to evolve, giving rise to substantial differences in regulatory networks among cloned populations. Their model suggests that cis-regulatory rewiring is key for developmental evolution, and they reach a similar conclusion after analysing two empirical datasets covering six land plant species. Overall, I find that this study is excellently executed, its methodology sufficiently described, and that its claims are well-supported by the data presented.

      Major comments:

      • Every computational model necessarily makes some simplifying assumptions. It would be nice if the authors could summarise in a paragraph in the Discussion the main assumptions made by their model, and which of those are most worth revisiting in future studies. In the current draft, some assumptions are described in different places in the manuscript, which makes it hard for a non-expert to evaluate the limitations of this model.
      • I did not find any mention of potential energetic constraints or limitations in this model. For example, I would expect high levels of gene expression to incur significant energy costs, resulting in evolutionary trade-offs. Could the authors comment on how taking energy limitations into account might influence their results?

      Minor comments:

      • Figure 3C lists Gene IDs 1, 2, 8, and 11, but the caption refers to genes 1, 2, 4, and 11.

      Significance

      I have to note that my expertise is not in developmental systems drift, but I am generally interested in the evolution of complex phenotypes in response to various environmental pressures. Thus, I do not feel qualified to evaluate the novelty of this work, which I hope other reviewers have done. Nevertheless, I found this study very interesting and the manuscript generally easy to understand. I believe that this study will be of strong interest primarily (but not only) to evolutionary and systems biologists, regardless of the taxonomic group of their research focus.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      On the basis of computational modelling and bioinformatic data analysis, the authors report evidence for Developmental System Drift in the plant apical meristem (a plant stem cell tissue from which other tissues and organs grow, like shoots and roots). The modelling focuses on a general (shoot) apical meristem, the data analysis on the floral meristem. As a non-plant computational biologist, I was lacking some basic plant biology to immediately understand all the technical terms. It hindered a bit, but was not a show-stopper. That said, I interpret their study as follows.

      In the computational modelling part, the authors take into account gene expression, protein complex formation, stochasticity (expression noise), tissue shape, etc. to do evolutionary simulations to obtain a "standard" gene expression pattern known from the shoot apical meristem. Next, they analyze the gene regulatory networks in terms of conserved regulatory interactions. They find two timescales: interactions either turn over quickly or are replaced slowly (because they are under selection). The slowly replaced interactions are important for the realization of the phenotype and their turnover (further explored in a separate set of "neutral evolution" simulations) is called DSD by the authors. The authors state that at the basis of DSD is overlap in gene expression domains, such that genes can take over from each other. Next, the authors analyze two public data sets to show that DSD-associated phenomena such as turnover of (conserved) noncoding sequences and differences in gene expression patterns occur in plants.

      Considering my limited amount of time and energy, I apologize in advance for stupidities and/or inelegantly formulated sentences. I'll be happy to discuss this work with the authors; it was a pleasant summer read!

      Anton Crombach

      Major comments

      • It is system drift, not systems drift (see True and Haag 2001). No 's' after system.
      • I am afraid I have a problem with the manuscript title. I think "Ubiquitous" is misplaced, because it strongly suggests you have a long list of case studies across plants and animals, and some quantification of DSD in these two kingdoms. That would have been an interesting result, but it is not what you report. I suggest something along the lines of "System drift in the evolution of plant meristem development", similar to the short title used in the footer.
      • Alternatively, the authors may aim to say that DSD happens all over the place in computational models of development? In that case the title should reflect that the claim refers to modeling. (But what then about the data analysis part?)
      • The observation of DSD in the computational models remains rather high-level in the sense that no motifs, mechanisms, subgraphs, mutations or specific dynamics are reported to be associated with it, with the exception of overlapping gene expression domains. Perhaps the authors feel it is beyond this study, but a Results section with a more in-depth "mechanistic" analysis of what enables DSD would (a) make a better case for the extensive and expensive computational models and (b) push this paper to the next level. As a starting point, it could be nice to check Ohno's intuition that gene duplications are a creative "force" in evolution. Are they drivers of DSD? Or are TFBS mutations responsible for the majority of cases?
      • Multiple times in the Abstract and Introduction the authors make statements on "cis-regulatory elements" that are actually "conserved non-coding sequences" (CNS). Even if it is not uncommon for CNSs to harbor enhancers etc., I would be very hesitant to use the two as synonyms. As the authors state themselves, sequences, even non-coding, can be conserved for many reasons other than CREs. I would ask the authors to better support their use of "CREs" or adjust the language. As roughly stated in their Discussion (lines 310-319), one way forward could be to show, for a few CNSs that are important in the analysis (of Fig 5), that they have experimentally verified enhancers. Is that doable or a bridge too far?

      Minor comments

      Statement of significance:

      • line 7. evo-devo is jargon
      • l9. I would think "using a computational model and data analysis"
      • l13. Strictly speaking you did not look at CREs, but at conserved non-coding sequences.
      • l14. "widespread" is exaggerated here, since you show it for a single organ in a handful of plant species. You may extrapolate and argue that you do not see why it should not be widespread, but you did not show it. Or tie in all the known cases that can be found in the literature.

      Abstract:

      • l16. "simpler" than what?
      • l27. Again the tension between CREs and non-coding sequence.
      • l28. I don't understand the use of "necessarily" here.

      Introduction:

      • l34-35. A very general biology statement is backed up by two modeling studies. I would have expected also a few based on comparative analyses (e.g., fossils, transcriptomics, etc).
      • l36. I was missing the work on "phenogenetic drift" by Weiss; and Pavlicev & Wagner 2012 on compensatory mutations.
      • l38. Kimura and Wagner never had a developmental process in mind, which is much bigger than a single nucleotide or a single gene, respectively. First paper that I am aware of that explicitly connects DSD to evolution on genotype networks is my own work (Crombach 2016), since the editor of that article (True, of True and Haag 2001) highlighted that point in our communications.
      • l40. While Huynen and Hogeweg definitely studied the GP map in many of their works, the term goes back to Pere Alberch (1991).
      • l54-55. I'm missing some motivation here. If one wants to look at multicellular structures that display DSD, vulva development in C. elegans and related worms is an "old" and extremely well-studied example. Also, studies on early fly development by Yogi Jaeger and his co-workers are not multicellular, but at least multi-nuclear.
      • Obviously these are animal-based results, so to me it would make sense to make a contrast animal-plant regarding DSD research and take it from there.
      • l66-86. It is a bit of a style-choice, but this is a looong summary of what is to come. I would not have done that. Instead, in the Introduction I would have expected a bit more digging into the concept of DSD, mention some of the old animal cases, perhaps summarize where in plants it should be expected. More context, basically.

      Results:

      • l108. Could you quantify the conserved interactions shared between the populations? Or is each simulation so different that they are pretty much unique?
      • l169. "DSD driving functional divergence" needs some context, since DSD is supposed to not affect function (of the final phenotype). Or am I misunderstanding?
      • l171. You discuss an example here, would it be possible to generalize this analysis and quantify the amount of DSD amongst all cloned populations? And related question: of the many conserved interactions in Fig 4A, how many do the two clonal lineages share? None? All?
      • l176. Say which interaction it is. Is it 0->8, as mentioned in the next paragraph?
      • l190. In the section on DSD in plant gene regulation, the repeated explanation of where the data comes from is a bit tedious to read. You introduce it clearly at the start; that is enough.
      • l197. Bulk RNAseq has the problem of averaging gene expression over the population of cells. How do you think that impacts your test for rewiring? If you would do a similar "bulk RNA" style test on your computational models, would you pick up DSD?
      • l202. I do not understand the "within" of a non-coding sequence within an orthogroup. How are non-coding sequences inside an orthogroup of genes?
      • l207-217. This paragraph is difficult to read and would benefit from rephrasing. Plant-specific jargon, numbers do not add up (line 211), statements are rather implicit (9 deeply conserved CNS are the 3+6? Where do I see them in Fig 5B? And where do I see the lineage-specific losses?).
      • l223. Looking at the shared CNS between SEP1-2, can you find a TF binding site or another property that can be interpreted as regulatory importance?
      • l225. My intuition says that the continuity of the phenotype may not be necessary if its loss can be compensated for somehow by another part of the organism. I.e., DSD within DSD. It is a poorly elaborated thought, I leave it here for your information. Perhaps a Discussion point?
      • l233. "rarely"? I don't see any high Pearson distances.

      • Fig 4. Re-order of panels? I was expecting B at C and vice versa.

      • Fig 5B. Red boxes not explained. Mention that it is an UpSet plot?
      • Fig 5D. It would be nice to quantify the minor and major diffs between orthologs and paralogs.

      Discussion:

      • l247. Over-generalization. In a specific organ of plants...
      • l249. Where exactly is this link between diverse expression patterns and the Schuster dataset made? I suggest the authors make it more explicit in the Results.
      • l268. Final sentence of the paragraph left me puzzled. Why talk about opposite function?
      • l269. What about phenotypic plasticity due to stochastic gene expression? Does it play a role in DSD in your model? I am thinking about https://pubmed.ncbi.nlm.nih.gov/24884746/ and https://pubmed.ncbi.nlm.nih.gov/21211007/
      • l269. What about time scales generated by the system? Looking at Fig 2C and 2D, the elbow pattern is pretty obvious. That means interactions sort themselves into either short-lived or long-lived. Worth mentioning?
      • l291. Evolution in a constant fitness landscape increases robustness.
      • l296. My thoughts, for your info: I suspect morphogenesis as single parameters instead of as mechanisms makes for a brittle landscape, resulting in isolated parts of the same phenotype.

      Methods: I only skimmed the Methods section; I did not have time to dig in. I hope another reviewer can compensate for me.

      Significance

      Nature and significance of advance

      I find this study a strong contribution to the concept of DSD. It was good to see that colleagues have made the effort to build a convincing case for the presence of DSD in plants. This will be appreciated by the evo-devo community in general. On top of that, the computational modelling work is excellent and sets new standards that will be appreciated by computational colleagues. And I anticipate that the evolutionary biology community will welcome the extension of DSD to the plant kingdom; so far it has been dominated by animal studies.

      I see two limitations: (1) almost no mechanistic explanation of what drives DSD in the simulations. (2) the Abstract, Introduction, etc. need some polishing to be better in line with the results reported.

      Context of existing literature

      Literature is very modeling focused, it could use some empirical support. Also, some literature on DSD is missing: Weiss 2005, Pavlicev 2012, "Older" C. elegans work by the group of Marie-Anne Felix. Probably some more recent empirical case studies have established DSD as well... I may not be aware, as I did not keep track of it.

      What audience?

      In no particular order: plant evolution, plant development, evo-devo, computational biology.

      My field of expertise

      My expertise: gene regulatory networks, evolution, development (in animals), computational modelling, bioinformatic data analysis (single cell omics).

      Phylogenetic tree building is surely not my strength.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all the reviewers for their comments and suggestions, which have helped in revising the manuscript for a broader audience. Some of the experiments suggested by the reviewers have been performed and included in the revised manuscript. The response to reviewers is provided below their comments.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      MprF proteins exist in many bacteria to synthesize aminoacyl phospholipids that have diverse biological functions, e.g. in the defense against small cationic peptides. They integrate two functions, the aminoacylation of lipids, i.e. the transfer of Lys, Arg or Ala from tRNAs to the head group, and the flipping of these modified lipids to the membrane outer leaflet. The authors present structures of MprF from Pseudomonas aeruginosa and describe these structures in great detail. As MprF enzymes confer antibiotic resistance and are therefore highly important, studying them is significant and interesting. Consequently, their structures have been substantially characterized in recent years, including the publication of the dimeric full-length MprF from Rhizobium (Song et al., 2021).

      While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule description of the structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      It is true that we have described the experimental details of PaMprF in detail, including the constructs. We had reconstructed the map of dimeric PaMprF in 2020, but with the publication of the homologous structures (Song et al 2021 and the unpublished Rhizobium etli structure), we had to make sure the PaMprF dimer is not an artefact. Hence our attempts to rule this out with different constructs and extensive testing with various detergents. Thus, we would like to keep this in the manuscript. We realise the importance of focusing on novel/interesting parts and have reshuffled sections (comparing structures and validating the dimer interface), followed by a description of the modelling of lipid molecules.

      Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, both of which has not been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      The trials with cholesterol hemisuccinate stem more out of curiosity (we are aware that no cholesterol is present in bacterial membranes). We had started the initial analysis of PaMprF with DDM, and by itself it was largely monomeric (unpublished observation, supported by the recent publication of PaMprF in DDM by Hankins et al 2025). When we observed that GDN was essential for the stability of the dimer (and that even LMNG was not sufficient), we asked whether a combination of CHS with DDM would keep the dimer intact; it did not, and GDN was found to be important. The use of CHS for prokaryotic membrane protein studies has now been reported in a few different systems, a recent example being Caliseki et al., 2025. We would like to keep the observation with CHS in the manuscript, and we have moved this figure to Appendix Fig. S3C.

      In addition, in a recent report on MgtA, a magnesium transporter (Zeinert et al., 2025), it was observed that DDM/LMNG resulted in monomeric enzyme, while GDN resulted in dimeric enzyme, albeit with the dimer interface in the soluble domain. We have added this reference and the observation on MgtA to the discussion (page 13, lines 407-411).

      We like to think that the milder GDN tends to keep the membrane proteins or oligomers of membrane proteins more stable but further studies on multiple labile membrane protein systems will be required to substantiate this.

      Hence, I think that the data presented here would be worth publishing. However, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

      Further major points:

      • The authors always jump between their structures in detergent and nanodisc during all the descriptions, which makes following the story even more difficult. Please first describe one of the structures and then (briefly) discuss relevant similarities and differences afterwards.

      The flow and description of the structures have now been modified and the figures have been rearranged to make them easier to follow. The panel in figure 2 showing the overlay of the GDN and nanodisc structures has been moved to Appendix Fig. S2B. Thus, figure 2 now describes only the salient features of the structures (the interacting residues between the membrane and soluble domain) and the terminal helix.

      • The difference in dimerization between Pseudomonas and Rhizobium is the most interesting and surprising feature (if true) of the new structures. However, it is not really presented as such. The authors should put more emphasis on making clear that this is a complete rotation of the monomers with respect to each other (by how many degrees?) and they should visualize it even more clearly in Figure 4 (and label the figure so that it is possible to understand it without having to read the text or the legend first).

      We thought the colouring of the TM helices should make the difference in interface more obvious (the N and C-terminal TM helices in different colours). Now, we have also labelled the TM helices, so that it is easier to follow (this was also shown in panel E). The rotation is ~180° and this is now mentioned in the figure legend.

      • P. 10: The authors insinuate that only one of the dimer interfaces, either Pseudomonas or Rhizobium could be real, but disregard the possibility that both might be the biologically relevant interfaces of the respective species and that there might have been a switch of interfaces during evolution. They should also mention and discuss this possibility.

      We did not imply that only one of the interfaces is real, but clearly mentioned that it could also be a different conformational state (page 7, lines 226-228). In the revised version, we have included a multiple sequence alignment (we had not included it in the initial draft as it had been presented in several previous publications). The MSA (Appendix Fig. S6) reveals that neither of the interfaces is highly conserved.

      • Fig. 5G: The authors claim that the higher molecular band that appears in the mutant is a "dimer with aberrant migration" of >250 kDa as opposed to the expected 150 kDa. They should explain how they came to this conclusion and how they can be sure that the band does not correspond to a higher oligomer (trimer or tetramer). They could show, by extraction and purification scheme similar to the wildtype using first LMNG and then GDN, followed by at least a preliminary EM analysis, that the crosslinked mutant MprF is indeed a dimer, or use other biophysical methods to do the same, otherwise this experiment does not show much. Furthermore, they should also include a cysteine mutant in the part of Pseudomonas MprF that would be involved in a Rhizobium-like interface in their crosslinking experiments to check whether they could also stabilize dimers in this case.

      The band of the double mutant after crosslinking (or even without crosslinking) migrates at a higher molecular weight than that expected for a dimer, and could potentially be a higher molecular weight species than a dimer. We also note that in the previous publication by Song et al 2021, the crosslinking of RtMprF also resulted in a higher molecular weight band (shown also by Western blot).

      We now substantiate the dimer of PaMprF with different approaches. We employed blue-native gel and also SDS-PAGE of the purified protein. This clearly shows that the higher molecular weight band after crosslinking is a dimer (Figure 4B and Fig. EV4D). In particular, in the BN-PAGE, the treatment of mutants with crosslinkers revealed a dimeric band even in the presence of SDS. Further, we have performed cryoEM analysis of the mutants H386C/F389C and H566C. The images, classes and reconstruction show that the enzyme forms a dimer similar to the WT. Interestingly, in the H566C mutant in nanodisc we also observe a small population with an architecture similar to the Rhizobium-like interface (classes shown in Fig. EV7 and Appendix Fig. S5). This prompted us to look closely at other datasets, and it is clear that during the process of reconstitution in nanodisc we observe both kinds of dimer interface, but the PaMprF dimer is predominant. We also observe higher-order oligomers (tetramers) in GDN, but as only a few views are visible, a reconstruction could not be obtained (Appendix Fig. S5). In addition, we introduced two cysteines on the Rhizobium-like interface and no crosslinking on the membranes was observed (Figure 4B). However, it is possible that the chosen mutants are not accessible to the crosslinker. Thus, we conclude that the oligomers of PaMprF are labile and sensitive to the nature of the detergent.

      • As the question whether the observed interface is real or an artefact is very central to the value of the structural data and the drawn conclusions from it, the authors should make more effort to analyze and try to validate the interface. First, an analysis of interface properties (buried surface area, nature of the interactions, conservation) should be performed for the interface as observed in the Pseudomonas structure but also for a (hypothetical) Rhizobium-like interface of two Pseudomonas monomers (such a model of a dimer should be easily obtainable by AlphaFold using the available Rhizobium structures as models). Then, experimental methods such as FRET or crosslinking-MS would allow to draw more solid conclusions on the distances between potential interface residues. While these experiments are a certain effort, the question whether the dimer interface is real is so central to the paper that it would be worthwhile to make this effort.

      We have included the interface area and nature of interactions in the revised manuscript (page 7, lines 221-223).

      We attempted to predict the dimeric structure of PaMprF with AlphaFold (and also included RtMprF). Some of the prediction attempts are summarised in figure 1.

      The prediction of the monomer is of high confidence, but that of the oligomer (here, the dimer) is of low confidence (from ipTM values). Even the prediction for the Rhizobium enzyme has low confidence and gives a completely different architecture (and in some trials with lipids, it gives an inverted or non-physiological dimer). Only when the monomer of PaMprF with lipids and tRNA was given as input (requested by reviewer 2 and described below) did it predict an oligomeric structure with some confidence; the rest were not informative.

      • As it seems that detergents might disrupt or modify the dimer interface, it might be an alternative to solubilize the protein in a more native environment by polymer-stabilized nanodiscs using DIBMA or similar molecules.

      We have tried to use SMALPs for extraction of PaMprF. We were able to solubilise the enzyme but are currently unable to enrich it sufficiently for structural studies; this will require further optimisation.

      • Since parts of the Discussion are mostly repetitions of the Results part and other parts of the Discussion also contain a large extent of structure analysis one would usually rather expect in the Results part instead of the Discussion, the authors should consider condensing both to a combined (and overall much shorter) Results & Discussion section.

      We have rewritten much of the discussion section and removed any repetition from the results sections. We would prefer to keep the results and discussion separate.

      Minor points:

      • Explain abbreviations the first time they appear in the text, e.g. TTH

      This is now expanded at the first instance.

      • Figure labels are very minimalistic. This should be improved, e.g. by putting labels to important structural features that appear in the text, otherwise the figures are not an adequate support for the text.

      The font size for the labels has been increased.

      • Figure 5: Label where the different oligomers run on the gels

      Labelled.

      Reviewer #1 (Significance (Required)):

      While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule description of the structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, both of which has not been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      Hence, I think that the data presented here would be worth publishing. However, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Shaileshanand J. et al., reported the structures of Multiple Peptide Resistance Factor, MprF, which is a bi-functional enzyme in bacteria responsible for aminoacylation of lipid head groups. The authors purified MprF from Pseudomonas aeruginosa in GDN micelles and nanodiscs, and by applying cryo-EM single particle method, they successfully reached near-atomic resolution, and built corresponding atomic models. By applying structural analysis as well as biochemistry methods, the authors demonstrated dimeric formation of MprF, exhibited the dynamic nature of the catalytic domain of this enzyme, and proposed a possible model on tRNA binding and aminoacylation.

      Major comments

      1. In the abstract, the authors state 'Several lipid-like densities are observed in the cryoEM maps, which might indicate the path taken by the lipids and the coupling function of the two functional domains. Thus, the structure of a well characterised PaMprF lays a platform for understanding the mechanism of amino acid transfer to a lipid head group and subsequent flipping across the leaflet that changes the property of the membrane.' Firstly, those lipid-like densities are shown in Fig 3A. Since lipid densities in purified membrane proteins often lie in regions of relatively low local resolution or low quality, I think a more detailed description is required of how the authors decided which parts of the density belong to lipids and how they modelled some of the lipids. The authors also modelled phosphatidylglycerol into the GDN MprF structure; I would require an additional experiment, for instance mass spectrometry of the purified sample, to demonstrate the presence of this specific lipid in the sample. Secondly, regarding the last sentence of the abstract, how these structures lay a platform for further understanding is poorly discussed in both the Results and Discussion sections. The authors clearly state 'This cavity perhaps provides a path for holding lipids...', so the statement in the next sentence, 'Taken together... the vicinity to the cavities described above indicates the possible path taken by the lipids to enter and exit the enzyme', lacks reliable supporting evidence. I would suggest that the authors move these statements to the Discussion section and elaborate on this issue, since it is an important part of the abstract, or provide more solid proof using other approaches, such as molecular dynamics simulation, to substantiate these statements in the Results section.

      The membranes of E. coli contain predominantly phosphatidyl ethanolamine (PE), with phosphatidyl glycerol (PG) as the next most abundant lipid; cardiolipin, though less abundant, plays an important role in the functioning of many membrane proteins. In our map, the non-protein densities are unambiguous and appear as long densities characteristic of acyl chains (note that the GDN used in purification has no acyl chain); hence we attributed these densities to lipids (Fig. EV4E/F and Figure 5A). Only in a few of these densities could a head group be modelled, and the identification of the lipid at the dimer interface as PG is based on the general requirement of negatively charged lipids for oligomerisation of membrane proteins (for example, KcsA tetramer formation requires PG; Marius et al., 2005; Valiyaveetil et al., 2002; 2004). It is true that the lipid densities are at the peripheral regions of the map, but here only acyl chains have been modelled. Within the membrane domain, one reasonably ordered lipid is observed and, by analogy with the R. tropici structure, it is possible to build a modified PG (in PaMprF, ala-PG). However, the density of the head group is not unambiguous (unlike lysine in R. tropici, whose density stands out) and hence we have modelled it as PG alone. The identification and modelling of lipid densities are described in the Methods (page 20, lines 649-650).

      We agree that mass spectrometry analysis of purified lipids will be useful but it will not be able to tell the position of the lipid in the map (model) and for this we still require a map at higher resolution with better ordered lipids. We have recently built/developed the workflow for native MS and we plan to initiate analysis of PaMprF in the near future, which will provide details for the lipid purified with the enzyme.

      We had initiated molecular dynamics simulation during the review process, and we had included tRNA molecules (shorter version) as we felt the connection between tRNA binding and lipid modification was important. This would have also explained the path taken by lipids (performed by Hankins et al., 2025 in their publication). However, this is likely to require more work (and computing resources) and both mass spectrometry and molecular dynamics will be part of the future work.

      We have rewritten the discussion and changed the last line of the abstract to the following:

      “From the structures, the binding modes of tRNA and lipid transport can be postulated and the mobile secondary structural elements in the synthase domain might play a mechanistic role”.

      (in the abstract, lines 24-26).

      In Fig 2B, the H566 side chains appear to overlap in the zoom-in figure showing the distance measurement between the H566 residues. To clarify this, the authors should either present another, rotated view to better show their relative locations, or replace this zoom-in figure with a rotated one. Also, could the authors briefly comment on why they chose H566 specifically for the distance measurement?

      The side chains of residue H566 in the nanodisc model face each other at the interface; hence this residue was chosen to show the proximity.

      Related to previous comment, I see one additional green square in Fig. 2A and an additional green square in Fig. 2B, without any zoom-in images provided on these regions. Besides, they're focusing on two different domains with same color, any particular reason why they're there? If so, please provide the information in figure legends.

      The green squares in panels 2A and 2B are the regions that have been zoomed in panels 2D and 2E showing the interactions of the TTH. This is now made clear in the legend as well as in the figure.

      Related to previous comment, authors should also provide distance measurement over electrostatic interaction sites in Fig. 2A, since distance plays as an important factor in these forces.

      The electrostatic interactions have been included.

      For Fig. 2C, since in Fig. 1, the authors have already indicated the differences between reconstruction of the GDN and nanodisc datasets, this information provided here seems to be a bit abundant, I suggest either move this panel to Fig. 1, to make a visualization on both electron densities as well as atomic models, or move this panel to supplementary figures.

      We thank the reviewer for the suggestion. Panel 2C has been moved to Appendix Fig. S2B.

      Fig. 3B, some of the spheres of the lipids were also marked as red, any particular reason why they're red? Do they indicate they're phosphate heads? If so, could the authors provide evidences how they define these orientations of the lipid heads? If not, any particular reason why they're red?

      The non-protein densities (i.e., densities above noise that remain after modelling of the protein residues and are found individually) have been modelled as lipids (these additional densities are shown in Fig. EV4E). Except for a few, all these densities have been modelled only as acyl chains. The lipids modelled with head group and phosphate (which contain oxygen) and their fit to the density are shown in both figure 3A and EV4F. Hence, red (oxygen) is seen in the space-filling model of the lipids (the density for a few lipids is shown, also in the response to the comment below).

      Fig. 3C, the fitted model of lipid and its corresponding density should be added to Fig. S4, to give more detailed view on the quality of the fitting.

      Figure 3 has now been reorganised and the new figure (fig. 5) has only 3 panels. We have provided an enlarged view of the lipids in the membrane domain along with unmodelled densities in 3A. In addition, the fit of select lipids to the density is shown in fig. EV4F.

      Fig. 4D and 4E, could the authors also indicate the RMSD values when comparing the differences of RtMprF, PaMprF, ReMprF, this information would be helpful to understand how big of a difference within these three models.

      The RMSD values of the structural comparison are given in the text.

      Fig. 6E, the coloring used for CCA-Ala were similar to the blue part of soluble domain, could the authors change the coloring a bit? Also, for Fig. 6F, I would suggest the authors provide a prediction model, such as using AlphaFold3, of this tRNA interaction site, to further validate this proposed model.

      The colour of the CCA part has been changed in the revised figure. Following the suggestion of the reviewer, we used AlphaFold3 to predict the complex formation of PaMprF with tRNA (or a shorter version) (Figure 2). As mentioned above in response to reviewer 1, the prediction of the dimeric enzyme was of low confidence, and this is also reflected when a combination of tRNA, lipids and enzyme sequence is given. If only the CCA end is provided instead of the full-length tRNA, the prediction program does position it in the postulated cavity. Only with the monomeric enzyme and tRNA does one get a reasonable model. With respect to the proposed model in 6F, we currently don’t have any evidence and this remains a postulate. In the revised manuscript, we have replaced this with a conservation figure, which we think is more relevant.

      In Supplementary Figures S1 and S3, the angular distribution of maps exhibited preferred orientation to certain extent, 3D FSC estimation should also be supplied for these maps, as an indication of whether the reconstructed densities were affected or not.

      We have included the 3DFSC plots for all the data sets (including the new ones in figures EV1, 2, 5, 6, 7). It is evident that the nanodisc datasets in general are slightly anisotropic.

      For Fig S3B, could the authors switch to another image with better contrast?

      This is now replaced with an image to show the particles.

      Minor comments

      1. Fig. 2E and 2F, distance measurement should also be supplied to these two panels.

      We have now included the distance measurement in both the panels, which are now Fig. 2D and 2E.

      Fig. 5D, since in Fig. 4F and 4G already mentioned the skeleton of GDN, this modeling part should be presented before exhibit it in dimer interface, the authors should rearrange the sequence over these three panels.

      The figures in the revised manuscript have been rearranged. Figure 5 (now figure 4) has been modified to include the biochemical analysis (crosslinking studies) and panel 5D has been removed.

      In Supplementary Figure S3, which density was shown for the PaMprF local resolution estimation result? Authors should provide this information as two maps were shown in this figure.

      The local resolution is for the C2-symmetrised map and this is now mentioned in the panel.

      CROSS-REFEREE COMMENTS

      Both Reviewer #1 and #3 made comments over technical issue, their evaluation over functional aspects of this protein is what I was lacking over my comments, also, their evaluation of the biological narrative, relevance toward previous research is also more insightful. Finally, they offer valuable suggestions on how to adjust the article to make it more readable, and better describing the biological story which I would suggest the authors to pay attention to.

      Reviewer #2 (Significance (Required)):

      Significance

      The authors mainly focused on the structure of MprF in Pseudomonas aeruginosa, this protein is essential for the resistance to cationic antimicrobial peptides. A combination of structural and biochemical analysis provided evidences to the dimeric formation to this enzyme, and the analysis over differences of purified proteins using GDN and nanodisc was particular interesting, which provide new insight regarding the flexible nature of this enzyme, and potentially could be beneficial to the membrane protein community, as it demonstrates the differences in detergent/nanodisc of choice could affect the assembly of the protein of interest. Still, some of the statements in the manuscript, for instance, the assignment of lipids was over-claimed and could be benefited from additional approaches to support the issue. I would suggest some refinement in the discussion section as well as some of the figures.

      My expertise: cryo-EM single particle analysis; cryo-ET; sub-tomo averaging; cryo-FIB;

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Jha and Vinothkumar characterize the cryoEM structure of the alanyl-phosphatidylglycerol producing multiple peptide resistance factor (MprF) of Pseudomonas aeruginosa. MprF proteins mediate the transfer of amino acids from aminoacyl-tRNAs to negatively charged phospholipids resulting in reduced membrane interactions with cationic antimicrobial peptides (produced by the host and competing microorganisms). The phospholipid modifications involve in most cases the transfer of lysine or alanine to phosphatidylglycerol. MprF proteins are membrane proteins consisting of a soluble and hydrophobic domain. Multiple functional studies have shown that the soluble domain of MprF mediates the aminoacylation of phosphatidylglycerol, while the hydrophobic domain mediates the "flipping" of aminoacylated phospholipids across the membrane, a process that is crucial to repulse or prevent the interaction of antimicrobial peptides encountered at the outer leaflet of bacterial membranes. Aside from its role in conferring antimicrobial peptide resistance, other roles of MprF have been described including more physiological roles such as improving growth under acidic conditions. Interestingly, MprF proteins are also found in Gram-negative bacteria which are already protected by an additional membrane that includes LPS. However, in Pseudomonas aeruginosa, MprF confers phenotypes that are similar to those observed in Gram-positive bacteria. Importantly, crystal structures of the soluble domain have led to important insights into aminoacyl phospholipid synthesis and recent studies on the cryoEM structure of Rhizobium tropici have confirmed functional and preliminary structural studies with other MprF proteins. The cryoEM structure from R. tropici confirmed the dimeric structure of MprF and supported a role of the hydrophobic domain in flipping lysyl-phosphatidylglycerol across the membrane. 
A comparison of the structures of lysyl-phosphatidylglycerol with alanyl-phosphatidylglycerol producing MprFs could reveal new insights into the mechanism of transferring aminoacyl-phospholipids from the soluble domain to the hydrophobic domain and translocation of alanyl- vs lysyl-phosphatidylglycerol across the membrane.

      Major concerns

      1. The study by Jha and Vinothkumar provides the cryoEM structure of an alanyl-phosphatidylglycerol producing MprF protein which is in principle an important milestone in gaining a better understanding of the mechanism of aminoacyl-phospholipid synthesis and flipping, including the potentially different requirements of accommodating different aminoacyl -tRNAs and aminoacyl-phospholipid species. However, this is not addressed. The authors present a "distinct architecture" compared to the structure of R. tropici- MprF, without providing functional insights and the focus of the study shifts to the role of detergents in determining MprF structures via cryoEM. Thus, after fundamental discoveries have been made with crystal structures of the soluble domain and cryoEM structure of R. tropici, this study -while valuable as a resource- seems to offer only an incremental advance in understanding the mode of action of MprF and the potential different requirements for transferring alanyl-phosphatidylglycerol to the hydrophobic domain and flipping across the membrane. The reader is left with the finding of a distinct architecture with no further explanation or hypothesis.

      We thank the reviewer for his/her comments. It is true that the crystal structures of the soluble domains of MprF (from 3 species) and the cryoEM structures (from two Rhizobium species) are now available. However, the cryoEM maps that we have obtained have several salient features, including the distinct dimeric interface and the position of the C-terminal helix of the soluble domain. This in particular is important. In a previous study, Hebecker et al 2011 reported that the terminal helix of PaMprF is important for activity and that the construct without the TM domain can also function in modifying the lipids. The full-length cryoEM map of PaMprF in GDN now provides an idea of how this occurs, with the terminal helix buried at the interface. Further, the proposed tRNA binding sites (from Hebecker et al 2015, lysine amide bound structure) face each other in the dimeric architecture of R. tropici, and it is not clear how the full-length tRNA would bind without disrupting the dimer. In contrast, the dimer architecture observed for PaMprF has the tRNA binding sites facing away, and tRNA can bind to the enzyme without any constraints. We think the mobile/dynamic elements (or secondary structure) of the synthase domain play a major role in the interaction with substrates and in the mechanism. The current structures provide some evidence for this and form the basis of future studies. Instead of a cartoon description, we have now included a conservation plot of the molecule to explain the possible mechanism, along with the surface representation in figure 6.

      Differences to R.tropici MprF and other studies are difficult to follow as only a topological map of the Pseudomonas MprF is provided and conserved amino acids that have been shown to be crucial in mediating synthesis and flipping are not highlighted in the text or in the figures, specifically addressed, or discussed. Conserved amino acids in the presented cryoEM structure could provide important mechanistic insights and could address substrate specificity/requirements for aminoacyl phospholipid synthesis, transfer to the hydrophobic domain and flipping.

      The conservation of residues across MprF homologues has been presented in previously published articles and hence we had not initially included it in the manuscript. We have now included a multiple sequence alignment of select MprF homologues highlighting conserved residues (Appendix Fig. S6), as well as a figure (Fig. 6F) colouring the molecule by conservation scores from CONSURF. In the zoomed-in version in figure 6F, we highlight many of the conserved residues in the synthase domain, as they play a role in substrate selectivity.

      Authors characterize an alanyl-phosphatidylglycerol producing MprF but do not detect the lipid in the cryoEM structure. Thus, the potential path taken by alanyl-phosphatidylglycerol remains unclear. Authors model the detected lipids as phosphatidylglycerol, which may be an interesting finding as it would indicate that MprF is generally capable of flipping phospholipids (this is however not discussed). While it is plausible that MprF flippases may be able to flip phosphatidylglycerol it could have a different path and structural requirements. It is also difficult to follow what the suggested pathway of flipping is in the Pseudomonas-MprF flippase (compared to R.tropici). Authors could provide a similar overview figure as in Song et al. and indicate what the potential differences are.

      We modelled the lipid as phosphatidylglycerol because the current density does not allow ala-PG to be modelled unambiguously, though it is found in the same position as the lys-PG in the R. tropici maps. The recent in-vitro assay by Hankins et al 2025 shows that PaMprF is able to flip a wide range of lipids, and we would also like to point out that PG from the outer leaflet can be flipped, its headgroup modified at the inner leaflet, and the product flipped back. As shown by Song et al 2021 and Hebecker et al 2011, the specificity for the substrates lies in the synthase domain (by mutagenesis and swapping). We don’t think there will be any difference between the lys-PG and ala-PG paths, but in our opinion the positional relation between the soluble and membrane domains is the most important aspect and has remained the focus of the manuscript, along with the dimeric architecture. Figure 6 in the manuscript illustrates this and provides a summary of the structural observations from the presented structures.

      Minor concerns

      • Page 13: the following sentence should be rephrased: "Among the missing links in the current cryoEM maps is the lack of well-ordered density for lipid molecules on the inner leaflet closer to the re-entrant helices but it is reasonable to assume from the cluster of positive charge that there will be lipid molecules and are dynamic. "

      This has been rephrased.

      • Page 4: Klein et al do not show that the Pseudomonas aeruginosa MprF mediates flipping

      Corrected to reflect only the modification of lipid and not flipping.

      Reviewer #3 (Significance (Required)):

      General assessment: see review

      Advance: Minor

      Audience: Specialized

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Jha and Vinothkumar characterize the cryoEM structure of the alanyl-phosphatidylglycerol producing multiple peptide resistance factor (MprF) of Pseudomonas aeruginosa. MprF proteins mediate the transfer of amino acids from aminoacyl-tRNAs to negatively charged phospholipids resulting in reduced membrane interactions with cationic antimicrobial peptides (produced by the host and competing microorganisms). The phospholipid modifications involve in most cases the transfer of lysine or alanine to phosphatidylglycerol. MprF proteins are membrane proteins consisting of a soluble and hydrophobic domain. Multiple functional studies have shown that the soluble domain of MprF mediates the aminoacylation of phosphatidylglycerol, while the hydrophobic domain mediates the "flipping" of aminoacylated phospholipids across the membrane, a process that is crucial to repulse or prevent the interaction of antimicrobial peptides encountered at the outer leaflet of bacterial membranes. Aside from its role in conferring antimicrobial peptide resistance, other roles of MprF have been described including more physiological roles such as improving growth under acidic conditions. Interestingly, MprF proteins are also found in Gram-negative bacteria which are already protected by an additional membrane that includes LPS. However, in Pseudomonas aeruginosa, MprF confers phenotypes that are similar to those observed in Gram-positive bacteria. Importantly, crystal structures of the soluble domain have led to important insights into aminoacyl phospholipid synthesis and recent studies on the cryoEM structure of Rhizobium tropici have confirmed functional and preliminary structural studies with other MprF proteins. The cryoEM structure from R. tropici confirmed the dimeric structure of MprF and supported a role of the hydrophobic domain in flipping lysyl-phosphatidylglycerol across the membrane. 
A comparison of the structures of lysyl-phosphatidylglycerol with alanyl-phosphatidylglycerol producing MprFs could reveal new insights into the mechanism of transferring aminoacyl-phospholipids from the soluble domain to the hydrophobic domain and translocation of alanyl- vs lysyl-phosphatidylglycerol across the membrane.

      Major concerns:

      1. The study by Jha and Vinothkumar provides the cryoEM structure of an alanyl-phosphatidylglycerol producing MprF protein which is in principle an important milestone in gaining a better understanding of the mechanism of aminoacyl-phospholipid synthesis and flipping, including the potentially different requirements of accommodating different aminoacyl -tRNAs and aminoacyl-phospholipid species. However, this is not addressed. The authors present a "distinct architecture" compared to the structure of R. tropici- MprF, without providing functional insights and the focus of the study shifts to the role of detergents in determining MprF structures via cryoEM. Thus, after fundamental discoveries have been made with crystal structures of the soluble domain and cryoEM structure of R. tropici, this study -while valuable as a resource- seems to offer only an incremental advance in understanding the mode of action of MprF and the potential different requirements for transferring alanyl-phosphatidylglycerol to the hydrophobic domain and flipping across the membrane. The reader is left with the finding of a distinct architecture with no further explanation or hypothesis.

      2. Differences to R.tropici MprF and other studies are difficult to follow as only a topological map of the Pseudomonas MprF is provided and conserved amino acids that have been shown to be crucial in mediating synthesis and flipping are not highlighted in the text or in the figures, specifically addressed, or discussed. Conserved amino acids in the presented cryoEM structure could provide important mechanistic insights and could address substrate specificity/requirements for aminoacyl phospholipid synthesis, transfer to the hydrophobic domain and flipping.

      3. Authors characterize an alanyl-phosphatidylglycerol producing MprF but do not detect the lipid in the cryoEM structure. Thus, the potential path taken by alanyl-phosphatidylglycerol remains unclear. Authors model the detected lipids as phosphatidylglycerol, which may be an interesting finding as it would indicate that MprF is generally capable of flipping phospholipids (this is however not discussed). While it is plausible that MprF flippases may be able to flip phosphatidylglycerol it could have a different path and structural requirements. It is also difficult to follow what the suggested pathway of flipping is in the Pseudomonas-MprF flippase (compared to R.tropici). Authors could provide a similar overview figure as in Song et al. and indicate what the potential differences are.

      Minor concerns:

      1. Page 13: the following sentence should be rephrased: "Among the missing links in the current cryoEM maps is the lack of well-ordered density for lipid molecules on the inner leaflet closer to the re-entrant helices but it is reasonable to assume from the cluster of positive charge that there will be lipid molecules and are dynamic. "

      2. Page 4: Klein et al do not show that the Pseudomonas aeruginosa MprF mediates flipping

      Significance

      General assessment:

      The study by Jha and Vinothkumar provides the cryoEM structure of an alanyl-phosphatidylglycerol producing MprF protein which is in principle an important milestone in gaining a better understanding of the mechanism of aminoacyl-phospholipid synthesis and flipping, including the potentially different requirements of accommodating different aminoacyl -tRNAs and aminoacyl-phospholipid species. However, this is not addressed. The authors present a "distinct architecture" compared to the structure of R. tropici- MprF, without providing functional insights and the focus of the study shifts to the role of detergents in determining MprF structures via cryoEM. Thus, after fundamental discoveries have been made with crystal structures of the soluble domain and cryoEM structure of R. tropici, this study -while valuable as a resource- seems to offer only an incremental advance in understanding the mode of action of MprF and the potential different requirements for transferring alanyl-phosphatidylglycerol to the hydrophobic domain and flipping across the membrane

      Advance: Minor

      Audience: Specialized

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Shaileshanand J. et al., reported the structures of Multiple Peptide Resistance Factor, MprF, which is a bi-functional enzyme in bacteria responsible for aminoacylation of lipid head groups. The authors purified MprF from Pseudomonas aeruginosa in GDN micelles and nanodiscs, and by applying cryo-EM single particle method, they successfully reached near-atomic resolution, and built corresponding atomic models. By applying structural analysis as well as biochemistry methods, the authors demonstrated dimeric formation of MprF, exhibited the dynamic nature of the catalytic domain of this enzyme, and proposed a possible model on tRNA binding and aminoacylation.

      Major comments:

      1. In abstract, the authors stated 'Several lipid-like densities are observed in the cryoEM maps, which might indicate the path taken by the lipids and the coupling function of the two functional domains. Thus, the structure of a well characterised PaMprF lays a platform for understanding the mechanism of amino acid transfer to a lipid head group and subsequent flipping across the leaflet that changes the property of the membrane.' Firstly, those lipid-like densities were demonstrated in Fig 3A, since densities of lipids of purified membrane proteins often exist within regions of relatively low local resolution, or low quality, I think more detailed description on how the authors defined which part of the density belongs to lipid and how they acquired the modeling of some of the lipids is required. And the authors modeled phosphatidylglycerol into the GDN MprF, I would require additional experiment, for instance, mass spectrometry over the purified sample, to demonstrate the existence of this specific lipid with the sample. Secondly, regarding the last sentence in the abstract, how these structures lay a platform for further understanding was poorly discussed in both result section and discussion section, since the authors clearly stated 'This cavity perhaps provides a path for holding lipids...', then the statement in the next sentence 'Taken together... the vicinity to the cavities described above indicates the possible path taken by the lipids to enter and exit the enzyme' does not have a reliable evidence to support this conclusion, I would suggest the authors move these statements into discussion section, and elaborate more over this issue since it is an important part in the abstract, or make a more solid proof using other approaches, such as molecular dynamics simulation, to make these statements solid in the result section.

      2. Fig 2B, it seems the H566 sidechains were overlapping in the zoom-in figure of distance measurement between H566 residues, to clarify this, authors should either present another figure with rotation, to better demonstrate their relative locations, or swap this zoom-in figure with another figure with rotations. Also, could the authors briefly comment on why they chose H566 for distance measurement specifically?

      3. Related to previous comment, I see one additional green square in Fig. 2A and an additional green square in Fig. 2B, without any zoom-in images provided on these regions. Besides, they're focusing on two different domains with same color, any particular reason why they're there? If so, please provide the information in figure legends.

      4. Related to previous comment, authors should also provide distance measurement over electrostatic interaction sites in Fig. 2A, since distance plays as an important factor in these forces.

      5. For Fig. 2C, since in Fig. 1, the authors have already indicated the differences between reconstruction of the GDN and nanodisc datasets, this information provided here seems to be a bit abundant, I suggest either move this panel to Fig. 1, to make a visualization on both electron densities as well as atomic models, or move this panel to supplementary figures.

      6. Fig. 3B, some of the spheres of the lipids were also marked as red, any particular reason why they're red? Do they indicate they're phosphate heads? If so, could the authors provide evidences how they define these orientations of the lipid heads? If not, any particular reason why they're red?

      7. Fig. 3C, the fitted model of lipid and its corresponding density should be added to Fig. S4, to give more detailed view on the quality of the fitting.

      8. Fig. 4D and 4E, could the authors also indicate the RMSD values when comparing the differences of RtMprF, PaMprF, ReMprF, this information would be helpful to understand how big of a difference within these three models.

      9. Fig. 6E, the coloring used for CCA-Ala were similar to the blue part of soluble domain, could the authors change the coloring a bit? Also, for Fig. 6F, I would suggest the authors provide a prediction model, such as using AlphaFold3, of this tRNA interaction site, to further validate this proposed model.

      10. In Supplementary Figures S1 and S3, the angular distribution of maps exhibited preferred orientation to certain extent, 3D FSC estimation should also be supplied for these maps, as an indication of whether the reconstructed densities were affected or not.

      11. For Fig S3B, could the authors switch to another image with better contrast?

      Minor comments:

      1. Fig. 2E and 2F, distance measurement should also be supplied to these two panels.

      2. Fig. 5D, since in Fig. 4F and 4G already mentioned the skeleton of GDN, this modeling part should be presented before exhibit it in dimer interface, the authors should rearrange the sequence over these three panels.

      3. In Supplementary Figure S3, which density was shown for the PaMprF local resolution estimation result? Authors should provide this information as two maps were shown in this figure.

      CROSS-REFEREE COMMENTS

      Both Reviewer #1 and #3 made comments over technical issue, their evaluation over functional aspects of this protein is what I was lacking over my comments, also, their evaluation of the biological narrative, relevance toward previous research is also more insightful. Finally, they offer valuable suggestions on how to adjust the article to make it more readable, and better describing the biological story which I would suggest the authors to pay attention to.

      Significance

      The authors mainly focused on the structure of MprF in Pseudomonas aeruginosa, this protein is essential for the resistance to cationic antimicrobial peptides. A combination of structural and biochemical analysis provided evidences to the dimeric formation to this enzyme, and the analysis over differences of purified proteins using GDN and nanodisc was particular interesting, which provide new insight regarding the flexible nature of this enzyme, and potentially could be beneficial to the membrane protein community, as it demonstrates the differences in detergent/nanodisc of choice could affect the assembly of the protein of interest. Still, some of the statements in the manuscript, for instance, the assignment of lipids was over-claimed and could be benefited from additional approaches to support the issue. I would suggest some refinement in the discussion section as well as some of the figures.

      My expertise: cryo-EM single particle analysis; cryo-ET; sub-tomo averaging; cryo-FIB;

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      MprF proteins exist in many bacteria to synthesize aminoacyl phospholipids that have diverse biological functions, e.g. in the defense against small cationic peptides. They integrate two functions, the aminoacylation of lipids, i.e. the transfer of Lys, Arg or Ala from tRNAs to the head group, and the flipping of these modified lipids to the membrane outer leaflet. The authors present structures of MprF from Pseudomonas aeruginosa and describe these structures in great detail. As MprF enzymes confer antibiotic resistance and are therefore highly important, studying them is significant and interesting. Consequently, their structures have been substantially characterized in recent years, including the publication of the dimeric full-length MprF from Rhizobium (Song et al., 2021).

      • While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule description of the structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      • Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, both of which has not been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      • Hence, while I think that the data presented here would be worth publishing, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

      Major points:

      • The authors always jump between their structures in detergent and nanodisc during all the descriptions, which makes following the story even more difficult. Please first describe one of the structures and then (briefly) discuss relevant similarities and differences afterwards.

      • The difference in dimerization between Pseudomonas and Rhizobium is the most interesting and surprising feature (if true) of the new structures. However, it is not really presented as such. The authors should put more emphasis on making clear that this is a complete rotation of the monomers with respect to each other (by how many degrees?) and they should visualize it even more clearly in Figure 4 (and label the figure so that it is possible to understand it without having to read the text or the legend first).

      • P. 10: The authors insinuate that only one of the dimer interfaces, either Pseudomonas or Rhizobium could be real, but disregard the possibility that both might be the biologically relevant interfaces of the respective species and that there might have been a switch of interfaces during evolution. They should also mention and discuss this possibility.

      • Fig. 5G: The authors claim that the higher molecular band that appears in the mutant is a "dimer with aberrant migration" of >250 kDa as opposed to the expected 150 kDa. They should explain how they came to this conclusion and how they can be sure that the band does not correspond to a higher oligomer (trimer or tetramer). They could show, by extraction and purification scheme similar to the wildtype using first LMNG and then GDN, followed by at least a preliminary EM analysis, that the crosslinked mutant MprF is indeed a dimer, or use other biophysical methods to do the same, otherwise this experiment does not show much. Furthermore, they should also include a cysteine mutant in the part of Pseudomonas MprF that would be involved in a Rhizobium-like interface in their crosslinking experiments to check whether they could also stabilize dimers in this case.

      • As the question whether the observed interface is real or an artefact is very central to the value of the structural data and the conclusions drawn from it, the authors should make more effort to analyze and try to validate the interface. First, an analysis of interface properties (buried surface area, nature of the interactions, conservation) should be performed for the interface as observed in the Pseudomonas structure but also for a (hypothetical) Rhizobium-like interface of two Pseudomonas monomers (such a model of a dimer should be easily obtainable by AlphaFold using the available Rhizobium structures as models). Then, experimental methods such as FRET or crosslinking-MS would allow more solid conclusions to be drawn on the distances between potential interface residues. While these experiments are a certain effort, the question whether the dimer interface is real is so central to the paper that it would be worthwhile to make this effort.

      • As it seems that detergents might disrupt or modify the dimer interface, it might be an alternative to solubilize the protein in a more native environment by polymer-stabilized nanodiscs using DIBMA or similar molecules.

      • Since parts of the Discussion are mostly repetitions of the Results part and other parts of the Discussion also contain a large extent of structure analysis that one would usually expect in the Results part rather than the Discussion, the authors should consider condensing both to a combined (and overall much shorter) Results & Discussion section.

      Minor points:

      • Explain abbreviations the first time they appear in the text, e.g. TTH

      • Figure labels are very minimalistic. This should be improved, e.g. by putting labels to important structural features that appear in the text, otherwise the figures are not an adequate support for the text.

      • Figure 5: Label where the different oligomers run on the gels

      Significance

      While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule descriptions of the structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      • Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, both of which has not been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      • Hence, while I think that the data presented here would be worth publishing, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03130

      Corresponding author(s): Ellie S. Heckscher

      [The "revision plan" should delineate the revisions that authors intend to carry out in response to the points raised by the referees. It also provides the authors with the opportunity to explain their view of the paper and of the referee reports.

      The document is important for the editors of affiliate journals when they make a first decision on the transferred manuscript. It will also be useful to readers of the reprint and help them to obtain a balanced view of the paper.

      If you wish to submit a full revision, please use our "Full Revision" template. It is important to use the appropriate template to clearly inform the editors of your intentions.]

      1. General Statements [optional]

      We thank all three reviewers for their feedback on the paper. Reviewers stated that the paper was of broad interest to developmental biologists and neurobiologists. However, we want to ensure that our two key conceptual contributions are clear. We clarify in the following paragraph and include a revised abstract. We will update the introduction and paper to better reflect these advances. We also attach a supplemental table 1, which was inadvertently omitted from the previous submission due to our error.

      The first advance is that serially homologous neuroblasts follow a multimodal production model: In principle, stem cells can divide any number of times, from once to throughout the entire lifetime of the animal. And, on each division, a stem cell can generate either a proliferative daughter cell or a post-mitotic neuron. Together, therefore, there is a vast potential number of neurons any given stem cell could produce. From the literature on the vertebrate neocortex, we had the following models: (1) "random production" model, in which any number of neurons could be made by a stem cell; or (2) "unitary production" model, in which the same number of neurons (~eight) is produced by a stem cell regardless of context. Our data revealed an entirely new "multimodal production" model, which could not have been predicted by prior literature. In the context of serially homologous neuroblasts arrayed along the Drosophila larval body axis, sets of five to seven neurons are produced in increments of one, two, or four. These increments correspond to units called temporal cohorts. Temporal cohorts are lineage fragments, or small sets of neurons that share synaptic partners, making them lineage-based units of circuit assembly. Thus, in a multimodal production model, serially homologous stem cells produce different numbers of temporal cohorts depending on location. Our data advance the field by showing that stem cells produce circuit-relevant sets of neurons by adding or omitting temporal cohorts from a region, to meet regional needs.

      Key to understanding the second advance is that there are multiple types of temporal cohorts: early-born Notch OFF, early-born Notch ON, late-born Notch OFF, and late-born Notch ON. One temporal cohort type, the early-born Notch OFF, is found in every segment, which we term the "ubiquitous" temporal cohort. The other temporal cohort types can be produced in various combinations depending on the stem cell division pattern and segmental location. In a result that could not have been predicted, we found that the ubiquitous temporal cohorts are refined both in terms of the number of neurons and their connectivity, depending on body region. In contrast, when other temporal cohort types are produced, they are not refined to the same degree.

      The impact of this work is to advance how we think about stem cell-based circuit assembly.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis.

      Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

      • Thank you! In addition to the contributions highlighted by the reviewer, we also showed that all segments have ELs with early-born molecular identities, but only a subset have ELs with late-born identities (Figure 5). And we showed that early-born temporal cohorts can be mapped into different circuits depending on the axial region (Figure 6).

      Major comments: The authors performed careful analyses of the NB3-3 lineage using EL neurons. My main concerns are the limited applicability of their findings and the lack of mechanisms as to how NB3-3 generates various numbers of EL neurons. Their findings are exclusively relevant to the NB3-3 lineage despite their effort in highlighting that other NB lineages also generate temporal cohorts of EL neurons.

        Thank you for raising these points. First, to clarify, as Reviewer 4 also mentioned, NB3-3 is the only lineage to produce EL neurons. We will ensure that this is clearly stated in the revised text.
      

      We agree that our findings might not apply beyond the NB3-3 lineage. However, as this is the first study of its kind, it is impossible to know a priori to what extent the concepts surfaced here are generalizable. In our opinion, this speaks to the novelty and impact of the study. A contribution is to motivate a need for future studies. We will make this explicit in our updated manuscript in the Discussion section.

        Our manuscript provides cell biological mechanisms that explain how stem cells give rise to different numbers of EL neurons in different regions, including stem cell division duration and type, neural cell death, identity gene expression, and differentiation state. If the reviewer is interested in genetic or molecular mechanisms, this is an interesting point. Several prior studies using NB3-3 as a model (e.g., Tsuji et al., 2008, Birkholz et al., 2013, Baumgardt et al., 2014) have elucidated the genetic regulation of specific cell biological processes. However, these studies provided fragmentary insight with regard to serially homologous stem cell development along the body axis. A comprehensive understanding of how the NB3-3 lineage, or any other serially homologous lineage, develops was missing. This is what makes our study both novel and needed. Without an analysis that both examines every segment and assays multiple cell biological processes, we would have missed key insights: that there is a ubiquitous type of temporal cohort, and that neurons within the ubiquitous temporal cohort are selectively refined post-mitotically (See General Statements for more details).
      

      I disagree with their conclusion that failure to express Eve is a mechanism for controlling EL neuron numbers when Eve serves as the marker for these neurons. Is there any other strategy to assess the fates and functions of these cells besides relying solely on Eve expression? I am not familiar with the significance of Eve expression on the functions of these neurons. Is it possible to perform clonal analyses of NB3-3 mutant for Eve and see if these neurons adopt different functionalities/identities?

      • We agree that if Eve were only a marker, our logic would be circular. The Eve homolog, Evx1/2, is crucial for vertebrate interneuron cell fate (Moran-Rivard et al., 2001). Eve is essential for motor neuron morphology in Drosophila (Fujioka et al., 2003). Eve is critical for both the morphology and function of Even-skipped interneurons (Marshall et al., 2022). Hence, ELs cannot fully differentiate or incorporate into circuits without Eve. Thus, we treat the failure to express Eve as a mechanism for controlling EL number. Furthermore, our prior study (Wang et al., 2022) showed that NB3-3 Notch OFF neurons in A1 that fail to express Eve have small soma and "stick-like" neurite projections that are typical of undifferentiated neurons. We will be sure to add this context to the revised manuscript.

      If NB3-3 in the SEZ continually generate GMCs based on the interpretation of clonal analyses and depicted in Fig. 2A, why is the percent of clones that are 1:0 virtually at or near 100% from division 6-11 shown in 2G?

      Admittedly, the ts-MARCM heat-shock-based lineage tracing experiments are inherently messy. This is part of the reason why we included the G-TRACE lineage tracing experiments in Figure 3. In Figure 3E, one can see that the number of Notch ON/A neurons in SEZ3 is equal to the number of ELs in that segment (Figure 1E). This is a second independent method that supports the assertion that in SEZ, NB3-3 stem cells continually generate GMCs. Given this independent observation, we believe that this question is most likely explained by technical issues inherent in ts-MARCM. These issues include but are not limited to: cell-type specific accessibility/success of heat-shock induced recombination; variably effective RNAi; and idiosyncrasies of the EL-GAL4 line used to detect recombination events. If the question is why the data are only reported for divisions 6-11, the answer is that the ts-MARCM dataset that included SEZ clones used only later heat-shock time points (line from the paper "for the SEZ-containing dataset, inductions started at NB3-3's 5th division"). Along with this revision plan, we will include Supplemental Table 1, which was inadvertently omitted from the previous submission due to our error. This table shows all of the clonal data. We will include a section in the discussion to describe limitations in ts-MARCM.

      The authors also indicate that NB3-3 in the abdomen directly generate Notch OFF/B cells that assume EL neuronal identity. In this scenario, shouldn't the percent of 1:0 clones be 100% in later divisions in Fig. 2G? Based on the number of clones in abdomen shown in Fig. 2E, I cannot seem to understand how the authors come to the percent of 1:0 clones shown in Fig. 2G

        We agree that one might expect the 12th division to be 100% 1:0 clones in the abdomen. Unfortunately, we didn't sample that late in our dataset, and even when we sampled the inferred 11th division, we had a small sample size (Figure 2E). Other studies suggest that NB3-3 in the abdomen directly generates Notch OFF/B neurons (Baumgardt et al., 2014), which served as our starting point. We will revise the text to make this clearer. As you can see from Figure 3E, there is only one NB3-3 Notch ON/A neuron produced in each abdominal segment in comparison to the number of NB3-3 Notch OFF/B/EL neurons (Figure 1E). According to two independent assessments, Figure 3 and Baumgardt et al., 2014, the data support the conclusion that NB3-3 in the abdomen directly generates Notch OFF/B cells that assume EL identity for all but one of their divisions. Again, we believe technical issues make the ts-MARCM dataset messy. We will include a section in the discussion to describe limitations in ts-MARCM.
      

      There are many potentially interesting questions related to this study that can significantly broaden its impact. For example, do other NB lineages that also generate distinct temporal cohorts of EL neurons display similar proliferation patterns (type 1 division in SEZ, early termination of cell division in thoracic segments and type 0 division in abdomen)?

      • NB3-3 is the only lineage that makes ELs. Many lineages switch proliferation fates along the body axis. A previous study has described how this switch in division patterns produces the wedge-shaped CNS (Cobeta et al., 2017). In the revision, we will be sure to clarify both points.

      Why does NB3-3 in the thoracic segment become quiescent so much sooner than in SEZ and abdominal segments?

      • NB3-3 in the thorax enters quiescence due to Hox genes and temporal transcription factors (Tsuji et al., 2008). In the revision, we will be sure to clarify this point.

      The authors' observations suggest that NB3-3 in SEZ and abdomen generate a similar number of EL neurons despite the difference in their division patterns (type 1 vs type 0). Are the mechanisms that promote EL neuron generation in NB3-3 in SEZ and abdomen the same? Is anything else known besides Notch OFF?

      • We agree this is an interesting point. Previous work has detailed NB3-3 division patterns, showing Type 1 divisions in the thorax and a Type 1 to Type 0 switch in the abdomen (Baumgardt et al., 2014). However, the proliferation pattern of NB3-3 in the SEZ had not been addressed until our study. Figures 2 and 3 suggest the following: (1) NB3-3 in the SEZ proliferates for the duration of embryonic neurogenesis; (2) it produces a GMC on each division; (3) the GMC divides to produce one EL Notch OFF neuron and one Notch ON neuron. In our revision, we will manipulate the Notch pathway using two mutants, sanpodo, which produces two Notch OFF cells, and numb, which produces two Notch ON cells (Skeath et al., 1998), to specifically test how ELs in the SEZ are regulated by Notch signaling. The other difference we know of between the SEZ and abdomen is Hox gene expression. In Figure S2, we show that a subset of ELs in the SEZ express the anterior Hox gene Sex combs reduced (Scr). The role of Hox genes in this lineage is an interesting question, as addressed in the discussion. This is an important future direction that merits in-depth study and is beyond the scope of what this study is trying to accomplish.

      Minor comments: The authors' writing style is highly unusual, especially in the result section. There is an overwhelmingly large amount of background information in the result section but very thin description of their observations. The background information portion also includes previously published observations. Since the nature of this study is not hypothesis-driven, it is very confusing to read in many places and difficult to distinguish their original observations from previously published results. One easily achievable improvement is to insert relevant figure numbers into the text more often.

      Thank you for this comment. It is invaluable. In the revision, we will expand the background into a more comprehensive introduction and present the results more clearly. We will certainly insert relevant figure numbers. In responding to the reviewer's comments above, we can see where our writing lacked clarity and will improve these areas. Thank you again.

      Reviewer #1 (Significance (Required)):

      The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis. Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

      Because this text is the same as the summary, please see our response to that section.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Vasudevan et al provide a detailed characterisation of the different numbers and temporal birthdates of Even-skipped Lateral (EL) neurons produced in different segments from the same neuroblast, NB3-3. The work highlights that the differences in EL neuronal generation across segments are achieved through a combination of different division patterns, failure to upregulate the EL marker Eve and segment-specific programmed cell death. For neurons born within the same window and segment, the authors describe additional heterogeneity in their circuit formation. The work underscores the large diversity that the same neuroblast can generate across segments.

      Thank you!

      Major comments:

      - Based on the ts-MARCM 1:0 clones representing 100% of the SEZ clones at any given inferred cell division, the authors conclude "NB3-3 neuroblasts generate proliferative daughter GMCs in the SEZ and thorax on most divisions". Figure 2G does not have any data for SEZ before inferred division 5, whereas there is data in other regions. The authors also state, in reference to Fig 2F, "In the SEZ and abdomen, ELs were labelled regardless of induction time", which seems inaccurate given there are no SEZ clones before inferred division 5. There is no comment on this fact, which is surprising given their focus on temporal cohorts. The authors should explain this discrepancy, if known, or modify their statements to reflect the data.

      • Thank you for raising this point. The reason is that we produced two ts-MARCM datasets. One had SEZ clones, the other did not. The dataset with SEZ clones used heat shock protocols only for later time points, because those were most informative. The text from the paper is "We combined a published ts-MARCM (Wang et al., 2022) dataset with a new one (Table S1). The differences between the datasets are (1) CNSs were imaged either at low resolution for all regions (SEZ to terminus) or higher resolution for nerve cords (thorax to terminus); (2) for the SEZ-containing dataset, inductions started at NB3-3's 5th division. The combined data includes ~12 different heat shock protocols, 80 CNS, and 234 clones (Table S2)". In response to this comment, however, we will further clarify this point. In addition, we are submitting Supplemental Table 1, which contains all the clonal data. As you can see, experiments a-h lack SEZ data and experiments i-k contain SEZ data.

      - The temporal cohort (early-born vs late-born) identity is exclusively examined based on markers. Given the absence of SEZ clones from early NB3-3 divisions, a time course showing that the SEZ generates early-born ELs or some other complementary method would be desirable.

      Thank you for raising this point. We show early-born versus late-born identity using markers in Figure 5. We conducted the time-course experiment as suggested and can confirm that there are early-born ELs in the SEZ at stage 13. We will include a new Supplemental Figure that includes a time course of EL number at stages 11, 13, 15, and 17 for segments SEZ3 to Te2 in the revision. See figure below.

      - The authors repeatedly refer to their work as showing how a stem cell type can have "flexibility". Flexibility would imply that NB3-3 from one segment could adopt a different behaviour (different division pattern, or cell death or connectivity) if it were placed in a different segment. This is not what is being shown. In my opinion, "heterogeneity" of the same neuroblast across different segments would be more appropriate.

      • Thank you for this comment. We will change the wording to heterogeneity in the revision.

      Minor comments:

      - Figure 2A depicts a combination of known data and conclusions from their own (mainly SEZ). The authors might consider editing the figure to highlight what is new. A possibility would be for figure A to be a diagram of the experimental design and their summary division pattern to be shown after the new data instead of being panel A.

      Thank you for this suggestion. We will make the suggested change.

      - The authors state that they combined published ts-MARCM with their new one, which differed in a number of ways that they list, but they don't specify which limitations are associated with the published vs new dataset. Could the authors please clarify?

        We now include Supplemental Table 1, which shows the complete combined datasets. In the first dataset, experiments a-h, the CNS was imaged at high resolution, but in a smaller region. The limitation is that the SEZ is missing. In the second dataset, i-k, inductions started at NB3-3's 5th division. The limitation is that we fail to sample early time points. This was a strategic decision. There were two possible scenarios: (1) in the SEZ, NB3-3 divided early, made GMCs, but both daughters expressed Eve; (2) in the SEZ, NB3-3 divided for the entirety of embryonic neurogenesis, making GMCs, with only the Notch OFF daughters expressing Eve. Our data support (2). Only late heat shocks were needed to distinguish between these possibilities. As these experiments are labor-intensive, we focused our efforts on the later time points. We will make this clearer in our revised text.
      

      - The title refers exclusively to "temporal cohorts", which in the manuscript are defined quite narrowly and do not seem to apply to all segments.

      • Thank you! This, in our opinion, is a central, not a minor, point to raise, because the impact of this study involves temporal cohort biology. We outlined the essential concepts in the Part 1 "General Statements" section of this revision plan. We did not mean to use "temporal cohort" in a limited sense, and we can see how the writing of our results section led to this comment. We will revise to make this clear.

      - Several cited references are missing from the Reference list at the end. Could the authors please double check this? (e.g. Matsushita, 1997; Sweeney et al., 2018)

      • Thank you, we will remedy this!

      - Legend for figure 2 is a bit confusing, there is a "(A)" within the legend for (D), which indicates that segments A1-A7 are shown (this seems inaccurate, as it only goes to A6).

      Thank you, we will remedy this!

      Reviewer #3 (Significance (Required)):

      This study provides a comprehensive analysis of different cell biological scenarios for a neuroblast to generate distinct progeny across repeating axial units. The strength is the detailed and systematic approach across segments and possible scenarios: different division patterns, cell death, molecular marker expression. While it focuses on one specific neuroblast of the ventral nerve cord of Drosophila, the authors have done extensive work to place their findings and interpretation in the context of other cell types and across model organisms both in the introduction and discussion. This makes the work of interest for developmental biologists in general, neurodevelopment research in particular and those interested in circuit assembly, beyond their specialised community. This point of view comes from someone working in vertebrate CNS development.

      Thank you!

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary

      This manuscript addresses the question of how the number of neurons produced by each progenitor in the nervous system is determined. To address this question the authors use the Drosophila embryo model. They focus on a single type of neural stem cell (neuroblast), with homologues in each hemisegment along the anterior-posterior axis.

      Using a combination of clonal labelling, antibody stainings, and blockade of programmed cell death, they provide a detailed description of segment-specific differences in the proliferation patterns of these neuroblasts, as well as in the fate and survival of their neuronal progeny.

      Furthermore, by employing trans-synaptic labelling, they demonstrate that neurons derived from the same progenitor type receive distinct patterns of synaptic input depending on their segmental origin, in part due to their temporal window origin.

      Overall this work shows that different mechanisms contribute to the final number and identity of the neuronal progeny arising from a single progenitor, even within homologous progenitors along the anterior posterior body axis.

      Thank you!

      Major Comments

      I would suggest adding line numbers to the text for future submissions, this massively helps providing comments.

        Thank you for this comment. We will definitely add line numbers to the revised manuscript. We also thank you for providing comments despite this oversight on our part. We appreciate your time, and did not mean to make extra work.
      

      *The authors propose that all neuroblasts produce the same type of temporal cohort (early born) and that, by changing the pattern of cell division, different temporal cohorts can be added. The way this is presented in the abstract sounds like an obvious thing, what would be the alternative scenario/s? *

        Thank you for raising the point that the abstract should be updated. We have included a revised abstract. The things that are obvious are: (1) changing a neuroblast's division pattern will change the number of neurons produced, and (2) if you have late-born neurons, the stem cell must at some point, have made early-born neurons. However, within those bounds is an extremely large parameter space. Each stem cell can choose to divide or not, and it can also choose to produce a proliferative daughter or not. The stem cell must navigate these choices at every division. The field had two models for what a stem cell might do - a "random production" model and a "unitary production" model. Our data support a third "multimodal production" model, which could not have been predicted based on prior literature or data.
      

      We had raised these points in the discussion as follows:

      "Under a null model, the durations and types of proliferation would vary stochastically across segments, resulting in a continuous and unstructured distribution of neuron numbers (Llorca et al., 2019). In a unitary production model, based on the vertebrate neocortex, there is a fixed neurogenic output of ~8-9 neurons per progenitor (Gao et al., 2014). However, our data support a third model, a multimodal production model. In a multimodal model, serially homologous neuroblasts generate different numbers of neurons depending on the segment."

      We will now update the text to address this concern.

      Here it's the late born neurons that lack in thoracic segments because of early NB quiescence, but it cannot be excluded that different neuroblast types adopt a different strategy.

      • True. Neural development is complex. Other lineages could easily employ alternative strategies. Our study presents a new conceptual framework that should inspire future research.

      I found the ts-MARCM results confusing for 2 reasons:

      1- It's not clear to me why there are so many single cell clones in div 3 and 4 in abdominal segments. This is not compatible with the division model depicted for abdominal segments, unless GMCs are produced in those division window and the MARCM hits the GMC, as also mentioned in the legend for G. This aspect is important because, either the previous model by Baumgardt et al. - please correct cit. currently Gunnar et al. 2026 - is wrong, or something strange happens in this experiment, or the relative temporal order is incorrect.

      Thank you for raising this point. Having multiple single-cell (i.e., 1:0) clones in divisions 3 and 4 is not precisely what would be predicted by the model in Figure 2C. In part because heat-shock-based recombination methods in fly are stochastic and inherently "messy", we also conducted a second set of lineage tracing experiments, as shown in Figure 3, using G-TRACE. Figure 3E shows one Notch ON/A neuron in each abdominal segment, suggesting there is only one GMC present during lineage progression. But Figure 3E's result does not localize the GMC to any particular division. One possibility is that the GMC is generated once, but randomly throughout lineage progression. This possibility is consistent with the idea that the relative temporal order is incorrect and suggests that Baumgardt is erroneous. However, the Baumgardt data are strong, so we do not favor this idea. A second possibility, which we favor, is that something strange happened in this experiment. Here is how we envision the strange occurrence: heterogeneity in the EL driver. Ts-MARCM's recombination timing dictates the upper limit for the number of cells within a clone. However, recombination is detected by GAL4. So, if the GAL4 driver for some reason detects fewer cells than one expects, then one would see unusually small clones as is the case in question. To detect Ts-MARCM recombination in Figure 2, we used the EL-GAL4 driver. The EL-GAL4 driver is an enhancer fragment, ~400KB, meaning that it does not capture the full regulatory context of the eve locus. In our experience (e.g., Manning et al., 2012), drivers using small enhancers tend to give highly-specific, but somewhat variable expression, and this is the case for EL-GAL4 in our experience. We will update the discussion to discuss the ts-MARCM dataset and its limitations. And, we will correct the citation to Baumgardt et al., 2014, not Gunnar. Thank you!

      2- In segments other than abdomen, it is quite rare to hit proper clones, it appears that only GMCs are hit by recombination, with very few exceptions. Could the author please provide an explanation for this or at least mention this aspect?

      • This is true. We cannot explain it. It could have something to do with the RNAi cassettes that are used in ts-MARCM, because the original paper mentions that RNAi can be differently regulated in GMCs versus neuroblasts (Yu et al., 2009). We will mention it in the revised discussion about ts-MARCM limitations.

      It is also unclear whether in F the graph includes all types of clones (including 1:0 clones). This is important, because the timing of division for NBs and GMCs is different, and inclusion of 1:0 might lead to a wrong estimate of the NB proliferation window (longer than it actually is because GMCs divide for longer). This is particularly important for the SEZ, where most clones in normalised division 10 and 11 are with ratio 1:0, thus compatible with both terminal division as well as GMC division.

      • The graph in F does include all types of clones. We provide Supplemental Table 1, which shows the full dataset. Unfortunately, we do not have enough data to analyze only NB clones. We agree that the estimate of the NB proliferation window is coarse using this analysis method and could overrepresent the division time by one cell division. We will mention this in the discussion and make sure that our results text is free from any overreaching claims about the precision of these measurements.

      To obtain an estimate of the timing of division, the authors normalise clone size to the size of the bigger clone in the abdomen. What happened to those samples where no abdominal clones were hit? Were they simply excluded from the analysis?

        From the analysis in Figure 2, we excluded the clones that were SEZ, thorax, or terminus only. They were rare. They are shown in Supplemental Table 1, which will now be added in our revision plan.
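As an illustrative sketch only (not the authors' analysis code), the exclusion and normalisation discussed in this exchange could be expressed as follows. The record layout, the region labels, the function name `normalize_to_abdominal`, and the example values are all hypothetical; the sketch assumes each clone is recorded with the CNS it came from, its region, and its size in labelled cells, and that each clone is scaled to the largest abdominal clone in the same CNS.

```python
from collections import defaultdict


def normalize_to_abdominal(clones):
    """Scale clone sizes to the largest abdominal clone in the same CNS.

    `clones` is a list of (cns_id, region, clone_size) tuples (hypothetical
    layout). CNS samples without any abdominal clone have no reference and
    are reported separately instead of being silently dropped.
    """
    by_cns = defaultdict(list)
    for cns_id, region, size in clones:
        by_cns[cns_id].append((region, size))

    kept, excluded = {}, []
    for cns_id, entries in by_cns.items():
        abdominal_sizes = [size for region, size in entries if region == "abdomen"]
        if not abdominal_sizes:
            # No abdominal clone in this CNS: nothing to normalise against.
            excluded.append(cns_id)
            continue
        reference = max(abdominal_sizes)
        kept[cns_id] = [(region, size, size / reference) for region, size in entries]
    return kept, excluded


# Made-up example values, for illustration only.
example = [("cns1", "abdomen", 6), ("cns1", "thorax", 3), ("cns2", "SEZ", 4)]
kept, excluded = normalize_to_abdominal(example)
```

Returning the excluded samples alongside the normalised ones mirrors the point raised here: clones without an abdominal reference are left out of the normalised analysis but remain visible in the full dataset.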
      

      It is proposed that in the thorax late temporal cohort neurons are not produced, yet the ts-MARCM experiment detects some 1:0 clones. What is the fate of these cells? Are they all derived from GMC division and therefore decoupled from the temporal identity window? Or is this a re-activation of division?

      Figure 2F shows that at the inferred 11th NB3-3 division, 100% of thoracic clones are of the 1:0 type. This is an n=1 observation (Supplemental Table 1, row f-Jan20-2). When we look at the morphology of this thoracic EL, we can see that it is a fully differentiated neuron that crosses the midline and ascends to the CNS, which is similar to EL morphologies in A1, so we don't think it's a whole new cell type. We have no way of determining whether this neuron was derived from a GMC division. It is also possible that this is an infrequent event or a technical anomaly. To address the question of reactivation of the thoracic NB3-3 division, we plan to include a Supplemental Figure of EL number over developmental time (stages 11, 13, 15, 17) for segments SEZ3 to Te2. This is the same data that we mentioned to Reviewer 3. This will reveal the extent to which the thorax produces late-born ELs.

      *"in A1, a majority of segments had one Notch OFF/B neuron that failed to label with Eve" does "the majority" in this sentence mean that there were cases where all B neurons were labelled with Eve? If yes, where would this stochasticity come from? *

        • Yes, "the majority" in this sentence means that there were cases where all B neurons were labeled by Eve. In Figure 3F, for segment A1, that number is four. In contrast, there are 6 cases where B neurons failed to label with Eve. We can only speculate about the origin of the stochasticity. It could be biological (e.g., low level of Eve expression) or technical (e.g., poor antibody penetration). We plan to mention this in the discussion.

      Additionally, there is no evidence that it's the first born NotchOFF neuron in A1 that does not express Eve. The authors should clarify where this speculation comes from.

      • The evidence that the first-born Notch OFF neuron in A1 does not express Eve comes from our ts-MARCM data: "So far, our ts-MARCM analyses grouped segments into regions (Figure 2A-C), however, EL number varies on a segment-by-segment basis (Figure 1). Therefore, we looked for segment-by-segment differences in ts-MARCM data (Table S1). The only detectable difference was between A1 and the other abdominal segments: When both A1 and another abdominal segment were labeled in a single CNS, a majority had smaller A1 clones. These data suggest that the production of ELs by NB3-3 neuroblasts lags in A1 compared to A2-A7." We will add a representation of these data to the ts-MARCM figure. As we stated above, we will add a Supplemental Figure of EL number over developmental time (stages 11, 13, 15, 17) for segments SEZ3 to Te2, which could strengthen this point.

      When discussing trends shared with other phyla:

      A- "In the mammalian spinal cord, more neurons are present in regions that control limbs (Francius et al., 2013). Analogously, EL numbers do not smoothly taper from anterior to posterior; instead, the largest number of ELs is found in two non-adjacent regions, SEZ and the abdomen." It's unclear what is the link between the figure in the mammalian spinal cord and the Drosophila embryo. The embryo doesn't even have limbs and the number of neurons measured here refer only to a single lineage, while there could be (and in fact there are) lineage-to-lineage differences that could depict a different scenario.

      Thank you for this comment. We will rewrite this sentence, "in the mammalian spinal cord, more neurons are present in regions that control limbs (Francius et al., 2013)" to more accurately reflect the data in the Francius paper, and make the parallel more explicit. We will say "the size of columns of V3, V1, V2a, V2b, and V0v neurons differ at brachial compared to lumbar levels in the developing spinal cord." This removes the confusion about limbs and somewhat mitigates the concern about lineage-to-lineage differences, at least from the perspective of the spinal cord.

      B- The parallelism between V1 mouse neurons and EL Drosophila neurons is also unclear to me. The similarity in fold change across segments could be a pure coincidence and, from what I understand, the two cell types are not functionally linked.

        Thank you for this comment. We believe this is the sentence in question (sorry about no line numbers). "(3) In the mouse spinal cord, ~10-fold differences in molecular subtypes for V1 neurons (Sweeney et al., 2018). In *Drosophila*, NB3-3 neuroblasts show differences in EL number, depending on region, with similar fold changes, suggesting this trait is shared across phyla."  The emphasis was intended to be on the fold-changes, not cell types. Coincidence or not, it is parallel. We will update the sentence to say "(3) In the mouse spinal cord, ~10-fold differences in molecular subtypes for V1 neurons (Sweeney et al., 2018). Although V1 neurons are not direct homologs of EL neurons, the number also varies ~10-fold depending on the region. One possibility is that this trait is shared across phyla." And, we will remove the final part of the paragraph, which distracts from the point "Thus, for this study and future research, NB3-3 development now offers a uniquely tractable, detailed, and comprehensive model for studying how stem cells flexibly produce neurons."
      

      Minor comments:

      I found the manuscript somewhat difficult to follow, even though I am familiar with both the model and the topic. For non-specialist readers, I expect it will be even more challenging. The presentation of the results often feels fragmented, at times resembling a sequence of brief statements rather than a continuous narrative. I would encourage the authors to provide more synthesis and interpretation, for example by summarising key findings, rather than listing in detail the number of neurons labelled in each segment for every experiment. This would make the results more accessible and easier to digest.

      • Thank you for this comment. We will provide more synthesis and interpretation in results by summarizing key findings.

      From the way the MS is written it's not clear from the beginning that the work focuses exclusively on embryonic-born neurons. Since in Drosophila neuronal stem cells undergo two rounds of neurogenesis, one in the embryo and one in the larva, this omission could lead to confusion.

        Thank you for this comment. We will mention this in the abstract, introduction and discussion.
      

      In the abstract, what would be the other temporal cohorts generated in specific regions? (ref to: "In specific regions, NB3-3 neuroblasts produce additional types of temporal cohorts, including but not limited to the late-born EL temporal cohort.")

        In this manuscript, we use lineage tracing to identify four types of temporal cohorts: early-born Notch ON, early-born Notch OFF, late-born Notch ON, and late-born Notch OFF. This is now reflected in the revised abstract. ELs are early-born Notch OFF and/or late-born Notch OFF.
      

      This sentence in the introduction is inaccurate: "The Drosophila CNS is organized into an anterior hindbrain-like subesophageal zone (SEZ) and a posterior spinal cord-like nerve cord". The anterior hindbrain-like portion of the CNS is in fact the supraesophageal ganglion (or cerebrum), while the SEZ is a posterior-like region.

        Thank you. We will change this sentence to: "The *Drosophila* CNS is organized into a hindbrain-like subesophageal zone (SEZ) and a spinal cord-like nerve cord".

      Fig 1E: the encoding of the significance is not immediately clear. In the legend the 4 stars could also be arranged in the same way for clarity.

      • Thank you. We will change it for clarity.

      Fig 2E legend: it is mentioned that B corresponds to a 1:4 clone, however the MARCM example is shown for C and it's a 1:5.

      Thank you. We will fix this.

      The occurrence of "undifferentiated" neurons in Th segments is in less than 10% of the clones, I wonder if this a stochastic or deterministic event and to what extent small cell bodies could just be the consequence of local differences in tissue architecture.

      • Because we are using a stochastic technique, it is difficult for us to determine whether the occurrence of neurons with small somas is a stochastic or deterministic event. Several papers suggest neurons with small axons are found across insect species (Pearson and Fourtner, 1975; Burrows, 1996). Neurons with a small soma and short or no axons are found in the Drosophila embryonic abdominal nerve cord (Lacin et al., 2009). In our unpublished work on the Drosophila nerve cord at the first instar larval stage, we found small somas with short axons in segment A1 (see Figure 4.6 below). This leads us to believe it is not a consequence of local tissue architecture.

      Fig 2I: it's unclear what the purple means (I suppose it might be Eve expression) and why in J there should be one purple cell not labelled by the ts-MARCM when this is not present in H and I.

      Purple is Eve. We will add labels for stains used in H and I, and remove the extra purple cell from the illustration in J.

      "When synapses do occur, they are numerically similar from segment to segment". It's unclear where the evidence for this statement comes from, please clarify or remove the sentence.

      We calibrated our trans-Tango data against available connectomic data using segment A1 as a reference. We learned that the trans-Tango method only identifies strongly (>15 synapses) connected neurons.

      "First, we calibrated trans-Tango for use in larval Drosophila, focusing on segment A1, where connectome data are available (Wang et al., 2022). In the connectome, of the five early-born ELs in A1, three are strongly connected to CHOs (>15 synapses), two are weakly connected (15 synapses) connected to somatosensory neurons."

        We will modify this sentence to say "when synapses do occur they are of similar strengths from segment to segment"
      

      "In SEZ2, NB3-3 divides 10 times (Figure 2F)". Figure 2F does not support this statement and Figure 7 shows 12 divisions. Possibly SEZ2 and 3 have been inverted in this statement, please clarify.

      Thank you for pointing this out. We will correct it!

      **Referees cross-commenting**

      I agree with most of the comments/suggestions provided by the other two reviewers.

      In particular:

      I agree with reviewer #1's comment about failure to express Eve being a mechanism for controlling neurons number, as this is a circular argument.

      • We address this earlier and direct you to that text. Briefly, Eve is not just a marker, but a key differentiation gene for ELs.

      I agree with reviewer #2's concern about the use of the word "flexibility"; "heterogeneity" would be a more appropriate term, as I would associate the word "flexibility" to the ability of a single neuroblast in a single segment to produce neurons with different fates under, for example, unusual growth conditions. Here no genetic/epigenetic manipulations were performed to address flexibility and the observed (stereotypical) differences result from axial patterning.

      • We will change this, thank you.

      *As a note, Reviewer #1 asks about other temporal cohorts of EL neurons produced by other lineages, but these neurons are specifically generated from NB3-3. *

      • Thank you for adding this clarification.

      To generalise the observations reported in this study, the authors would need to focus on other molecularly defined temporal cohorts or, more generally, on other lineages, which, however, are likely to adopt different combinations of mechanisms to tune progeny number across segments.

      • We agree that further studies are needed to assess the generalizability of our findings.

      Reviewer #4 (Significance (Required)):

      In Drosophila melanogaster, the relationship between neural progenitors and their neuronal progeny has been studied in great detail. This work has provided a comprehensive description of the number of progenitors present in each embryonic segment, their molecular identities, the number of neurons they produce, and the temporal transcriptional cascades that couple progenitor temporal identity to neuronal fate.

      This work adds to the existing knowledge a detailed characterisation of intersegmental differences in the pattern of proliferation of a single type of neuronal progenitor as well as in post-divisional fate depending on anterior-posterior position in the body axis (i.e. programmed cell death and Notch signalling activation). This is a first step towards understanding the cellular and molecular mechanisms underlying such differences, but it's not disclosing them.

      We have disclosed the cellular mechanisms (stem cell division duration and type, neural cell death, identity gene expression, and differentiation state), unless something else is envisaged by this comment. The molecular mechanisms are beyond the scope of this paper.

      That homologous neuroblasts can generate variable numbers of progeny neurons depending on their segmental position has been established previously. What this manuscript adds is the demonstration that these differences arise through a combination of altered division patterns and differential programmed cell death, thereby revealing a more complex and less predictable scenario than could have been anticipated from existing knowledge in other contexts. The advance provided by this study is therefore incremental, refining rather than overturning our understanding of how segmental diversity in neuroblast lineages is achieved.

      The key conceptual advances provided by this study are described in the General Statements section above. We don't overturn, but we advance the field.

      By touching on the general question of how progenitors generate diversity, this work could be of broad interest to developmental neuroscientists beyond the fly field. However, the way it is currently written does not make it very accessible to non-specialists.

      Thank you for this comment. We will endeavor to make it more accessible in the revised manuscript. Reviewer 3, an expert in vertebrate neurobiology, agreed that our work was of broad interest.

      My expertise: Drosophila neurodevelopment, nerve cord, cell types specification

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      With this Revision Plan, we submit a revised abstract and Supplemental Table 1. We plan to address every point raised by the reviewers.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This manuscript addresses the question of how the number of neurons produced by each progenitor in the nervous system is determined. To address this question the authors use the Drosophila embryo model. They focus on a single type of neural stem cell (neuroblast), with homologues in each hemisegment along the anterior-posterior axis.

      Using a combination of clonal labelling, antibody stainings, and blockade of programmed cell death, they provide a detailed description of segment-specific differences in the proliferation patterns of these neuroblasts, as well as in the fate and survival of their neuronal progeny. Furthermore, by employing trans-synaptic labelling, they demonstrate that neurons derived from the same progenitor type receive distinct patterns of synaptic input depending on their segmental origin, in part due to their temporal window origin. Overall this work shows that different mechanisms contribute to the final number and identity of the neuronal progeny arising from a single progenitor, even within homologous progenitors along the anterior posterior body axis.

      Major Comments

      I would suggest adding line numbers to the text for future submissions, this massively helps providing comments.

      The authors propose that all neuroblasts produce the same type of temporal cohort (early born) and that, by changing the pattern of cell division, different temporal cohorts can be added. The way this is presented in the abstract sounds like an obvious thing, what would be the alternative scenario/s? Here it's the late born neurons that lack in thoracic segments because of early NB quiescence, but it cannot be excluded that different neuroblast types adopt a different strategy.

      I found the ts-MARCM results confusing for 2 reasons:

      1. It's not clear to me why there are so many single cell clones in div 3 and 4 in abdominal segments. This is not compatible with the division model depicted for abdominal segments, unless GMCs are produced in those division window and the MARCM hits the GMC, as also mentioned in the legend for G. This aspect is important because, either the previous model by Baumgardt et al. - please correct cit. currently Gunnar et al. 2026 - is wrong, or something strange happens in this experiment, or the relative temporal order is incorrect.
      2. In segments other than abdomen, it is quite rare to hit proper clones, it appears that only GMCs are hit by recombination, with very few exceptions. Could the author please provide an explanation for this or at least mention this aspect? It is also unclear whether in F the graph includes all types of clones (including 1:0 clones). This is important, because the timing of division for NBs and GMCs is different, and inclusion of 1:0 might lead to a wrong estimate of the NB proliferation window (longer than it actually is because GMCs divide for longer). This is particularly important for the SEZ, where most clones in normalised division 10 and 11 are with ratio 1:0, thus compatible with both terminal division as well as GMC division.

      To obtain an estimate of the timing of division, the authors normalise clone size to the size of the bigger clone in the abdomen. What happened to those samples where no abdominal clones were hit? Were they simply excluded from the analysis?

      It is proposed that in the thorax late temporal cohort neurons are not produced, yet the ts-MARCM experiment detects some 1:0 clones. What is the fate of these cells? Are they all derived from GMC division and therefore decoupled from the temporal identity window? Or is this a re-activation of division?

      "in A1, a majority of segments had one Notch OFF/B neuron that failed to label with Eve" does "the majority" in this sentence mean that there were cases where all B neurons were labelled with Eve? If yes, where would this stochasticity come from? Additionally, there is no evidence that it's the first born NotchOFF neuron in A1 that does not express Eve. The authors should clarify where this speculation comes from. When discussing trends shared with other phyla:

      A- "In the mammalian spinal cord, more neurons are present in regions that control limbs (Francius et al., 2013). Analogously, EL numbers do not smoothly taper from anterior to posterior; instead, the largest number of ELs is found in two non-adjacent regions, SEZ and the abdomen." It's unclear what is the link between the figure in the mammalian spinal cord and the Drosophila embryo. The embryo doesn't even have limbs and the number of neurons measured here refer only to a single lineage, while there could be (and in fact there are) lineage-to-lineage differences that could depict a different scenario.

      B- The parallelism between V1 mouse neurons and EL Drosophila neurons is also unclear to me. The similarity in fold change across segments could be a pure coincidence and, from what I understand, the two cell types are not functionally linked.

      Minor comments:

      I found the manuscript somewhat difficult to follow, even though I am familiar with both the model and the topic. For non-specialist readers, I expect it will be even more challenging. The presentation of the results often feels fragmented, at times resembling a sequence of brief statements rather than a continuous narrative. I would encourage the authors to provide more synthesis and interpretation, for example by summarising key findings, rather than listing in detail the number of neurons labelled in each segment for every experiment. This would make the results more accessible and easier to digest.

      From the way the MS is written it's not clear from the beginning that the work focuses exclusively on embryonic-born neurons. Since in Drosophila neuronal stem cells undergo two rounds of neurogenesis, one in the embryo and one in the larva, this omission could lead to confusion.

      In the abstract, what would be the other temporal cohorts generated in specific regions? (ref to: "In specific regions, NB3-3 neuroblasts produce additional types of temporal cohorts, including but not limited to the late-born EL temporal cohort.")

      This sentence in the introduction is inaccurate: "The Drosophila CNS is organized into an anterior hindbrain-like subesophageal zone (SEZ) and a posterior spinal cord-like nerve cord". The anterior hindbrain-like portion of the CNS is in fact the supraesophageal ganglion (or cerebrum), while the SEZ is a posterior-like region.

      Fig 1E: the encoding of the significance is not immediately clear. In the legend the 4 stars could also be arranged in the same way for clarity.

      Fig 2E legend: it is mentioned that B corresponds to a 1:4 clone, however the MARCM example is shown for C and it's a 1:5.

      The occurrence of "undifferentiated" neurons in Th segments is in less than 10% of the clones, I wonder if this a stochastic or deterministic event and to what extent small cell bodies could just be the consequence of local differences in tissue architecture.

      Fig 2I: it's unclear what the purple means (I suppose it might be Eve expression) and why in J there should be one purple cell not labelled by the ts-MARCM when this is not present in H and I.

      "When synapses do occur, they are numerically similar from segment to segment". It's unclear where the evidence for this statement comes from, please clarify or remove the sentence.

      "In SEZ2, NB3-3 divides 10 times (Figure 2F)". Figure 2F does not support this statement and Figure 7 shows 12 divisions. Possibly SEZ2 and 3 have been inverted in this statement, please clarify.

      Referees cross-commenting

      I agree with most of the comments/suggestions provided by the other two reviewers. In particular: I agree with reviewer #1's comment about failure to express Eve being a mechanism for controlling neurons number, as this is a circular argument. I agree with reviewer #2's concern about the use of the word "flexibility"; "heterogeneity" would be a more appropriate term, as I would associate the word "flexibility" to the ability of a single neuroblast in a single segment to produce neurons with different fates under, for example, unusual growth conditions. Here no genetic/epigenetic manipulations were performed to address flexibility and the observed (stereotypical) differences result from axial patterning. As a note, Reviewer #1 asks about other temporal cohorts of EL neurons produced by other lineages, but these neurons are specifically generated from NB3-3. To generalise the observations reported in this study, the authors would need to focus on other molecularly defined temporal cohorts or, more generally, on other lineages, which, however, are likely to adopt different combinations of mechanisms to tune progeny number across segments.

      Significance

      In Drosophila melanogaster, the relationship between neural progenitors and their neuronal progeny has been studied in great detail. This work has provided a comprehensive description of the number of progenitors present in each embryonic segment, their molecular identities, the number of neurons they produce, and the temporal transcriptional cascades that couple progenitor temporal identity to neuronal fate. This work adds to the existing knowledge a detailed characterisation of intersegmental differences in the pattern of proliferation of a single type of neuronal progenitor as well as in post-divisional fate depending on anterior-posterior position in the body axis (i.e. programmed cell death and Notch signalling activation). This is a first step towards understanding the cellular and molecular mechanisms underlying such differences, but it's not disclosing them.

      That homologous neuroblasts can generate variable numbers of progeny neurons depending on their segmental position has been established previously. What this manuscript adds is the demonstration that these differences arise through a combination of altered division patterns and differential programmed cell death, thereby revealing a more complex and less predictable scenario than could have been anticipated from existing knowledge in other contexts. The advance provided by this study is therefore incremental, refining rather than overturning our understanding of how segmental diversity in neuroblast lineages is achieved. By touching on the general question of how progenitors generate diversity, this work could be of broad interest to developmental neuroscientists beyond the fly field. However, the way it is currently written does not make it very accessible to non-specialists.

      My expertise: Drosophila neurodevelopment, nerve cord, cell types specification

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Vasudevan et al provide a detailed characterisation of the different numbers and temporal birthdates of Even-skipped Lateral (EL) neurons produced in different segments from the same neuroblast, NB3-3. The work highlights that the differences in EL neuronal generation across segments are achieved through a combination of different division patterns, failure to upregulate the EL marker Eve and segment-specific programmed cell death. For neurons born within the same window and segment, the authors describe additional heterogeneity in their circuit formation. The work underscores the large diversity that the same neuroblast can generate across segments.

      Major comments:

      • Based on the ts-MARCM 1:0 clones representing 100% of the SEZ clones at any given inferred cell division, the authors conclude "NB3-3 neuroblasts generate proliferative daughter GMCs in the SEZ and thorax on most divisions". Figure 2G does not have any data for SEZ before inferred division 5, whereas there is data in other regions. The authors also state "In the SEZ and abdomen, ELs were labelled regardless of induction time" in reference to Fig 2F, which seems inaccurate given there are no SEZ clones before inferred division 5. There is no comment on this fact, which is surprising given their focus on temporal cohorts. The authors should explain this discrepancy, if known, or modify their statements to reflect the data.
      • The temporal cohort (early-born vs late-born) identity is exclusively examined based on markers. Given the absence of SEZ clones from early NB3-3 divisions, a time course showing that the SEZ generates early-born ELs, or some other complementary method, would be desirable.
      • The authors repeatedly refer to their work as showing how a stem cell type can have "flexibility". Flexibility would imply that NB3-3 from one segment could adopt a different behaviour (different division pattern, or cell death or connectivity) if it were placed in a different segment. This is not what is being shown. In my opinion, "heterogeneity" of the same neuroblast across different segments would be more appropriate.

      Minor comments:

      • Figure 2A depicts a combination of known data and conclusions from their own (mainly SEZ). The authors might consider editing the figure to highlight what is new. A possibility would be for figure A to be a diagram of the experimental design and their summary division pattern to be shown after the new data instead of being panel A.
      • The authors state that they combined published ts-MARCM with their new one, which differed in a number of ways that they list, but they don't specify which limitations are associated with the published vs new dataset. Could the authors please clarify?
      • The title refers exclusively to "temporal cohorts", which in the manuscript are defined quite narrowly and do not seem to apply to all segments.
      • Several cited references are missing from the Reference list at the end. Could the authors please double check this? (e.g. Matsushita, 1997; Sweeney et al., 2018)
      • Legend for figure 2 is a bit confusing, there is a "(A)" within the legend for (D), which indicates that segments A1-A7 are shown (this seems inaccurate, as it only goes to A6).

      Significance

      This study provides a comprehensive analysis of different cell biological scenarios for a neuroblast to generate distinct progeny across repeating axial units. The strength is the detailed and systematic approach across segments and possible scenarios: different division patterns, cell death, molecular marker expression. While it focuses on one specific neuroblast of the ventral nerve cord of Drosophila, the authors have done extensive work to place their findings and interpretation in the context of other cell types and across model organisms both in the introduction and discussion. This makes the work of interest for developmental biologists in general, neurodevelopment research in particular and those interested in circuit assembly, beyond their specialised community. This point of view comes from someone working in vertebrate CNS development.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis. Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

      Major comments: The authors performed careful analyses of the NB3-3 lineage using EL neurons. My main concerns are the limited applicability of their findings and the lack of mechanisms for how NB3-3 generates various numbers of EL neurons. Their findings are exclusively relevant to the NB3-3 lineage despite their effort in highlighting that other NB lineages also generate temporal cohorts of EL neurons. I disagree with their conclusion that failure to express Eve is a mechanism for controlling EL neuron numbers when Eve serves as the marker for these neurons. Is there any other strategy to assess the fates and functions of these cells besides relying solely on Eve expression? I am not familiar with the significance of Eve expression for the functions of these neurons. Is it possible to perform clonal analyses of NB3-3 mutant for Eve and see if these neurons adopt different functionalities/identities? If NB3-3 in the SEZ continually generates GMCs, based on the interpretation of clonal analyses and as depicted in Fig. 2A, why is the percent of clones that are 1:0 virtually at or near 100% from division 6-11 in Fig. 2G? The authors also indicate that NB3-3 in the abdomen directly generates Notch OFF/B cells that assume EL neuronal identity. In this scenario, shouldn't the percent of 1:0 clones be 100% in later divisions in Fig. 2G? Based on the number of clones in the abdomen shown in Fig. 2E, I cannot understand how the authors arrive at the percent of 1:0 clones shown in Fig. 2G.

      There are many potentially interesting questions related to this study that could significantly broaden its impact. For example, do other NB lineages that also generate distinct temporal cohorts of EL neurons display similar proliferation patterns (type 1 division in SEZ, early termination of cell division in thoracic segments and type 0 division in abdomen)? Why does NB3-3 in the thoracic segments become quiescent so much sooner than in SEZ and abdominal segments? The authors' observations suggest that NB3-3 in SEZ and abdomen generate a similar number of EL neurons despite the difference in their division patterns (type 1 vs type 0). Are the mechanisms that promote EL neuron generation by NB3-3 in SEZ and abdomen the same? Is anything else known besides Notch OFF?

      Minor comments:

      The authors' writing style is highly unusual, especially in the results section. There is an overwhelmingly large amount of background information in the results section but very thin description of their observations. The background information also includes previously published observations. Since this study is not hypothesis-driven, it is very confusing to read in many places, and it is difficult to distinguish the authors' original observations from previously published results. One easily achievable improvement is to insert relevant figure numbers into the text more often.

      Significance

      The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis. Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This work by Matsui et al. examined the function of the gene stand still (stil) in Drosophila in the regulation of germ cell death in the female germline. They show that stil mutants contain many apoptotic cells, leading to germ cell loss and infertility. Gene expression analysis showed upregulation of pro-apoptotic genes such as rpr in stil mutants. A DamID experiment further showed that Stil binds to the rpr promoter region to repress its expression. Additionally, they show that undifferentiated germ cells are resistant to cell death in stil mutants (but stil mutants still eventually lose all germ cells).

      Major comments: Overall, the experiments adhere to a general standard of rigor, and each result is fairly convincing. In that sense, this paper warrants publication, as a paper that reveals a new gene important for preventing germ cell death. With that said, I feel that this paper does not reveal a new biological insight. In a nutshell, this paper is about a transcriptional repressor of a pro-apoptotic gene, whose depletion therefore leads to cell death. The data are solid and the conclusion is well supported. But readers will be left wondering why nature implemented such control. Unless one can show what kind of defects the stil rpr double mutant (which rescues the germ cell loss phenotype) exhibits, there is no insight into why the balance between a pro-apoptotic gene and its repressor is important. The paper discusses the 'molecular' mechanisms that explain the phenomenon, but it does not provide insights. The lack of conceptual advancement is the limitation of this work.

      Response:

      We thank the reviewer for pointing out a biological insight into the evolutionary rationale underlying the adoption of such a regulatory mechanism in nature. To address this point, we assessed the evolutionary conservation of rpr and stil through BLAST searches and comparative analyses. Our results showed that both genes are Diptera-restricted, whereas their key domains (the rpr IAP-binding motif and the Stil BED finger) are widely conserved across metazoans. In this phylogenetic context, we propose that Stil acts as a dedicated repressor of rpr in the Drosophila female germline, thereby establishing an apoptotic control architecture in which hid predominates and rpr is repressed by Stil. This explains why the balance between a potent effector (Rpr) and its repressor (Stil) is critical in oogenesis; preventing catastrophic germline loss while preserving hid-mediated responsiveness.

      We have incorporated these phylogenetic analyses and the perspective into the revised Discussion section as follows.

      Revised Page 22, Line 475; rpr is conserved only within Diptera, although its IAP-binding motif, essential for apoptosis induction, is broadly conserved across metazoans (Du et al., 2000; Gottfried et al., 2004; Hegde et al., 2002; Shi, 2002; Verhagen et al., 2000; Vucic et al., 1998; Wing et al., 2001; L. Zhou, 2005) (Fig. S7). Similarly, stil is also restricted to Diptera, predominantly within Drosophila, whereas its BED-type zinc finger domain is widely conserved among diverse organisms (Aravind, 2000; Hayward et al., 2013; Tue et al., 2017b; H. Zhou et al., 2016). Phylogenetic patterns across Diptera are consistent with a model in which stil acts as a dedicated repressor of rpr in the Drosophila germline cells (Fig. S7). Due to its potent pro-apoptotic activity, rpr must be stringently repressed in a spatiotemporal manner through mechanisms that are specific to both cell type and developmental stage. During embryogenesis, repression of rpr is mediated by the Dpp-signaling factor Shn, which binds to the rpr regulatory region, whereas in intestinal stem cells (ISCs), its expression is suppressed through chromatin conformation. In Drosophila female germline cells, hid serves as the primary regulator of apoptosis, while rpr activity is generally suppressed (Park et al., 2019; Xing et al., 2015). However, rpr mutants exhibit reduced fertility despite producing viable eggs (Fig. 3H), suggesting that rpr-mediated apoptosis may be required for proper egg development. Accordingly, we propose that stil restrains rpr in the Drosophila female germline, allowing hid to predominate in apoptotic regulation.

      New Fig. S7;

      The legend of new Fig. S7;

      Figure S7 Conservation of Rpr and Stil within Diptera

      Homologs of Drosophila melanogaster Rpr and Stil were identified by BLASTp, aligned, and analyzed phylogenetically. Homologs are present across Dipteran lineages, with the genus Drosophila highlighted in blue. Branch lengths indicate the expected number of substitutions per site, as shown by the scale bar.

      Minor comments: Although this is a minor point, and this is not specifically pointing a finger at the author of this paper, I really don't like the term 'safeguard'. This term is now overutilized to add hype to papers, when 'is necessary' is sufficient. In this case, unless the answer is provided as to 'against what stil is safeguarding germ cells', this term is not meaningful. For example, if one can show that stil specifically senses germline-specific threat and tweaks the regular apoptotic pathway based on germline-specific needs, then the term 'safeguard' may be warranted.

      Response:

      In light of the reviewer's comment, we have revised the title of the manuscript to replace 'safeguard' with 'ensure,' which better reflects the demonstrated function of Stil without overstating its role. The new title of the manuscript is: 'Transcriptional Repression of reaper by Stand Still Ensures Female Germline Development in Drosophila'

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this well-executed study, Matsui et al. investigate how the female Drosophila germline prevents inappropriate apoptosis during development. They identify stand still (stil) as a key germline-specific repressor of apoptosis. Stil mutant flies are homozygous viable but female sterile due to widespread germ cell loss at the time of eclosion, which is driven by activation of the pro-apoptotic gene reaper (rpr) and caspase-dependent cell death. Germline-specific expression of anti-apoptotic factors such as p35 can rescue this phenotype, confirming that the defect lies in apoptotic regulation. The authors show that Stil directly represses rpr transcription through its BED-type zinc finger domain. Notably, undifferentiated germline cells remain resistant to apoptosis in the absence of stil, which the authors attribute to a silenced chromatin state at the rpr locus, marked by H3K9me3. These findings support a dual mechanism of protection: transcriptional repression of rpr by Stil, and a potential parallel chromatin-based silencing mechanism operating specifically in undifferentiated cells.

      Major Issues:

      1. Clarify cell identity in Figure 2E: It is unclear whether the apoptotic cells shown are somatic or germline in origin. Including a somatic marker such as 1B1 would allow the reader to clearly distinguish the apoptotic population and better interpret the figure.

      Response:

      We thank the reviewer for this helpful suggestion. Occasionally, the signal of the germline marker Vasa can be attenuated in dying germline cells. As suggested by the reviewer, we also tested α-Spectrin (a plasma membrane and fusome marker) instead of 1B1 together with TUNEL labeling, but this approach did not clearly distinguish somatic from germline apoptotic cells. To directly clarify cell identity, we now provide an improved co-stained image in which TUNEL-positive nuclei are surrounded by Vasa-positive cytoplasm, indicating a germline origin. Figure 2E has been updated accordingly.

      New Fig. 2E;

      Quantification of undifferentiated cells in mutants: There appears to be inconsistency in the representation of undifferentiated germ cells across figures. Early panels show near-complete germline loss, while later analyses focus on undifferentiated cells that are reportedly apoptosis-resistant. The authors should quantify the proportion of ovarioles retaining undifferentiated cells and present this data in Figure 1 or the supplements to resolve this discrepancy.

      Response:

      Thank you for raising the important point regarding the apparent inconsistency in the representation of undifferentiated germ cell populations. In the early panels (Fig. 1C, D), we analyzed adult ovaries of stil loss-of-function mutants, where all germline cells including undifferentiated germline stem cells (GSCs) are almost completely lost (Fig. 1C), showing nearly 100% agametic ovarioles. However, in later analyses such as those in Fig. 5A, B, we showed 3rd instar larval ovaries of stil loss-of-function mutants containing a few surviving germline cells near the future cap cells, the niche providing the stem cell ligand Decapentaplegic (Dpp) (Xie & Spradling, 1998). This suggests that Dpp-responsive undifferentiated germline cells may be relatively resistant to apoptosis caused by stil loss.

      Indeed, the GSC-like cells generated by the overexpression of a constitutively active form of Dpp receptor, Thickveins (Tkv.CA) or loss of the differentiation factor bam, were resistant to apoptosis caused by stil loss (Fig. 5C, D). These GSC-like cells may possess enhanced stemness, owing to either excessively elevated Dpp signaling or complete loss of bam, which could lead to stronger repression of rpr expression through tighter chromatin compaction.

      We added this argument in the Results section of the revised manuscript as follows.

      Revised Page 16, Line 361; Compared to GSCs, which were almost completely lost in stil mutants, GSC-like cells may retain a more robust stemness owing to the extremely elevated Dpp signaling pathway, potentially resulting in stronger repression of rpr expression.

      Interpretation of chromatin state at the rpr locus: The claim that H3K9me3, but not H3K27me3, marks the rpr locus is not fully convincing given the low ChIP-seq signal shown. Including a comparison to a known positive control locus would strengthen the argument. Alternatively, the authors could broaden the discussion to include global chromatin reorganization during the germ cell to maternal transition, as reported in Kotb et al., 2024, and how such changes may impact rpr accessibility. Also, stil mutants rescued with P53 have a "string of pearls" phenotype that is associated with germ cell to maternal transition defects (Figure S3, p53 OE).

      Response:

      We thank the reviewer for the thoughtful and constructive comment regarding the interpretation of chromatin state at the rpr locus. To strengthen the inference that the rpr locus shows H3K9me3 enrichment, whereas clear H3K27me3 enrichment is not evident, we have now included ChIP-seq signal profiles for known positive control loci, using light (lt) as an H3K9me3-enriched locus (Akkouche et al., 2017; Greil et al., 2003) and Ultrabithorax (Ubx) as a canonical H3K27me3 target (Torres-Campana et al., 2022). These comparisons support our interpretation that H3K9me3, rather than H3K27me3, characterizes chromatin around the rpr locus in GSCs. Accordingly, while we do not exclude a minor H3K27me3 contribution, our analyses indicate H3K9me3 as the predominant signature at rpr in GSCs.

      New Fig.6B and 6C;

      The legend of new Fig. 6B and Fig. 6C;

      (B) H3K9me3 ChIP-seq signal at the rpr locus and the lt locus (H3K9me3-positive control) in GSCs and 4C NCs. (C) H3K27me3 ChIP-seq signal at the rpr locus and the Ubx locus (H3K27me3-positive control) in GSCs and 32C NCs.

      A sentence of Result section was revised as below.

      Revised Page 17, Line 396; As internal controls, we confirmed H3K9me3 enrichment at the light (lt) locus and H3K27me3 enrichment at the Ultrabithorax (Ubx) locus, consistent with their established chromatin states (Akkouche et al., 2017; Greil et al., 2003; Torres-Campana et al., 2022); relative to these controls, the rpr locus shows H3K9me3 but no clear H3K27me3 enrichment in GSCs.

      Regarding the suggestion to broaden the discussion to include global chromatin reorganization during the germline-to-maternal transition, as reported in Kotb et al., 2024, we agree that this is an important avenue for understanding rpr accessibility. The "string of pearls" phenotype observed in stil mutants rescued with P35 overexpression (Figure S3) is consistent with perturbations during this transition. However, a detailed analysis of such chromatin reorganization and its potential impact on rpr regulation lies beyond the scope of the present study and represents a valuable direction for future work.

      Broader analysis of rpr regulation in somatic cells: It would be informative to examine publicly available chromatin or transcriptional data for the rpr locus in somatic ovarian cells. This could help clarify whether rpr regulation by Stil is truly germline-specific or reflects broader developmental trends. This will also clarify why the flies are homozygous viable but female sterile.

      Response:

      We thank the reviewer for this insightful suggestion. We agree that exploring chromatin accessibility and transcriptional regulation at the rpr locus in somatic ovarian cells would provide valuable insights into tissue- or cell-type-specific chromatin environments that influence rpr expression.

      However, to our knowledge, there are currently no publicly available ATAC-seq or comparable chromatin datasets for purified ovarian somatic cells, including follicle cells or ovarian somatic cells (OSCs). As such, we are unable to incorporate this analysis in the current study. Nevertheless, we fully recognize the importance of this line of inquiry and consider it a valuable direction for future research.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript describes the characterization of stand still (stil), a previously identified gene needed for germ cell survival in Drosophila. The molecular function of Stil has until now remained poorly understood. This new work shows that loss of stil results in reaper (rpr)-dependent apoptosis within female germ cells. Loss of rpr suppresses many of the phenotypes observed in stil mutants. Experiments performed using Drosophila cell culture suggest that Stil binds to elements within the rpr promoter. DamID and structure/function experiments indicate that Stil likely directly represses the transcription of rpr within germ cells.

      In general, the experiments are well executed, and the data largely support the basic claims of the authors. Replicates are included and appropriate statistical analyses have been provided. The text and figures are clear and accurate. Appropriate references were cited. There are a few things the authors should address or rephrase before publication.

      On page 9 line 190-192. The authors state "Altogether, these findings indicate that the loss of stil function not only triggers apoptosis that can be suppressed by apoptosis inhibitors but also causes defects in oogenesis progression that are not rescued by blocking cell death." Failure to rescue defects during mid-oogenesis could be due to insufficient transgene expression. Indeed, loss of rpr appears to rescue the fertility of stil mutants. The conclusions of this section should be restated.

      Response:

      We agree that the failure to rescue mid-oogenesis defects by P35 overexpression may, at least in part, be due to insufficient transgene expression. This explanation is particularly plausible given that loss of rpr more effectively restored fertility in stil mutants. As suggested by the reviewer, we have revised the relevant sentences, to avoid misinterpretation as below.

      Revised Page 9, Line 191; Altogether, these findings indicate that the loss of stil function triggers apoptosis that can be suppressed by apoptosis inhibitors.

      Revised Page 12, Line 253; The complete rescue of germline survival in stil rpr double mutants also suggests that the failure of P35 overexpression to restore mid-oogenesis defects may partly reflect insufficient transgene expression (Fig. S3).

      The authors should present the overlap between genes that change expression in a stil mutant and those in which the DamID experiments indicate are directly bound by Stil protein. DamID can sometimes give spurious results depending on expression levels. Further discussion along this point is necessary.

      Response:

      We thank the reviewer for raising this issue. As suggested, we have now analyzed the overlap between genes that are differentially expressed in stil mutant ovaries (identified by RNA-seq with stil mutant expressing P35) and genes that are potentially bound by Stil based on DamID-seq data (promoter-proximal peaks ≤1 kb) as Supplementary Table 4. The list includes genes with DamID peaks within promoter regions and that also exhibit significant differential expression (|log2FC| > 1, adjusted p The overlap between DamID-seq and RNA-seq comprises 682 genes, including rpr, supporting the idea that Stil regulates rpr expression through interaction with its upstream promoter region. However, the detected peak signal at rpr was 3.41, which was not that strong, suggesting that Stil may also bind to and regulate other genes in female germline cells. Investigating the potential role of Stil in regulating other genes represents an important future direction of our study.

      We have included this analysis and argument in the revised manuscript as below.

      Revised Page 13, Line 280; A total of 682 genes with Stil-enriched peaks detected at promoter regions (≤1 kb) showed significantly altered expression in RNA-seq analysis of stil mutants expressing P35, including rpr (Supplementary Table 4).

      Revised Page 20, Line 440; Notably, the DamID peak intensity at the rpr locus reached 3.41, which is moderate rather than strong (Supplementary Table 4). This suggests that, in addition to repressing rpr, Stil may bind to and regulate other genomic loci in the female germline. Investigating the repertoire of Stil target genes and elucidating their roles in germline cells will be an important future direction of this study.
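
      The overlap analysis described in this response (promoter-proximal DamID peaks intersected with differentially expressed genes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Peak` and `DiffExpr` records, the `overlap_genes` function, and the gene names and numbers below are all hypothetical. The 1 kb window and the |log2FC| > 1 filter follow the text above; the adjusted p-value cutoff is not stated in the text as preserved here, so it is left as a caller-supplied parameter and the value used in the example is arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    gene: str             # gene assigned to the peak (hypothetical annotation)
    distance_to_tss: int  # signed distance in bp from peak to the gene's TSS
    signal: float         # peak intensity (not used for filtering here)


@dataclass(frozen=True)
class DiffExpr:
    gene: str
    log2fc: float  # log2 fold change, mutant vs control
    padj: float    # adjusted p-value


def overlap_genes(peaks, diffexpr, *, padj_cutoff,
                  max_distance=1000, min_abs_log2fc=1.0):
    """Return sorted genes that have a promoter-proximal peak AND change expression."""
    # Genes with at least one peak within max_distance bp of the TSS.
    bound = {p.gene for p in peaks if abs(p.distance_to_tss) <= max_distance}
    # Genes passing both the fold-change and adjusted p-value filters.
    changed = {d.gene for d in diffexpr
               if abs(d.log2fc) > min_abs_log2fc and d.padj < padj_cutoff}
    return sorted(bound & changed)


# Toy input with made-up genes and values, for illustration only.
peaks = [
    Peak("geneA", 200, 3.0),
    Peak("geneB", 5000, 8.0),   # too far from the TSS
    Peak("geneC", -800, 2.0),
    Peak("geneD", 1000, 4.0),
]
diffexpr = [
    DiffExpr("geneA", -2.0, 0.001),
    DiffExpr("geneB", 3.0, 0.001),
    DiffExpr("geneC", 0.5, 0.001),  # fold change too small
    DiffExpr("geneD", 1.5, 0.5),    # not significant at the example cutoff
]

# padj_cutoff=0.05 is an arbitrary example value, not taken from the manuscript.
result = overlap_genes(peaks, diffexpr, padj_cutoff=0.05)
print(result)
```

      Keeping the two filters as separate sets makes it easy to report how many genes pass each criterion on its own, which helps when judging how much of the overlap could arise from spurious DamID signal.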

      For structure function experiments, a western blot showing expression levels of the different transgenes in ovaries should be included.

      Response:

      We thank the reviewer for this helpful comment. To address this point, we examined the expression levels of the four Stil variants (FL, NT, CT, and AAYA) in ovaries driven by a germline driver under a wild-type background using Western blotting. The representative blot and quantification from three biological replicates showed comparable expression levels among the variants, with the CT variant displaying a slightly reduced signal. Importantly, AAYA showed expression comparable to FL yet, like CT, failed to rescue, indicating that the rescue failure is not explained by expression-level differences. These data instead support a requirement for the BED-type zinc finger for Stil function in the germline. While we cannot fully exclude a minor contribution from the slightly lower expression of the CT variant to the lack of rescue, the AAYA result argues that loss of BED-type zinc-finger function is the primary cause; we note this caveat in the revised text. The corresponding data are now presented in Figure S6A of the revised manuscript.

      New Fig. S6A;

      The legend of new Fig. S6A;

      (A) Western blot analysis of 6×Myc-tagged Stil variants (FL, NT, CT, and AAYA) driven by NGT40-Gal4; NosGal4-VP16, with y w as a control. Stil variants were detected with anti-Myc, and α-Tubulin (αTub) served as a loading control. Arrowheads indicate Stil variant proteins. The lower panel shows quantification of the Myc/αTub signal ratio normalized to FL. Error bars indicate standard deviation (s.d.) (n = 3).

      Two sentences in the Results section were revised as follows.

      Revised Page 13, Line 291; The expression of all four Stil variant proteins from the transgenes was confirmed, although Stil-CT showed a slightly reduced expression level (Fig. S6A).

      Revised Page 14, Line 305; Although CT shows slightly lower expression, AAYA fails to rescue despite FL-like expression, indicating that expression level is not limiting and that loss of the BED-type zinc finger underlies the phenotype.

      With the addition of the new Fig. S6A, the following figure labels have been updated:

      Fig. S6A → S6B

      Fig. S6B → S6C

      Fig. S6C → S6D

      Fig. S6D → S6E

      Individual data points should be shown in each graph in place of simple bar graphs. This type of presentation was inconsistent throughout the paper.

      Response:

      We thank the reviewer for this constructive comment. In line with the reviewer's suggestion, we have revised the relevant graphs to include individual data points overlaid on bar plots with error bars. This modification enables readers to better assess data variability. We also ensured consistency in data presentation among the revised figures while maintaining clarity throughout the manuscript.

      Reference "G & D., 1997" should be properly formatted.

      Page 6 line 117 and 121- a couple of instances where "cell" should be "cells"

      Page 14 line 304- typo "Still"

      Response:

      As suggested, we have revised all figures to display individual data points in each graph instead of using simple bar graphs. This change has been applied consistently throughout the manuscript to improve data transparency and readability. The revised figures include Figure 1A, 2B, S1A, and S2A.

      We have also corrected the following textual issues;

      ・The reference "G & D., 1997" has been properly formatted as "Pennetta & Pauli, 1997".

      ・On page 6, lines 119 and 123, "cell" has been corrected to "cells" to ensure grammatical accuracy.

      ・On page 14, line 315, the typo "Still" has been corrected to "Stil".

      Reviewer #3 (Significance (Required)):

      The significance of the work lies in characterizing a previously unknown function of Stil. By showing that Stil acts to repress transcription of the cell death gene rpr, the authors provide new insights into how programmed cell death is regulated in the Drosophila female germline. Readers interested in reproductive biology, cell death, chromatin, and general developmental biology will find value in these new findings.

      One thing to consider is the possibility that Stil represses rpr in the context of the double strand breaks that form during meiosis. Experiments in the paper indicate that stil knockdown results in TUNEL labeling in region 2A/2B of the germarium. The authors should consider co-labeling for a meiosis marker (C(3)G or gammaH2Av) to see if this PCD correlates with this expression. In addition, they could test whether loss of Spo11 (mei-W68) suppresses stil phenotypes during early germ cell development. Relating the function of Stil to repression of cell death during this critical time of germ cell development would elevate the impact and significance of the paper. However, this may be considered beyond the scope of the current study.

      Response:

      We deeply thank the reviewer for this insightful and thought-provoking suggestion.

      As suggested, we conducted co-staining with γH2Av (a DSB marker), as well as genetic interaction experiments with Spo11 (mei-W68) mutants, to address this question; the results are shown below. In region 2 across all genotypes, including the y w control and stil heterozygous and homozygous ovaries expressing P35, γH2Av signals were discernible and were subsequently lost in region 3 through the meiotic recombination-specific DNA repair program (Additional Figure A). In stil mutants, however, an additional strong γH2Av signal was specifically observed in the oocyte, beyond the expected meiotic pattern. Furthermore, loss of meiotic recombination factors, including mei-W68, in stil mutants partially rescued the germline loss phenotype, although not to the same extent as in rpr mutants (Additional Figure B, C: 43.5 % in mei-W68-GLKD, 23.9 % in mei-P22P22 and 12.8 % in vilya826 versus 100 % with loss of rpr in Fig. 3E, F of the revised manuscript). These findings suggest that accumulation of meiotic DSBs is not the main cause of rpr upregulation in stil mutants. We feel that these analyses are beyond the scope of the current study, which focuses on identifying Stil as a transcriptional repressor of rpr and characterizing its role in germline apoptosis. Elucidating other mechanisms that elevate rpr expression in stil mutants will be the focus of future work. Hence, we are providing these data here for the reviewer's reference, but if the reviewer prefers, we would be happy to incorporate them into the manuscript.

      Additional Figure (A) Immunostaining of ovarioles from y w, stilEY16156/CyO; P35 OE (NGT40; NosGal4-VP16> P35), and stilEY16156; P35 OE flies with antibodies against the DNA double-strand break marker γH2Av (green) and Vasa (red), and with DAPI (blue). Insets show enlarged views of egg chambers. White dots indicate oocyte nuclei. Scale bars: 50 μm (ovariole) and 20 μm (egg chamber). (B) Immunofluorescence of Vasa (red) and DAPI (blue) in ovaries from stilEY16156, stilEY16156; mei-W68-GLKD (driven by NGT40; NosGal4-VP16), stilEY16156; mei-P22P22, and stilEY16156; vilya826. Scale bar: 50 μm. (C) Quantification of the percentage of ovarioles containing germline cells in 2-3-day-old females. The genotypes of females are indicated below the x-axis, and the number of germaria analyzed is shown above each bar. Error bars represent the standard deviation (s.d.).

      Akkouche, A., Mugat, B., Barckmann, B., Varela-Chavez, C., Li, B., Raffel, R., Pélisson, A. & Chambeyron, S. (2017). Piwi Is Required during Drosophila Embryogenesis to License Dual-Strand piRNA Clusters for Transposon Repression in Adult Ovaries. Molecular Cell, 66(3), 411-419.e4. https://doi.org/10.1016/j.molcel.2017.03.017

      Greil, F., Kraan, I. van der, Delrow, J., Smothers, J. F., Wit, E. de, Bussemaker, H. J., Driel, R. van, Henikoff, S. & Steensel, B. van. (2003). Distinct HP1 and Su(var)3-9 complexes bind to sets of developmentally coexpressed genes depending on chromosomal location. Genes & Development, 17(22), 2825-2838. https://doi.org/10.1101/gad.281503

      Röper, K. & Brown, N. H. (2004). A Spectraplakin Is Enriched on the Fusome and Organizes Microtubules during Oocyte Specification in Drosophila. Current Biology, 14(2), 99-110. https://doi.org/10.1016/j.cub.2003.12.056

      Torres-Campana, D., Horard, B., Denaud, S., Benoit, G., Loppin, B. & Orsi, G. A. (2022). Three classes of epigenomic regulators converge to hyperactivate the essential maternal gene deadhead within a heterochromatin mini-domain. PLoS Genetics, 18(1), e1009615. https://doi.org/10.1371/journal.pgen.1009615

      Xie, T. & Spradling, A. C. (1998). decapentaplegic Is Essential for the Maintenance and Division of Germline Stem Cells in the Drosophila Ovary. Cell, 94(2), 251-260. https://doi.org/10.1016/s0092-8674(00)81424-5

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This manuscript describes the characterization of stand still (stil), a previously identified gene needed for germ cell survival in Drosophila. The molecular function of Stil has until now remained poorly understood. This new work shows that loss of stil results in reaper (rpr)-dependent apoptosis within female germ cells. Loss of rpr suppresses many of the phenotypes observed in stil mutants. Experiments performed using Drosophila cell culture suggest that Stil binds to elements within the rpr promoter. DamID and structure/function experiments indicate that Stil likely directly represses the transcription of rpr within germ cells.

      In general, the experiments are well executed, and the data largely support the basic claims of the authors. Replicates are included and appropriate statistical analyses have been provided. The text and figures are clear and accurate. Appropriate references were cited. There are a few things the authors should address or rephrase before publication.

      On page 9, lines 190-192, the authors state "Altogether, these findings indicate that the loss of stil function not only triggers apoptosis that can be suppressed by apoptosis inhibitors but also causes defects in oogenesis progression that are not rescued by blocking cell death." Failure to rescue defects during mid-oogenesis could be due to insufficient transgene expression. Indeed, loss of rpr appears to rescue the fertility of stil mutants. The conclusions of this section should be restated.

      The authors should present the overlap between genes that change expression in a stil mutant and those that the DamID experiments indicate are directly bound by Stil protein. DamID can sometimes give spurious results depending on expression levels. Further discussion of this point is necessary.

      For structure function experiments, a western blot showing expression levels of the different transgenes in ovaries should be included.

      Individual data points should be shown in each graph in place of simple bar graphs. This type of presentation was inconsistent throughout the paper.

      Reference "G & D., 1997" should be properly formatted. Page 6 line 117 and 121- a couple of instances where "cell" should be "cells" Page 14 line 304- typo "Still"

      Referee cross-commenting

      I also agree with the points raised by the other two reviewers. I think we are in general agreement on the strengths and weaknesses of the study.

      Significance

      The significance of the work lies in characterizing a previously unknown function of Stil. By showing that Stil acts to repress transcription of the cell death gene rpr, the authors provide new insights into how programmed cell death is regulated in the Drosophila female germline. Readers interested in reproductive biology, cell death, chromatin, and general developmental biology will find value in these new findings.

      One thing to consider is the possibility that Stil represses rpr in the context of the double strand breaks that form during meiosis. Experiments in the paper indicate that stil knockdown results in TUNEL labeling in region 2A/2B of the germarium. The authors should consider co-labeling for a meiosis marker (C(3)G or gammaH2Av) to see if this PCD correlates with this expression. In addition, they could test whether loss of Spo11 (mei-W68) suppresses stil phenotypes during early germ cell development. Relating the function of Stil to repression of cell death during this critical time of germ cell development would elevate the impact and significance of the paper. However, this may be considered beyond the scope of the current study.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this well-executed study, Matsui et al. investigate how the female Drosophila germline prevents inappropriate apoptosis during development. They identify stand still (stil) as a key germline-specific repressor of apoptosis. Stil mutant flies are homozygous viable but female sterile due to widespread germ cell loss at the time of eclosion, which is driven by activation of the pro-apoptotic gene reaper (rpr) and caspase-dependent cell death. Germline-specific expression of anti-apoptotic factors such as p35 can rescue this phenotype, confirming that the defect lies in apoptotic regulation. The authors show that Stil directly represses rpr transcription through its BED-type zinc finger domain. Notably, undifferentiated germline cells remain resistant to apoptosis in the absence of stil, which the authors attribute to a silenced chromatin state at the rpr locus, marked by H3K9me3. These findings support a dual mechanism of protection: transcriptional repression of rpr by Stil, and a potential parallel chromatin-based silencing mechanism operating specifically in undifferentiated cells.

      Major Issues:

      1. Clarify cell identity in Figure 2E: It is unclear whether the apoptotic cells shown are somatic or germline in origin. Including a somatic marker such as 1B1 would allow the reader to clearly distinguish the apoptotic population and better interpret the figure.
      2. Quantification of undifferentiated cells in mutants: There appears to be inconsistency in the representation of undifferentiated germ cells across figures. Early panels show near-complete germline loss, while later analyses focus on undifferentiated cells that are reportedly apoptosis-resistant. The authors should quantify the proportion of ovarioles retaining undifferentiated cells and present this data in Figure 1 or the supplements to resolve this discrepancy.
      3. Interpretation of chromatin state at the rpr locus: The claim that H3K9me3, but not H3K27me3, marks the rpr locus is not fully convincing given the low ChIP-seq signal shown. Including a comparison to a known positive control locus would strengthen the argument. Alternatively, the authors could broaden the discussion to include global chromatin reorganization during the germ cell to maternal transition, as reported in Kotb et al., 2024, and how such changes may impact rpr accessibility. Also, stil mutants rescued with P53 have a "string of pearls" phenotype that is associated with germ cell to maternal transition defects (Figure S3, p53 OE).
      4. Broader analysis of rpr regulation in somatic cells: It would be informative to examine publicly available chromatin or transcriptional data for the rpr locus in somatic ovarian cells. This could help clarify whether rpr regulation by Stil is truly germline-specific or reflects broader developmental trends. This will also clarify why the flies are homozygous viable but female sterile.

      Referee cross-commenting

      I agree with the assessment of the other two reviewers. I think Reviewer 3's point about "the overlap between genes that change expression in a stil mutant and those in which the DamID experiments indicate are directly bound by Stil" is important and needs to be addressed.

      Significance

      This study provides important insight into how germline cells in Drosophila evade apoptosis through both transcriptional and chromatin-based regulation. While reaper is a well-known effector of apoptosis, the identification of stil as a direct repressor in the female germline adds a new layer of cell type-specific control. The authors also delineate an epigenetic mechanism that protects undifferentiated germline cells, highlighting stage-specific differences in apoptotic susceptibility. This dual mechanism is conceptually significant and expands our understanding of how cell survival is maintained during gametogenesis. However, the precise novelty of stil relative to other rpr regulators could be articulated more clearly, and some data interpretations would benefit from additional clarification.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This work by Matsui et al. examines the function of the gene stand still (stil) in Drosophila in regulating germ cell death in the female germline. They show that stil mutants contain many apoptotic cells, leading to germ cell loss and infertility. Gene expression analysis showed upregulation of pro-apoptotic genes such as rpr in stil mutants. A DamID experiment further showed that Stil binds to the rpr promoter region to repress its expression. Additionally, they show that undifferentiated germ cells are resistant to cell death in stil mutants (although stil mutants still eventually lose all germ cells).

      Major comments: Overall, the experiments adhere to a general standard of rigor, and each result is fairly convincing. In that sense, this paper warrants publication as a paper that reveals a new gene important for preventing germ cell death. With that said, I feel that this paper does not reveal a new biological insight. In a nutshell, this paper is about a transcriptional repressor of a pro-apoptotic gene, whose depletion therefore leads to cell death. The data are solid and the conclusion is well supported. But readers will be left wondering why nature implemented such control. Unless one can show what kind of defects the stil rpr double mutant (which rescues the germ cell loss phenotype) exhibits, there is no insight into why the balance between the pro-apoptotic gene and its repressor is important. The paper discusses the 'molecular' mechanisms that explain the phenomenon, but it does not provide insights. The lack of conceptual advancement is the limitation of this work.

      Minor comments: Although this is a minor point, and this is not specifically pointing a finger at the author of this paper, I really don't like the term 'safeguard'. This term is now overutilized to add hype to papers, when 'is necessary' is sufficient. In this case, unless the answer is provided as to 'against what stil is safeguarding germ cells', this term is not meaningful. For example, if one can show that stil specifically senses germline-specific threat and tweaks the regular apoptotic pathway based on germline-specific needs, then the term 'safeguard' may be warranted.

      Referee cross-commenting

      I also agree with other reviewers.

      Significance

      As I summarized above, as is, this manuscript's impact is limited to identifying a gene that is required to prevent germ cell death.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Point-by-Point Response to Reviewers for Manuscript #RC-2024-02720

      Manuscript Title: Molecular and Neural Circuit Mechanisms Underlying Sexual Experience-dependent Long-Term Memory in Drosophila.

      Corresponding Author: Woo Jae Kim

      We extend our sincere gratitude to the Managing Editor and both reviewers for their diligent and insightful evaluation of our manuscript. The comprehensive feedback provided has been invaluable, guiding us to significantly strengthen the manuscript's scientific rigor, logical cohesion, and overall impact. We have undertaken a substantial revision, incorporating new experimental evidence, reframing the central narrative, and improving data presentation to address all concerns raised.

      The major revisions include:

      1. New Experimental Evidence: We have performed three new sets of experiments to address key questions raised by the reviewers. First, we used the protein synthesis inhibitor cycloheximide to pharmacologically validate that the observed memory is indeed a form of long-term memory (LTM). Then, we performed genetic intersectional analyses to determine if the identified Yuelao (YL) neurons express the canonical sex-determination transcription factors doublesex (dsx) and fruitless (fru).
      2. Narrative Reframing and Logical Restructuring: We fully agree with the reviewers that the logic of the original manuscript was confusing, particularly regarding the distinction between the broad Mushroom Body (MB) Kenyon Cell (KC) population and the specific YL neurons. The manuscript has been extensively rewritten to present a clear, hypothesis-driven narrative. We now frame the initial KC-related findings as part of a broader screening effort that logically led to the identification and focused investigation of the YL neuron circuit.
      3. Refined Central Claim: Guided by the reviewers' feedback and our new data, we have sharpened our central claim. We now propose that YL neurons constitute a critical circuit for forming attractive taste- and pheromone-based memories derived from Gr5a neuronal inputs. This form of appetitive memory is distinct from the previously characterized internal reward state associated with ejaculation, adding a new layer to our understanding of how male flies remember and evaluate reproductive experiences.
      4. Improved Data Quality and Analysis: In response to valid critiques, all imaging figures have been replaced with high-resolution versions. Furthermore, our methods for fluorescence quantification, particularly for the TRIC calcium imaging experiments, have been corrected to include normalization against an internal reference channel, adhering to established best practices. All requested genetic control experiments have been performed.

      We are confident that these comprehensive revisions have fully addressed all concerns and have transformed our manuscript into a much stronger, more focused, and logically sound contribution. We thank you again for the opportunity to improve our work and look forward to your evaluation of the revised manuscript.

      Responses to Reviewer #1

      General Comments: This study explores the molecular and neural circuitry mechanisms underlying sexual experience-dependent long-term memory (SELTM) in male Drosophila. The authors use behavioral, imaging, and bioinformatics approaches to identify YL neurons, a subset of mushroom body (MB) projecting neurons, as crucial for SELTM formation. They propose that YL neurons receive inputs from WG neurons via the sNPF-sNPFR pathway and implicate molecular players such as orb2, fmr1, MDAR2-CaMK, and synaptic plasticity in their function.

      However, the evidence presented does not adequately support the authors' claims. The data fail to cohesively tell a logical story, and key conclusions appear to be based on assumptions and correlations rather than robust evidence.

      • Answer: We are deeply grateful to both reviewers for their thorough and constructive evaluation of our manuscript. Their collective feedback has been instrumental in helping us to clarify the study's rationale, strengthen our interpretations, and significantly improve the overall quality and impact of the work. We appreciate the recognition of our study's potential to advance the understanding of how sexual experience modifies future mating behaviors and to elucidate the neuronal and molecular mechanisms of how memory regulates a key sexual behavior in male Drosophila.

      • In response to the general comments, we have undertaken a major revision of the manuscript to improve the clarity, logic, and presentation. We have rewritten the Abstract and Introduction to more clearly define "sexual experience-dependent long-term memory" (SELTM) and articulate its significance in the context of adaptive decision-making and interval timing. The entire manuscript has been restructured to present a more logical, hypothesis-driven narrative that clearly distinguishes our initial broad screening from the focused investigation of the YL neuron circuit. We have also incorporated alternative interpretations of our data, particularly regarding the role of the YL circuit in regulating baseline mating duration in naive males, which has added more depth to the study. Finally, all figures have been remade in high resolution, and all requested genetic controls and methodological clarifications have been added to ensure rigor and reproducibility. We are confident that these revisions have addressed the reviewers' concerns and have resulted in a much stronger manuscript.

      Comment 1: The study identifies the knowledge gap (lines 103-104) but fails to integrate relevant literature, particularly Shohat-Ophir et al., Science (2012), and Zer-Krispil et al., Curr Biol (2018). These studies established that ejaculation induces appetitive memory in male Drosophila via corazonin and NPF neurons. The current study does not provide direct evidence that the "act of mating itself" drives SELTM, as it includes both courtship and copulation.

      Response: Thank you for highlighting these two landmark studies. We fully agree that Shohat-Ophir et al., Science (2012) and Zer-Krispil et al., Curr Biol (2018) were pivotal in demonstrating that ejaculation—and the accompanying corazonin/NPF signalling—can establish an appetitive memory in males.

      In the revised manuscript we have now integrated both papers on lines 111-118:

      “Previous work has shown that successful copulation is intrinsically rewarding to male Drosophila: a single mating encounter elevates brain neuropeptide F (NPF) levels and suppresses subsequent ethanol preference [19]. Importantly, Zer-Krispil et al. further demonstrated that ejaculation itself, artificially induced by optogenetic activation of corazonin (Crz) neurons, is sufficient to mimic this reward state, driving appetitive memory formation and up-regulation of NPF. These findings indicate that the act of ejaculation, rather than the entire courtship sequence, is the critical sensory event that gates post-mating reward.”

      Comment 2: The nature of the observed long-lasting reduced mating duration requires clearer characterization: Is this an associative memory or experience-dependent behavioral plasticity? Can the formation of this long-term memory be blocked by protein synthesis inhibitors, such as cycloheximide?

      Response: We thank the reviewer for this excellent suggestion to pharmacologically characterize the nature of the memory. To definitively test whether the observed SMD is a form of protein synthesis-dependent long-term memory (LTM), we performed a new experiment as suggested.

      We have now included data in new Figure supplement 1I showing that feeding males the protein synthesis inhibitor cycloheximide (CXM) for 24 hours immediately following the sexual experience completely blocks the formation of the long-lasting SMD phenotype. Control flies fed a vehicle solution exhibited robust SMD. This result provides strong evidence that SELTM is not merely a form of transient behavioral plasticity but is a genuine form of LTM that requires de novo protein synthesis for its consolidation, a hallmark of LTM across species.[1]

      The revised text appears on lines 173-176:

      "To determine whether the persistent reduction in mating duration (SMD) depends on de novo protein synthesis, we fed males the translational inhibitor cycloheximide (CXM). Under this regimen, CXM completely abolished the SMD phenotype (Fig. 1I)."

      Comment 3: While schematics illustrate the working hypotheses, the text lacks detailed explanations, leaving the reader unclear about the rationale behind certain conclusions.

      Response: Thank you very much for this insightful comment. We fully agree that the original manuscript did not provide sufficient textual justification for the conclusions derived from the schematics. In the revised version we have therefore added comprehensive explanations immediately following each figure (or schematic) that explicitly state the underlying rationale, the key observations supporting our hypotheses, and the logical steps leading to each conclusion. We believe these additions now make the reasoning transparent and easy to follow. We appreciate your feedback, which has substantially improved the clarity of our work.

      Comment 4: The logic to draw certain conclusions was confusing and misleading.

      - For instance, the role of orb2 in SELTM is examined via knockdown in MB Kenyon cells (KCs) (using ok107>orb2-RNAi), which is irrelevant to the claim that orb2 functions in YL neurons. Additionally, RNAseq analyses (Fig. 1N-S) focusing on orb2 expression in a/b KCs are irrelevant to and cannot support the claim that Orb2 functions in YL neurons.

      - Similarly, the claim (lines 302-303) that sNPF-R expression is exclusive to MB KCs conflicts with data showing effects when sNPF-R is knocked down in YL neurons. How can knocking-down a gene, which is exclusively expressed in neural population A, in neural population B affect a phenotype? This inconsistency undermines the interpretation of the results.

      - Other examples include lines 223-227 and lines 246-249. It is very confusing how the authors came to the indications.

      - The authors also kept confusing the readers and themselves by mistakenly referring to MB KC a-lobe and YL a-lobe projection. They may know the difference between the two neural populations but they did not always refer to the right one in the text.

      Response: We agree completely with the reviewer that the logic in the original manuscript was confusing and failed to clearly distinguish between the general MB Kenyon Cell (KC) population and the specific YL projection neurons. This was a major flaw, and we are grateful for the opportunity to correct it. We have undertaken a major revision of the manuscript's narrative and structure to present a clear, logical progression of discovery.

      The new logical flow of the manuscript is as follows:

      1. We first establish that sexual experience induces a robust, long-lasting SMD behavior that is dependent on protein synthesis.
      2. We then perform initial experiments to implicate the MB as a key brain region. We show that broad inhibition of MB KCs (using the ok107-GAL4 driver) disrupts SMD behavior. This result establishes the general involvement of the MB but lacks cellular specificity.
      3. The remainder of the manuscript then focuses specifically on dissecting the molecular and cellular properties of the YL neurons.

      Finally, we have meticulously edited the entire manuscript to ensure that we always use precise terminology, clearly distinguishing between "YL neuron projections to the MB α-lobe" and the "MB KC α-lobe."

      Comment 5: The imaging figures provided are unfocused and poorly resolved, making it difficult to assess data quality.

      - Colocalization analyses of orb2 and YL are unconvincing... Maximum intensity projection images are insufficient... complete image stacks with staining of orb2, YL, and KCs (MB-dsRed) are needed for validation.

      - Quantification of imaging data appears flawed. For example, claims of orb2 and CaMKII upregulation in MB a-lobe projections (e.g., Fig. S2F-J, Fig. 3M,N) are confounded by widespread increases in intensity across the brain, lacking specificity.

      - The TRIC experiment analysis should normalize GFP signals to internal reference channel (RFP in the TRIC construct)...

      - In Fig. 6H-J, methods for counting synapse numbers are not described. How are synapse numbers counted in these low-resolution images?

      Response: We sincerely apologize for the poor quality of the imaging data presented in the original manuscript. We agree with the reviewer's critiques and have taken comprehensive steps to rectify these issues.

      • Image Quality: We apologize for not including the full image data in the original submission. The complete figure is now presented in revised Fig. 2J.
      • Fluorescence Quantification: The fluorescence quantification has been re-analyzed. The Methods section now includes a detailed description of our protocol.
      • TRIC Normalization: We apologize for not stating this explicitly in the previous version. As now described in the revised Methods subsection “Quantitative Analysis of Fluorescence Intensity”, all TRIC images were acquired with identical laser power and exposure settings. The GFP signal was background-corrected and then normalized to the RFP fluorescence encoded by the TRIC construct itself (UAS-mCD8RFP), which serves as an internal reference for construct expression and mounting thickness.
      • Synapse Counting: We agree with the reviewer that the resolution of our images was insufficient for accurate synapse particle counting. We have therefore removed the problematic analysis from the former Fig 6H-J. Our conclusions regarding synaptic plasticity now rest on the more robust and quantifiable data showing a significant increase in the total area of dendritic (DenMark) and presynaptic (syt.eGFP) markers.

      Comment 6: The study presents data from unrelated learning paradigms (e.g., olfactory associative learning, courtship conditioning; Fig. 7) without justifying how these paradigms relate to SELTM. Particularly, the authors claimed that SELTM is related to Gr5a, which leads to appetitive memories, which involve PAM dopaminergic neurons and MB horizontal lobes. However, the olfactory associative learning with electric shock and courtship conditioning lead to aversive memories, that involve PPL1 dopaminergic neurons and the vertical lobes.

      Response: We thank the reviewer for requesting clarification on the rationale for including these experiments. The purpose of these assays was to test the specificity of the YL neuron circuit. A key question is whether YL neurons represent a general-purpose LTM circuit or one specialized for a particular memory modality.

      The data show that knockdown of Orb2 or Nmdar2 specifically in YL neurons has no effect on the formation of LTM for aversive olfactory conditioning or aversive courtship conditioning. These negative results are critically important, as they demonstrate that the YL circuit is not required for all forms of LTM. This finding strongly supports our revised central claim that YL neurons are specialized for processing appetitive memories derived from the specific sensory context of mating (i.e., taste and pheromonal cues from Gr5a neurons).

      To improve the narrative flow of the main text, we reordered the text. The relevant description is in lines 398-401:

      “To determine whether YL neurons constitute a general LTM circuit or are dedicated to the appetitive context of mating, we tested two canonical aversive paradigms: electric-shock olfactory conditioning and courtship conditioning. If YL neurons serve as a universal LTM module, their genetic impairment should also impair aversive memory.”

      lines 469-472:

      “The inability of YL perturbation to impair aversive memories (Fig. 7) corroborates that this micro-circuit is dedicated to Gr5a-dependent SELTM rather than acting as a generic LTM hub”

      Minor Issues

      Comment 1: Fig 2F. YL projections are labeled as MBONs. Clarify whether YL neurons are the upstream or downstream (MBON) of KCs.

      Response: Thank you for this helpful comment. As noted by Huang et al., 2018[2] (Nat. Commun. 9:872), the MB093C-GAL4 driver labels the MBON-α3 mushroom body output neuron. Consequently, YL neurons are positioned downstream of the MBON-α3.

      We have now clarified this point in the revised manuscript lines 217-222:

      “Each of these neurons extends a vertical fiber to the dorsal brain region, where they form dense arbors within the α-lobes of the mushroom body. Because the MB093C-GAL4 driver labels MBON-α3 output neuron[51], these YL arbors are positioned postsynaptically within the α-lobe and relay mushroom-body output to the anterior, middle, and posterior superior-medial protocerebrum.”

      Comment 2: Extensive language polishing is required, as several sentences are unclear (e.g., lines 169-172).

      Response: We apologize for the lack of clarity in the original text. The entire manuscript has undergone extensive revision and professional language editing to improve readability, precision, and grammatical accuracy.

      Responses to Reviewer #2


      Major Comments

      Comment 1: Clearer articulation of the rationale, motivation, and significance of the overall study design and individual experiments can strengthen the manuscript and promote readership. For example, the beginnings of the abstract and introduction should define what authors mean by sexual experience-dependent long-term memory and its significance (including why it is "significant for reproductive success" (lines 46 and 92)). Similarly, employing more concrete language throughout the text will help anchor and contextualize the study. Interpretation is occasionally insufficient or does not follow directly from the data provided.

      Response: We thank the reviewer for this valuable advice. We agree that the motivation and significance of our study were not articulated clearly enough. We have rewritten the Abstract and the beginning of the Introduction to address this. The revised text now explicitly defines SELTM as a protein synthesis-dependent, appetitive memory formed in response to gustatory and pheromonal cues. We explain its significance in the context of adaptive behavior, linking it to interval timing, a process by which male flies strategically adjust their mating investment (i.e., mating duration) based on prior experience to optimize reproductive success and energy expenditure. This framing provides a clearer context for our investigation into its underlying neural and molecular mechanisms.

      Comment 2: Long term memory: I do not work on Drosophila memory, but a cursory search suggests that the field generally considers long term memory in Drosophila to last for 24 hr to days (courtship memory lasts for >24 hr). SMD decays between 12-24 hr after copulation. Could SMD be considered a short-term effect?

      Response: This is an important point of clarification. As described in our response to Reviewer #1 (Major Comment 2), we have performed a new experiment demonstrating that the formation of SMD is blocked by the protein synthesis inhibitor cycloheximide (Figure 1I). This dependence on de novo protein synthesis is a defining characteristic of LTM, distinguishing it from short- and intermediate-term memory forms.[1] Memories lasting 12-24 hours are also well established as forms of LTM.[3] Therefore, based on both its duration and its molecular requirements, SMD represents a bona fide form of LTM.

      The relevant statement is in lines 174-178:

      "To determine whether the persistent reduction in mating duration (SMD) depends on de-novo protein synthesis, we fed males the translational inhibitor cycloheximide (CXM). Under this regimen, CXM completely abolished the SMD phenotype (Fig. 1I). This finding suggests that the reduction in mating investment is contingent upon the formation of LTM."

      Comment 3: Fig 1B-E share the same control (naive) group. If these experiments were performed in the same replicate(s), they should be plotted in the same figure. If not, please provide more details on how experimental blocks were set up and how controls compared between replicates.

      Response: Thank you for this helpful suggestion. We understand that sharing the same naive control across multiple panels (Fig. 1B–E) may raise concerns about data independence. However, we chose to present these panels separately for the following reasons:

      1. Clarity and Readability: Each panel (B–E) represents a distinct temporal condition (0 h, 6 h, 12 h, 24 h post-isolation). Separating them avoids visual clutter and allows readers to focus on one time point at a time, improving interpretability.

      2. Consistency with Internal Controls: Although the naive group is identical across panels, each experimental block (i.e., each isolation time point) was run independently on the same days, with internal controls (naive vs. experienced) included in every block. This ensures that statistical comparisons remain valid within each panel, even if the naive data overlap.

      We have now added a clear statement in the figure legend explaining that the naive group is shared across panels and that each time point was tested independently with internal controls. This maintains transparency while preserving the visual clarity of the current layout.

      Comment 4: Serial mating (Fig 1F-H): please provide details on the methods. How much time elapsed between successive matings? Is a paired statistical test used? Sperm depletion also affects mating duration, and without this information the authors' conclusion (lines 155-156) does not automatically follow from the data.

      Response:

      1. Interval between successive matings: We have rewritten the Methods to state explicitly that “as soon as one copulation ended the male was transferred immediately to a fresh virgin female, so the next mating began immediately.”

      We have added a new Methods subsection:

      "Serial mating duration assay

      Serial mating duration assay was identical to the standard procedure except that each male was presented with four DF virgin females in immediate succession: upon termination of the first copulation the male was immediately put into a fresh chamber containing the next virgin, the timer was restarted at first contact, and this step was repeated until four complete matings were recorded or 5 min elapsed without initiation, whichever came first."

      2. Statistical test: We apologize for omitting this detail. An unpaired t-test was used: for each male, the mating duration before (naïve) and after sexual experience was recorded, yielding paired observations. Prism’s unpaired t-test module was therefore applied to evaluate the mean difference.

      The figure legend now states “with error bars representing SEM. Asterisks represent significant differences, as revealed by the Unpaired t test and ns represents non-significant difference (**p

      3. Mating duration versus sperm depletion: We apologize for not having made it clear that these two observations are complementary, not contradictory. Previous work has shown that when male Drosophila copulate repeatedly, mating duration remains stable even though the number of sperm transferred, and thus the number of progeny sired, declines progressively [4].

      The revised text is as follows (lines 235-241):

      "Previous work has shown that when male Drosophila copulate repeatedly, mating duration remains stable even though the number of sperm transferred—and thus the number of progeny sired—declines progressively. This dissociation confirms that the constant mating duration we observe in our serial-mating experiment (Fig. 1F–H) is consistent with normal sperm depletion and does not compromise the conclusion that the experience-dependent reduction in mating duration reflects long-term memory."

      Thank you for helping us improve the clarity of our study.
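The reviewer's question about paired versus unpaired testing can be made concrete with a minimal sketch. The numbers below are hypothetical and are not data from the study. The pooled-variance (Student) form of the unpaired test is an assumption, since the response does not say whether Welch's correction was applied. The function names are illustrative only.

```python
import math
import statistics


def unpaired_t(a, b):
    """Student's two-sample t statistic with pooled variance (assumed form)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


def paired_t(before, after):
    """Paired t statistic computed on within-subject differences."""
    if len(before) != len(after):
        raise ValueError("paired data need equal-length samples")
    diffs = [x - y for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se, len(diffs) - 1


# Hypothetical mating durations (min) for the same five males,
# measured before and after sexual experience.
naive = [20.0, 18.0, 22.0, 19.0, 21.0]
experienced = [16.0, 14.5, 17.5, 15.0, 17.0]

t_unpaired, df_unpaired = unpaired_t(naive, experienced)
t_paired, df_paired = paired_t(naive, experienced)
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
```

When the same males are measured twice, the paired statistic removes between-male variability, so the two designs can give different answers on identical numbers.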

      Comment 5: Mating duration assay: Which isolation interval was chosen for the rest of the SMD experiments? The 12 hr en masse mating setup is relatively uncommon among studies on courtship/copulation/post-copulatory phenotypes, and introduces uncertainty and variability in the number and timing of matings that occurred during the 12 hr-window. This source of variability and its implication in interpreting the data should be acknowledged. Moreover, the 3 studies referenced in the methods all house males in groups of 4, whereas this study uses groups of 40. Could density confound the manifestation of SMD?

      Response: We thank the reviewer for these important methodological questions.

      • Isolation Interval: We have clarified in the Methods that virgin females were introduced into vials for the last 1 day before the assay.
      • Housing Density: This is an excellent point. To control for any potential effects of housing density itself, we have clarified that our "naive" control males are also housed in groups of 40 for the same duration as the "experienced" males. Therefore, the only difference between the two groups is the presence of females, isolating the effect of sexual experience from the effect of social density.

      Comment 6: SMD behavior: comparing orb2 mutants and controls (Fig 1M and Fig S1K-L), loss of orb2 actually reduces the mating duration in native males (mean ~15 min) relative to controls (~20 min), and have possibly no effect on experienced males (~15 min). This is inconsistent with the SMD behavior demonstrated in Fig 1B-E. The same pattern is found for mushroom body silencing (Fig 1P, Fig S1M-N), orb2 knockdown in YL neurons (Fig 2D, Fig S2A-B), Fmr1 knockdown in YL neurons (Fig 3D, Fig S2B, S3D) and most other experiments where mating duration is not significantly different between naive and experienced males. This might demonstrate a separate role of YL neurons and its related circuit in regulating mating duration in naive males. Could the authors discuss this interpretation? As an aside, plotting genetic controls next to experimental groups is customary and facilitates comparisons between relevant groups.

      Response: Thank you very much for this insightful observation.

      1. Baseline differences among genotypes: We agree that absolute mating duration differs slightly between genotypes (e.g. naive orb2∆/+ about 15 min vs. wild-type CS about 20 min). Such differences are common when mutations or transgenes are introduced into distinct genetic backgrounds, and they do not affect the within-genotype comparison that is the essence of SMD (sexual-experience-dependent shortening of mating duration). Therefore, for every experiment we compared naive vs. experienced males of the identical genotype, keeping all other variables constant.

      2. Consistency of SMD across figures: In every manipulation that disrupts SMD memory (orb2∆, MB silencing, orb2-RNAi in YL neurons, Fmr1-RNAi in YL neurons, etc.) the naive–experienced difference disappears, whereas the genetic controls retain a significant ΔMD. This is fully consistent with Fig. 1B–E and demonstrates that the memory trace, not the basal duration, is abolished.

      3. Figure layout: Following your suggestion, we have re-ordered all bar graphs so that the relevant genetic controls are placed immediately adjacent to the experimental groups, making within-panel comparisons easier.

      We hope these clarifications and adjustments address your concerns.

      Comment 7: Bitmap figures: unfortunately the bitmap figures are compressed and their resolution makes it difficult to evaluate the visual evidence.

      Response: We apologize for the poor quality of the figures. All figures in the revised manuscript, including the scRNA-seq plots, have been remade as high-resolution vector graphics to ensure clarity and detail. To aid interpretation, colour-coded illustrations are also placed next to the scRNA-seq plots.

      Comment 8: Sexual dimorphism of YL neurons: many neurons involved in sexual behaviors express dsx and/or fru. Do YL neurons express them?

      Response: This is an excellent question. To address it, we performed a new set of experiments using genetic intersectional tools to test for the expression of doublesex (dsx) and fruitless (fru) in YL neurons. Our analysis, presented in figure supplement 2B, reveals that YL neurons are indeed fru-negative and dsx-negative. We therefore conclude that YL neurons do not belong to the canonical fru- or dsx-expressing neuronal classes and are unlikely to be intrinsically sex-specific.

      We have added an explanation in lines 223-229:

      "Our further analysis confirmed the presence of only three pairs of nuclei near the SOG in male brains, whereas female brains exhibit a greater number of nuclei near the AL (Fig. 2I), suggesting subtle sexual dimorphisms in GAL4MB093C-expressing neurons. Importantly, these neurons do not overlap with either fru- or dsx-expressing cells: co-immunostaining for GFP and Fru or Dsx revealed almost no colocalization in any brain region examined (Fig. S2B), indicating that YL neurons are distinct from the canonical sex-specific fru/dsx circuits."

      Comment 9: Genetic controls for some crucial experiments are not provided, e.g. Fig 2J, Fig S3C, Fig S3E-F Fig 5B-C, F, Q-R, Fig S5A-E.

      Response: We thank the reviewer for their careful attention to detail. We have now performed all the missing genetic control experiments.

      Comment 10: Colocalization experiments: please provide more detail on how fluorescence is normalized for each channel across images, especially when the overall expression of an effector is up- or down-regulated after mating.

      Response: We have updated the Methods section under "Quantitative Analysis of Fluorescence Intensity" and "Colocalization Analysis" to provide a detailed description of our normalization procedure.

      Comment 11: Please resolve this apparent contradiction on the expression of Nmdar1 and 2 in YL neurons. On line 261: "both receptors co-expressing in Orb2-positive MB Kenyon cells"; on line 279-281 "Nmdar1 is not expressed with YL neurons [...] whereas Nmdar2 is expressed in a single pair of YL neurons..."

      Response: We apologize for this contradiction, which arose from the confusing narrative structure of the original manuscript. As detailed in our response to Reviewer #1 (Major Comment 4), we have reframed the manuscript.

      Comment 12: Particle analysis (Fig 6H-J): experienced males seem to have more synapses but trend towards smaller average size. It would be helpful to show number of synapses and average size as paired data, or show that the total particle area is larger in experienced males.

      Response: We agree with the reviewer that this analysis was inconclusive and potentially misleading due to the limitations of image resolution. As noted in our response to Reviewer #1, we have removed this particle analysis (former Fig 6H-J) from the revised manuscript. Our claim for increased synaptic plasticity is now supported by the more robust measurement of the total fluorescence area of the pre- and postsynaptic markers, which shows a significant increase in experienced males.

      Minor Comments

      We thank the reviewer for their meticulous attention to detail. We have addressed all minor comments as follows:

      Comment 1: 1. Some figures (e.g. Fig 3M-R) and experiments (e.g. oenocyte scRNA-seq) are not referenced in the text. dnc data is shown alongside amn and rut but the rationale for its inclusion is not provided.

      Response: Original Fig. 3M-R (now Fig. 3M-O) was referenced on line 283. The rationale for including dnc data (as a canonical memory mutant) is now clarified in the text on lines 187-189:

      "To ask whether the same molecular machinery underlies the SMD that follows sexual experience, we tested three classical memory mutants: dunce (dnc), amnesiac (amn), and rutabaga(rut)."

      Comment 2: Some references might not point to the intended article (e.g. ref 123).

      Response: The reference list has been checked and corrected.


      Comment 3. Please plot genetic controls next to experimental genotypes as they are a crucial part of the experiment.


      Response: All relevant figures now include plots of genetic controls next to experimental genotypes.

      Comment 4. The "estimation statistics" plots are not necessary since the authors show individual data points. To further enhance data transparency, the authors may consider reducing the alpha and/or dot size so the individual data points are more readily visible.

      Response: Thank you for this helpful suggestion! We fully agree that data transparency is essential. After carefully testing lower alpha values and smaller dot sizes, we found that either change markedly obscured the dense regions of the distributions. We have therefore left the alpha and dot size unchanged.

      The estimation-statistics overlays are kept for two reasons: (i) they provide an immediate visual estimate of the mean difference and its 95% confidence interval, which is the key statistic we base our conclusions on, and (ii) they spare readers from having to cross-reference separate tables.
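For readers unfamiliar with estimation plots, the mean difference and its 95% confidence interval can be approximated with a percentile bootstrap. The sketch below is illustrative only: the data are hypothetical, the function name is invented for this example, and the percentile method is an assumption, because the response does not state which software or bootstrap variant the authors used.

```python
import random
import statistics


def bootstrap_mean_difference(control, treated, n_resamples=5000, seed=0):
    """Return (observed difference, lower, upper) for a 95% percentile bootstrap CI."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistics.mean(treated) - statistics.mean(control)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group independently, with replacement.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        diffs.append(statistics.mean(t) - statistics.mean(c))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return observed, lower, upper


# Hypothetical mating durations in minutes (not data from the study).
naive = [20.1, 19.4, 21.3, 18.7, 20.8, 19.9, 22.0, 20.5]
experienced = [15.2, 16.1, 14.8, 15.9, 16.4, 15.0, 14.5, 15.7]

diff, lo, hi = bootstrap_mean_difference(naive, experienced)
print(f"mean difference = {diff:.2f} min, 95% CI [{lo:.2f}, {hi:.2f}]")
```

A confidence interval that excludes zero conveys the same conclusion as a significant test while also showing the size of the effect.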


      Comment 5. For accessibility, please avoid using green and red in the same plot.

      Response: We fully agree that red–green combinations can be problematic for colour-vision-impaired readers. In the present manuscript, however, the only panel that juxtaposes pure red and pure green is the Fly-SCOPE co-expression data. These scRNA-seq plots are provided only as supportive reference; the actual quantitative conclusions are based on independent genetic and imaging experiments that use magenta, cyan, yellow, and greyscale palettes. Moreover, the SCope images are accompanied by detailed text descriptions of the overlapping cell clusters, so no essential information is lost even if the colours are indistinguishable.

      Comment 6. Fly Cell Atlas: please show color scales used for each gene as the color thresholds are gene-specific by default. The 3-color overlap on SCope also makes it very difficult to see the expression pattern for each gene. One possibility is outlining the Kenyon cells on the tSNE plots and showing the expression for each gene of interest.

      Response: Thank you for this helpful suggestion. To avoid the ambiguity that arises from RGB blending in the three-colour overlay, we have added a small colour-mixing diagram next to the t-SNE plots (revised Fig. 1). This key shows the exact hues produced by pairwise and three-way overlaps:

      • Red + Green = Yellow

      • Red + Blue = Magenta

      • Green + Blue = Cyan

      • Red + Green + Blue = White

      Thus, yellow, magenta or cyan dots indicate co-expression of two genes, while white dots mark cells where all three genes are detected. This diagram allows readers to interpret overlap colours at a glance without re-entering SCope.
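The colour key above follows standard additive RGB mixing, which can be sketched in a few lines. This is a simplified illustration: it treats each gene as either detected or not, whereas the actual overlay presumably blends continuous intensities. The function and table names are hypothetical.

```python
# Additive RGB mixing: each gene is assigned one channel, and a cell's
# displayed colour is the combination of the channels that are "on".
COLOUR_NAMES = {
    (False, False, False): "black",
    (True, False, False): "red",
    (False, True, False): "green",
    (False, False, True): "blue",
    (True, True, False): "yellow",
    (True, False, True): "magenta",
    (False, True, True): "cyan",
    (True, True, True): "white",
}


def overlay_colour(red_on, green_on, blue_on):
    """Return ((r, g, b), name) for a cell given which channels are detected."""
    key = (bool(red_on), bool(green_on), bool(blue_on))
    rgb = tuple(255 if on else 0 for on in key)
    return rgb, COLOUR_NAMES[key]


# Example: a cell expressing the genes mapped to the red and blue channels.
print(overlay_colour(True, False, True))
```

Because the mapping is fixed, a two-gene overlap can always be told apart from a three-gene overlap by colour alone, which is the point of the added key.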

      Comment 7. Please also refer to Fly Cell Atlas as such. SCope is a visualization platform that houses multiple datasets.

      Response: The reference to Fly Cell Atlas was added.

      Comment 8. Please introduce acronyms and genetic reagents the first time they are mentioned.

      Response: All acronyms and genetic reagents are now defined upon their first use.

      Comment 9. Line 184: please specify "split-GAL4 reagents" instead of "advanced genetic tools".

      Response: We have replaced "advanced genetic tools" with the more specific term "Split-GAL4 reagents."


      Comment 10. Line 187: there are a few other lines with p>0.05 or p>0.01, so "uniquely" is inaccurate. Are the p-values in Table 1 corrected for multiple testing?

      Response: The term "uniquely" has been revised for accuracy. No correction for multiple testing was applied because each entry in Table 1 represents a single pairwise comparison (naive vs. experienced). Thus only one p-value was generated per experiment.

      Comment 11. Some immunofluorescence panels lack scale bars.

      Response: Scale bars have been added to all immunofluorescence panels.


      Comment 12. Fig S2G-I: do authors mean "naive" instead of "group"?

      Response: The term "group" in Fig S2G-I has been corrected to "naive."

      Comment 13. Movie 1 should be referenced when YL neurons are first introduced.

      Response: Movie 1 is now referenced when YL neurons are first introduced in the text.

      Comment 14. Is Fig 4L similar to Fig 6L-N?

      Response: This error was corrected when the article was reformatted.

      Comment 15. Fig 7: please plot olfactory conditioning experiment results as either percentages, preference index, or paired numbers. "Number of flies/tube" is not as informative.

      Response: Thank you for pointing this out. The bars in Fig. 7 indeed represent paired numbers, but we realise this was not stated explicitly. We apologize for the lack of clarity. In the revised manuscript, we explain this in detail in the figure legend and the Methods. In the figure, we have also marked the percentage of flies that chose to avoid the side of the stimulus with gas, and this is explained in the figure legend.




      References

      1. Lagasse F, Devaud J-M, Mery F. A Switch from Cycloheximide-Resistant Consolidated Memory to Cycloheximide-Sensitive Reconsolidation and Extinction in Drosophila. J Neurosci. 2009;29: 2225–2230. doi:10.1523/jneurosci.3789-08.2009
      2. Huang C, Maxey JR, Sinha S, Savall J, Gong Y, Schnitzer MJ. Long-term optical brain imaging in live adult fruit flies. Nat Commun. 2018;9: 872. doi:10.1038/s41467-018-02873-1
      3. Tonoki A, Davis RL. Aging Impairs Protein-Synthesis-Dependent Long-Term Memory in Drosophila. J Neurosci. 2015;35: 1173–1180. doi:10.1523/jneurosci.0978-14.2015
      4. Macartney EL, Zeender V, Meena A, De Nardo AN, Bonduriansky R, Lüpold S. Sperm depletion in relation to developmental nutrition and genotype in Drosophila melanogaster. Evol Int J Org Evol. 2021;75: 2830–2841. doi:10.1111/evo.14373
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Sun et al. show that Orb2-expressing, glutamatergic mushroom body neurons (YL neurons) are central to the "shorter mating duration (SMD)" behavior, where males reduce their mating duration up to 12 hours after the initial copulation. The authors use SMD as a model for understanding sexual experience-dependent long-term memory in males. A few genes implicated in long-term memory (Fmr1, CrebB) are required in YL neurons for SMD. The Nmdar-CaMKII signaling pathways is also implicated, and mating attenuates Ca2+ signaling and increases synaptic plasticity in the mushroom body and subesophageal ganglion.

      Major comments:

      1. Clearer articulation of the rationale, motivation, and significance of the overall study design and individual experiments can strengthen the manuscript and promote readership. For example, the beginnings of the abstract and introduction should define what authors mean by sexual experience-dependent long-term memory and its significance (including why it is "significant for reproductive success" (lines 46 and 92)). Similarly, employing more concrete language throughout the text will help anchor and contextualize the study. Interpretation is occasionally insufficient or does not follow directly from the data provided.
      2. Long term memory: I do not work on Drosophila memory, but a cursory search suggests that the field generally considers long term memory in Drosophila to last for 24 hr to days (courtship memory lasts for >24 hr). SMD decays between 12-24 hr after copulation. Could SMD be considered a short-term effect?
      3. Fig 1B-E share the same control (naive) group. If these experiments were performed in the same replicate(s), they should be plotted in the same figure. If not, please provide more details on how experimental blocks were set up and how controls compared between replicates.
      4. Serial mating (Fig 1F-H): please provide details on the methods. How much time elapsed between successive matings? Is a paired statistical test used? Sperm depletion also affects mating duration, and without this information the authors' conclusion (lines 155-156) does not automatically follow from the data.
      5. Mating duration assay: Which isolation interval was chosen for the rest of the SMD experiments? The 12 hr en masse mating setup is relatively uncommon among studies on courtship/copulation/post-copulatory phenotypes, and introduces uncertainty and variability in the number and timing of matings that occurred during the 12 hr-window. This source of variability and its implication in interpreting the data should be acknowledged. Moreover, the 3 studies referenced in the methods all house males in groups of 4, whereas this study uses groups of 40. Could density confound the manifestation of SMD?
      6. SMD behavior: comparing orb2 mutants and controls (Fig 1M and Fig S1K-L), loss of orb2 actually reduces the mating duration in native males (mean ~15 min) relative to controls (~20 min), and have possibly no effect on experienced males (~15 min). This is inconsistent with the SMD behavior demonstrated in Fig 1B-E. The same pattern is found for mushroom body silencing (Fig 1P, Fig S1M-N), orb2 knockdown in YL neurons (Fig 2D, Fig S2A-B), Fmr1 knockdown in YL neurons (Fig 3D, Fig S2B, S3D) and most other experiments where mating duration is not significantly different between naive and experienced males. This might demonstrate a separate role of YL neurons and its related circuit in regulating mating duration in naive males. Could the authors discuss this interpretation? As an aside, plotting genetic controls next to experimental groups is customary and facilitates comparisons between relevant groups.
      7. Bitmap figures: unfortunately the bitmap figures are compressed and their resolution makes it difficult to evaluate the visual evidence.
      8. Sexual dimorphism of YL neurons: many neurons involved in sexual behaviors express dsx and/or fru. Do YL neurons express them? If they do, they might be a subset of characterized and named dsx/fru neurons.
      9. Genetic controls for some crucial experiments are not provided, e.g. Fig 2J, Fig S3C, Fig S3E-F Fig 5B-C, F, Q-R, Fig S5A-E.
      10. Colocalization experiments: please provide more detail on how fluorescence is normalized for each channel across images, especially when the overall expression of an effector is up- or down-regulated after mating.
      11. Please resolve this apparent contradiction on the expression of Nmdar1 and 2 in YL neurons. On line 261: "both receptors co-expressing in Orb2-positive MB Kenyon cells"; on line 279-281 "Nmdar1 is not expressed with YL neurons [...] whereas Nmdar2 is expressed in a single pair of YL neurons in both male and female brains".
      12. Particle analysis (Fig 6H-J): experienced males seem to have more synapses but trend towards smaller average size. It would be helpful to show number of synapses and average size as paired data, or show that the total particle area is larger in experienced males.

      Minor comments:

      1. Some figures (e.g. Fig 3M-R) and experiments (e.g. oenocyte scRNA-seq) are not referenced in the text. dnc data is shown alongside amn and rut but the rationale for its inclusion is not provided.
      2. Some references might not point to the intended article (e.g. ref 123).
      3. Please plot genetic controls next to experimental genotypes as they are a crucial part of the experiment.
      4. The "estimation statistics" plots are not necessary since the authors show individual data points. To further enhance data transparency, the authors may consider reducing the alpha and/or dot size so the individual data points are more readily visible.
      5. For accessibility, please avoid using green and red in the same plot.
      6. Fly Cell Atlas: please show color scales used for each gene as the color thresholds are gene-specific by default. The 3-color overlap on SCope also makes it very difficult to see the expression pattern for each gene. One possibility is outlining the Kenyon cells on the tSNE plots and showing the expression for each gene of interest.
      7. Please also refer to Fly Cell Atlas as such. SCope is a visualization platform that houses multiple datasets.
      8. Please introduce acronyms and genetic reagents the first time they are mentioned.
      9. Line 184: please specify "split-GAL4 reagents" instead of "advanced genetic tools".
      10. Line 187: there are a few other lines with p>0.05 or p>0.01, so "uniquely" is inaccurate. Are the p-values in Table 1 corrected for multiple testing?
      11. Some immunofluorescence panels lack scale bars.
      12. Fig S2G-I: do authors mean "naive" instead of "group"?
      13. Movie 1 should be referenced when YL neurons are first introduced.
      14. Is Fig 4L similar to Fig 6L-N?
      15. Fig 7: please plot olfactory conditioning experiment results as either percentages, preference index, or paired numbers. "Number of flies/tube" is not as informative.

      Significance

      The manuscript describes an extensive and comprehensive set of experiments aimed at elucidating the role of a subset of mushroom body neurons in mediating a male post-mating sexual behavior, which the authors use as a model for sexual experience-dependent long-term memory. Long-term post-mating responses in females have been well characterized in Drosophila and other insects, but post-mating long-term memory in males is less well understood, despite a few studies reporting its importance in mating success. How males adjust their mating duration based on internal and external cues can reveal insights about decision making and interval timer mechanisms. This study represents a functional advance in understanding the neuronal and molecular mechanisms by which memory and experience regulate a sexual behavior in male Drosophila. Overall, the manuscript would benefit significantly from general editing for clearer articulation of rationale and more appropriate interpretation of the data. Higher-resolution versions of the bitmap figures are also crucial. The SMD experiments invite an alternative interpretation that centers on the role of YL neurons in regulating mating duration in naive males, which, alongside the other roles of the mushroom body demonstrated in this manuscript, could add more depth to the study.

      The findings in this manuscript will be of interest to a specialized audience interested in memory, neural circuits of behavior, and Drosophila sexual behavior. I work on Drosophila sexual behavior and circuits but, lacking experience in memory research, I am not as familiar with the mushroom body and conditioning experiments.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This study explores the molecular and neural circuitry mechanisms underlying sexual experience-dependent long-term memory (SELTM) in male Drosophila. The authors use behavioral, imaging, and bioinformatics approaches to identify YL neurons, a subset of mushroom body (MB) projecting neurons, as crucial for SELTM formation. They propose that YL neurons receive inputs from WG neurons via the sNPF-sNPFR pathway and implicate molecular players such as orb2, fmr1, MDAR2-CaMK, and synaptic plasticity in their function.

      However, the evidence presented does not adequately support the authors' claims. The data fail to cohesively tell a logical story, and key conclusions appear to be based on assumptions and correlations rather than robust evidence.

      Major comments:

      1. The study identifies the knowledge gap (lines 103-104) but fails to integrate relevant literature, particularly Shohat-Ophir et al., Science (2012), and Zer-Krispil et al., Curr Biol (2018). These studies established that ejaculation induces appetitive memory in male Drosophila via corazonin and NPF neurons. The current study does not provide direct evidence that the "act of mating itself" drives SELTM, as it includes both courtship and copulation.
      2. The nature of the observed long-lasting reduced mating duration requires clearer characterization: Is this an associative memory or experience-dependent behavioral plasticity? Can the formation of this long-term memory be blocked by protein synthesis inhibitors, such as cycloheximide?
      3. While schematics illustrate the working hypotheses, the text lacks detailed explanations, leaving the reader unclear about the rationale behind certain conclusions.
      4. The logic used to draw certain conclusions is confusing and misleading.
        • For instance, the role of orb2 in SELTM is examined via knockdown in MB Kenyon cells (KCs) (using ok107>orb2-RNAi), which is irrelevant to the claim that orb2 functions in YL neurons. Additionally, RNAseq analyses (Fig. 1N-S) focusing on orb2 expression in a/b KCs are irrelevant to and cannot support the claim that Orb2 functions in YL neurons.
        • Similarly, the claim (lines 302-303) that sNPF-R expression is exclusive to MB KCs conflicts with data showing effects when sNPF-R is knocked down in YL neurons. How can knocking down a gene that is exclusively expressed in neural population A affect a phenotype when the knockdown is performed in neural population B? This inconsistency undermines the interpretation of the results.
        • Other examples include lines 223-227 and lines 246-249. It is unclear how the authors arrived at these conclusions.
        • The authors repeatedly confuse the MB KC a-lobe with the YL a-lobe projection. They may know the difference between the two neural populations, but the text does not always refer to the correct one.
      5. The imaging figures provided are unfocused and poorly resolved, making it difficult to assess data quality.
        • Colocalization analyses of orb2 and YL are unconvincing, especially given that orb2 is well-documented in literature as expressed in MB a-KCs and YL projection wrapping MB a-lobe. Maximum intensity projection images are insufficient for confirming colocalization; complete image stacks with staining of orb2, YL, and KCs (MB-dsRed) are needed for validation.
        • Quantification of imaging data appears flawed. For example, claims of orb2 and CaMKII upregulation in MB a-lobe projections (e.g., Fig. S2F-J, Fig. 3M,N) are confounded by widespread increases in intensity across the brain, lacking specificity.
        • The TRIC experiment analysis should normalize GFP signals to internal reference channel (RFP in the TRIC construct), as per established protocols in the original paper.
        • In Fig. 6H-J, methods for counting synapse numbers are not described. How are synapse numbers counted in these low-resolution images?
      6. The study presents data from unrelated learning paradigms (e.g., olfactory associative learning, courtship conditioning; Fig. 7) without justifying how these paradigms relate to SELTM. In particular, the authors claim that SELTM is related to Gr5a, which leads to appetitive memories that involve PAM dopaminergic neurons and the MB horizontal lobes. However, olfactory associative learning with electric shock and courtship conditioning lead to aversive memories, which involve PPL1 dopaminergic neurons and the vertical lobes.
      7. Some figures are not referred to in the text. For example, Fig S1 K and L (also, what's the difference between these two figures?) and Fig 3M-R. What is MB-V3 in Fig 4J-K?

      Minor issues

      1. Fig 2F. YL projections are labeled as MBONs. Clarify whether YL neurons are the upstream or downstream (MBON) of KCs.
      2. Extensive language polishing is required, as several sentences are unclear (e.g., lines 169-172).

      Significance

      This study potentially advances our understanding of how sexual experience modifies future mating behaviors. While previous work has shown that mating induces appetitive memory in males, the mechanisms linking this memory to future mating behavior remain poorly understood. This work could provide valuable insights into these mechanisms, pending appropriate revisions.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The manuscript presents IGNITE (Inference of Gene Networks using Inverse kinetic Theory and Experiments), an unsupervised machine learning framework for constructing gene regulatory networks from single-cell RNA sequencing (scRNA-seq) data. IGNITE utilizes a kinetic inverse Ising model to infer gene interactions from binarized expression data and can predict genetic perturbation effects, such as those from knockout experiments. Although the application of inverse Ising models to network reconstruction is not entirely novel, IGNITE's specific implementation and its application to single-cell RNA sequencing data represent a new development. The method is tested on the transition from naive to formative states in murine pluripotent stem cells, a system the authors are highly knowledgeable about, and its performance is compared to state-of-the-art alternative methods.

      Major concerns

      My concern regards the generality of the method, particularly the entire pipeline presented, and the fairness of the performance comparison. These concerns can be easily addressed by the authors by better explaining their choices and their general applicability, and by toning down the conclusions about the comparison with existing inference methods.

      The pre-processing steps are extensive, and their rationale is not always clear, though the results heavily depend on this analysis. Several steps appear to involve arbitrary choices optimized for specific outcomes, potentially introducing biases. The authors should better explain the rationale behind their choices to mitigate these concerns.

      Specifically, part of the pipeline seems to be built to reproduce a specific expression pattern of 24 genes that some of the authors discovered in a previous paper. Although this prior knowledge could be useful and relevant in this specific system, it could limit the generality of the method. For example, the authors selected approximately 2000 genes based on prior knowledge and used a combination of t-SNE and UMAP for dimensionality reduction (although the two techniques have a similar goal). This specific combination seems to reproduce the pseudotime alignment the authors were expecting to find, but such prior information might not be available in general. Therefore, feature selection and the methods used to project data need more justification, especially if the goal is to create a general tool applicable across different biological systems.

      Analogously, the clustering seems manually adjusted to match known expression patterns of 24 relevant genes, rather than being the result of an optimized clustering method. Additionally, the clusters overlap with different time points, raising concerns about potential batch effects. These issues should be addressed to strengthen the validity of the method.

      The claims about the comparison with existing methods should be toned down. While the comparisons are useful and interesting, they might be biased due to the method's fine-tuning for the specific system studied. The claim that the model requires only scRNA-seq data is misleading, as strong prior biological knowledge was used to select, for example, the genes analyzed.

      Significance

      The manuscript is scientifically sound, clearly written, and deserves publication. The proposed method is quantitative, novel, theoretically grounded, and was tested in detail with appropriate null models and statistical methods. Moreover, IGNITE can be applied to various biological systems as the availability of scRNA-seq datasets is continuously growing. The paper will be of interest to a broad community of computational biologists and biology labs interested in gene regulation using scRNA-seq data.

      The limitation, in my opinion, is the method's (particularly the pre-processing pipeline) fine-tuning for the specific biological system tested. Testing IGNITE on another biological system without pre-selected pre-processing steps or detailed biological priors would be more convincing and make the paper's conclusions much stronger. The comparison with other methods also may be slightly biased due to this fine-tuning.

      My background is in statistical physics, with expertise in biological physics, specifically in mathematical modeling and data analysis in molecular biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Corridori et al. introduce IGNITE, a computational framework that infers gene regulatory networks (GRNs) from scRNA-seq data by leveraging the kinetic Ising model, and which can be used to simulate synthetic gene expression and perform in-silico knockout experiments. Other similar frameworks exist, but none combine these three aspects. The authors have generated a scRNA-seq dataset of murine ESC differentiation, which they use to compare their method with others. Specifically, they show that they can infer known regulatory interactions, that they can generate data similar to the original, and that the method can potentially predict gene expression changes in transcription factor knockout perturbations.

      Major comments:

      • Many of the authors' claims are backed by qualitative results and are not properly quantified. In Fig2, the authors qualitatively compare gene-gene correlations between the original data and their prediction. Instead of only visualizing these, they should compute and report the Spearman correlation between the original expression and the predicted one. The Fraction of Agreement is not a good metric for comparing knockout predictions because it depends entirely on the class imbalance of signs: for example, if the selected genes are 75% positive and 25% negative, a naive predictor that only outputs positive predictions will still achieve a high score. Instead, the authors should quantify this with Spearman correlation or RMSE and compare across methods. In FigS4a-b the authors claim qualitatively that other methods could not predict the expected cell composition; they should quantify this and report the values across methods. When comparing against the ground truth network, the fraction of correctly inferred interactions is technically the same as precision but ignores recall. I suggest the authors compute precision, recall and a combined F1 score to compare the evaluated methods. The authors claim that the method is scalable to a larger number of genes, but no data are provided; they should show how their method compares to others, in terms of memory usage and running time, when using different numbers of cells and genes.
      • The authors need to better describe which tests were performed when talking about significance, which thresholds and which corrections, if any, were employed.
      • To reduce the dimensionality of the scRNA-seq data, the authors use t-SNE and then apply UMAP to the t-SNE output to project the data into a lower-dimensional space. This is fundamentally wrong since distances are not well preserved in t-SNE. Instead, the authors should first employ PCA and then UMAP. Additionally, the authors use UMAP distances in the Slingshot pseudotime calculation. As with t-SNE, UMAP distances have no real meaning and should only be used for visualization purposes. Instead, the authors should provide Slingshot with the PCA embeddings.
      • Dictys (PMID: 37537351) is a known GRN inference method that can also simulate gene expression but is missing from the benchmark; the authors should add it to the method comparison.
      • The current manuscript is not reproducible since it is missing the method's code, the code to reproduce the figures and the generated scRNA-seq data.
      • Authors claim that the method is scalable to a larger number of genes but no data is provided to back this claim. They should show how their method compares to others when using a different number of cells and number of genes.
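
      The two metric concerns raised above (sign agreement under class imbalance, and precision reported without recall) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the function names, the toy sign vectors and the made-up gene labels are illustrative assumptions, not the authors' code or data.

```python
def fraction_of_agreement(true_signs, pred_signs):
    """Share of genes whose predicted sign of change matches the measured sign."""
    matches = sum(t == p for t, p in zip(true_signs, pred_signs))
    return matches / len(true_signs)


def precision_recall_f1(true_edges, predicted_edges):
    """Precision, recall and F1 of a predicted edge set against a reference edge set."""
    true_edges, predicted_edges = set(true_edges), set(predicted_edges)
    tp = len(true_edges & predicted_edges)  # correctly inferred interactions
    precision = tp / len(predicted_edges) if predicted_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy knockout readout: 75% of genes go up (+1), 25% go down (-1).
true_signs = [1] * 15 + [-1] * 5
always_up = [1] * 20  # naive predictor that ignores the data entirely
foa = fraction_of_agreement(true_signs, always_up)

# Toy networks: edges are (regulator, target) pairs with made-up gene labels.
reference = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}
predicted = {("A", "B")}  # a single, correct edge
precision, recall, f1 = precision_recall_f1(reference, predicted)

print(foa, precision, recall, round(f1, 2))
```

      With the 75%/25% split used in the comment above, the always-positive predictor scores 0.75 on the Fraction of Agreement, and a one-edge prediction reaches precision 1.0 with recall 0.25, which is why reporting both quantities (or F1) is more informative.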

      Minor points:

      • In the introduction, authors mention multimodal GRN inference methods but do not provide any references.
      • In Table 1, CellOracle is annotated as not being able to do multiple KO which is wrong. Additionally, the authors mention that IGNITE uses no prior knowledge which is not really true since it requires pseudotime ordering. The authors should add a column to Table 1 whether methods require pseudotime.
      • It is unclear what the dashed arrow of Fig1b means. Moreover, plotting gene expression values on top of UMAPs can be misleading; instead, the authors should plot the gene expression distributions binned by pseudotime.
      • The authors report a p-value of 1.04×10⁻¹⁷¹, which is below the detection limit (see PMID: 30921532). The authors should change it to an interval such as p < 2.2×10⁻¹⁶.
      • To make the CellOracle results easier to interpret and more comparable, the authors should run it at the atlas level instead of at the cell type level, thereby generating only one GRN. This can be achieved by assigning the same cluster label to all cells.
      • Experimental values in FigS3b seem to have been repeated and do not match the previous ones for IGNITE and SCODE.
      • It is unclear what the different circles mean in Fig5b.
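
      The interval reporting suggested above for very small p-values can be sketched as follows. This is an illustration under stated assumptions: the helper name format_p_value is hypothetical, and using double-precision machine epsilon (about 2.2×10⁻¹⁶) as the reporting floor is one common convention rather than a requirement.

```python
import sys


def format_p_value(p, floor=sys.float_info.epsilon):
    """Report p-values below the chosen floor as an interval instead of a point value."""
    if p < floor:
        return f"p < {floor:.1e}"
    return f"p = {p:.2e}"


print(format_p_value(1.04e-171))  # falls below the floor, reported as an interval
print(format_p_value(0.0123))     # reported as a point value
```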

      Significance

      This manuscript is an incremental and methodological work for specialized audiences. Its strengths are that the authors employ kinetic Ising model for GRN inference and that they provide a single framework capable of inferring, simulating and perturbing gene expression. The main limitations are that the claims should be better quantified and that the code and data need to be made accessible.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Corridori and colleagues propose IGNITE, a novel method to recover Gene Regulatory Networks (GRNs) from single-cell RNA-sequencing (scRNA-seq) data. Their method solves the inverse Ising problem, generating a cohort of candidate GRNs and optimising them to minimise the difference from the input expression matrix. The authors report that IGNITE is able to predict wild-type data and simulate both single and multiple gene knockouts. The authors benchmark this method on an in-house dataset of differentiating pluripotent stem cells (PSC). They focus on a small set of genes known to be involved in PSC differentiation into formative cells. The authors benchmark IGNITE against state-of-the-art tools (SCODE, MaxEnt and CELLORACLE). They evaluate IGNITE's ability to predict wild-type gene expression by comparing their data with experimental data and with SCODE. They conclude that the tool has generative capacity comparable with SCODE. They also evaluate IGNITE's ability to recover known interactions relative to other tools, without finding that it significantly outperforms them.

      Major comments

      • Are the key conclusions convincing?

      The conclusions appear convincing, although model generalizability could be shown more thoroughly. For instance, analysing another publicly available dataset could help demonstrate the effects of hyperparameters on GRN predictions and their robustness across different experiments.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Claims are well supported by data.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I think the work would benefit from an additional benchmark on a different cellular system. This experiment would show how hyperparameters generalise across datasets and would give potential users insights into how to tune them.

      Also, how does the model scale with the number of genes? A benchmark on computation time and resources required to infer GRN of growing size would be valuable in the adoption of this tool.

      In addition, I think the GRN comparison benchmark presented in section 3.4 would benefit from a quantitative discussion. The authors show inferred GRNs in Figures 4 and S5. For instance, measuring matrix similarity (when appropriate) would help in understanding how the predicted GRNs compare. I understand that the authors attempt to do so by focusing on validated interactions and computing the fraction of correctly inferred interactions (FCI), but I think a measurement of the overall similarity (e.g. Pearson correlation) would add to this.
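
      A minimal Python sketch of the overall-similarity measurement suggested above, assuming each inferred GRN is available as a square interaction matrix over the same ordered gene list. The helper names and toy matrices are hypothetical, and excluding the diagonal (self-interactions) is an illustrative choice, not something taken from the manuscript.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (inputs must not be constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def off_diagonal(matrix):
    """Flatten a square matrix row by row, skipping self-interactions."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def grn_similarity(grn_a, grn_b):
    """Pearson correlation between the off-diagonal entries of two interaction matrices."""
    return pearson(off_diagonal(grn_a), off_diagonal(grn_b))


# Toy 3-gene networks: grn_b is a rescaled copy of grn_a, grn_c flips every sign.
grn_a = [[0.0, 0.8, -0.5], [0.1, 0.0, 0.3], [-0.2, 0.6, 0.0]]
grn_b = [[2 * w for w in row] for row in grn_a]
grn_c = [[-w for w in row] for row in grn_a]

print(grn_similarity(grn_a, grn_b), grn_similarity(grn_a, grn_c))
```

      Because Pearson correlation is insensitive to a global rescaling of the weights, it compares the pattern of interactions rather than their absolute magnitude, which may be convenient when methods report weights on different scales.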

      Another comment regards the dependency between Correlation Matrices Distance (CMD) and FCI, shown in Figure 5. I understand that the IGNITE GRNs that maximise FCI are not the same as those that minimise CMD. However, it looks like the GRNs that maximise FCI carry more biological information. I wonder whether optimization for one or the other metric could be left to the end user as a tunable parameter.

      Authors should discuss why the expression of some genes does not follow the expected trends (Fig 1C vs Fig S1A). Out of the 24 genes they select for their analysis, at least four do not follow the expected trends: Sox2, according to literature, is a Naive gene, however, in Figure 1C its gene expression pattern is more similar to Formative late genes. Other genes with similar "unexpected" patterns are Zic3, Etv4 and Sall4.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      I think the suggested experiments are doable as long as the authors use publicly available data; the in-house dataset they generated for this study is enough to show applicability. For example, the datasets analysed in the SCODE paper (https://doi.org/10.1093/bioinformatics/btx194) could be used as a second benchmark. The point of applying the tool to another dataset is to show how it generalises across different biological systems, experiments and, potentially, sequencing technologies.

      • Are the data and the methods presented in such a way that they can be reproduced?

      The methods section is really clear. To enable reproducibility, the raw scRNA-seq data, the IGNITE source code and the code written to benchmark it should all be released in the public domain in appropriate repositories (e.g. ENA, GitHub, Binder).

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes.

      Minor comments

      • Specific experimental issues that are easily addressable.

      Related to the Sox2 expression pattern is the binarization shown in Figure 2D. How is it possible that Sox2 is always marked as active? Could the authors clarify how these outlier behaviours emerge and propose mitigation strategies, if any?

      In section 5.11.2 it is unclear whether xi are in log scale or not. Since the model starts from binarized, log-transformed expression values, should the generated values not be on the same scale as the input?

      • Are prior studies referenced appropriately?

      Yes, referencing is clear.

      • Are the text and figures clear and accurate?

      Yes, figures appear to be clear, readable and well documented both in captions and main text.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Section 3.3 could be improved by better describing the experimental datasets. Only in the methods section is it clearly stated that the experimental data for single KO experiments were retrieved from the literature.

      Check typesetting:

      • Parenthesis missing in Eq. 1
      • Leftover $ in section 3.1
      • Parenthesis missing in section 3.3
      • Misplaced comma in section 5.2.1

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper presents a method to infer GRNs from scRNA-seq data alone. Applications include GRN prediction and the prediction of GRN perturbations. This paper represents a technical advance in the field as it is the first application of the inverse Ising problem to GRN inference.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      The paper itself presents the landscape of GRN inference tools using scRNA-seq data: SCODE, MaxEnt and CELLORACLE. More tools exist; for instance, SCENIC (https://doi.org/10.1038/nmeth.4463) mainly relies on co-expression matrices. Other tools exist but require additional data types, e.g. GRaNIE and GRaNPA (https://doi.org/10.15252/msb.202311627) leverage physical interaction data (ATAC-seq, ChIP-seq). Similarly, DeepFlyBrain uses deep neural networks to infer eGRN in Drosophila (https://doi.org/10.1038/s41586-021-04262-z). The value of tools like IGNITE and its competitors is that they do not require additional data types, which, in turn, helps in controlling experimental costs.

      • State what audience might be interested in and influenced by the reported findings.

      The paper might be of interest to biologists interested in regulation of gene expression. The tool might turn out to be useful in planning experimental work by guiding the choice of perturbations to introduce in experimental systems.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am a computational biologist.

      I do not have sufficient expertise to evaluate the mathematical details of the method.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We appreciated the positive, detailed and helpful feedback from all three reviewers.

      Reviewer 1.

      Minor comments.

      1. In the introduction, on page 2, the authors seem a little confused about the Plk1 Polo-box domain - text as written: "...kinase domain linked to tandem Polo-box domains (PBD)", and cite a review paper. Actually, there is only a single Polo-box domain in these kinases, which contains both Polo-boxes and a bit of the upstream linker region. The "PBD" terminology denotes this 2-Polo-box + linker structure. Perhaps it would be better to cite the PBD structure (Elia et al., Cell, 2002) as a primary citation here.

      Response: Thank you for finding this error; the text has been updated and the new citation included on line 65.

      1. Similarly, the line "...during the G2/M transition following successful DNA damage repair" cites the Seki et al paper, but those findings are shown in the Macurek et al paper, not the Seki et al paper.

      Response: Thank you for finding this error; the new citation is included in the text on line 69.

      1. Using the model of the ternary complex as shown in Figure 1B, deletion constructs of Bora missing regions within the disordered loops, but still retaining the residues that bind the PBD, FW pocket and Aurora A, can be modeled and tested to see if such deletions can improve the ipTM scores and binding affinity.

      Response: AlphaFold3 modelling was attempted with shorter regions of Bora to see the effect on the ipTM scores. Unfortunately, when Bora was reduced to shorter sequences, such as 18-88 or 18-45 modelled with 68-120, the models became inconsistent and of low quality. Models were also created including the short region of Bora surrounding Ser252 that interacts with the polo box domain as well as Bora 18-120, but this had minimal effect on the calculated ipTM scores.

      1. On page 5, "S112A" within the sentence "Unexpectedly, the F56A/W58A Bora was less efficiently phosphorylated on S112A (Supplementary Figure S11, F compared to H and Supplementary Table S4)." This should be "S112".

      Response: Thank you for spotting this, the error has been corrected.

      1. In the assays shown in Figure 2D, the presence of excess F56AW58A Bora that remained unphosphorylated on S112 may complicate the interpretation of the results. Can the authors show that the S112-phosphorylated F56AW58A Bora is predominantly bound to Aurora A in such a mixture, perhaps by NMR using labelled pS112 F56AW58A Bora and unlabeled S112 F56AW58A Bora?

      Response: 15N,13C-labelled Bora 18-120 F56A W58A was produced and assigned. We then phosphorylated a sample using ERK2, tracking the reaction with NMR, and when it had progressed to a 50:50 mixture of pSer112 and Ser112 (based on peak intensities) the kinase activity was quenched by addition of EDTA to sequester Mg2+. This produced a solution containing both pS112 and unphosphorylated S112 Bora species, with marker peaks in HSQC spectra that could be used to directly compare Aurora-A binding to the two species. Aurora-A was introduced to the sample and the peak intensities were monitored. Although both species are affected, there is much greater peak loss from the pS112-related peaks than from those for unphosphorylated S112. This indicates that Aurora-A still preferentially binds pS112 Bora over S112 Bora when the F56A W58A mutation is present. These data have been included in Supplementary Figure S11.

      1. Please expand Figure 3A to better show the FW pocket-forming residues on Plk1.

      Response: Figure 3 has been amended to reduce the size of the sequence alignments so that 3A could be made slightly larger.

      1. It would be helpful to label the peaks in the mass spectra in Fig. S11 with the phospho-species that they correspond to.

      Response: This information has been added to the mass spectra in Fig. S11 (now Supplementary Figure S14) to make them easier to view.

      1. In the last paragraph on page 7, "see we" in the sentence "As well as a decrease in intensity around pSer112 in Bora, see we an overall effect with decreased intensity across most of the Bora sequence." Should be corrected to "we see".

      Response: Thank you for spotting this, the error has been corrected.

      1. While not required, it would be helpful if binding of Bora to Aurora A after Erk2 phosphorylation could be shown using fluorescence polarization or ITC to lend additional support to the NMR data for S112 and S59 phosphorylation and for CEP192 and TPX2 competition.

      Response: This question has been partially answered in previous work by Tavernier et al. (2021), who showed improved binding of Aurora-A to Bora after Erk phosphorylation (by SPR), and who used labelled TPX2 for a series of competition FP assays in that study and in the recent parallel study (Pillan et al. 2025).

      We made initial efforts to perform additional FP assays using longer sections of Bora with different phosphorylation states but without success (perhaps due to the multisite-binding nature of the Bora–Aurora interaction, and difficulties with directly expressing phosphorylated Bora). The revised manuscript now includes some additional NMR data to show improved Bora–Aurora-A interaction after phosphorylation at Ser59 (Supplementary Figure S12).

      1. The Aurora A phosphorylation motif has been further defined beyond that reported by the Pinna lab in 2005. Notably, the Ser-59 sequence on Bora (F-R-W-S-I), has, in addition to dominant selection for AR in the -2 position, both favorable -1 (W) and +1 (I) positions based on peptide library measurements (Alexander et al., Science Signaling 2011), further arguing that it may be an excellent Aurora A phosphorylation site.

      Response: Thank you for highlighting this publication and how it further reinforces the likelihood of Ser59 being an effective substrate for Aurora-A; this should have been included in the original manuscript. The citation has now been included.

      1. Have the authors tried to model the Drosophila melanogaster Aurora A-Bora-Polo complex to see if the Asn substitution of Bora Ser59, and the expected loss of the interactions between Bora pSer59 and Plk1 Arg59 and Aurora A Arg205 are compensated by other features?

      Response: A ternary complex between the Drosophila melanogaster orthologues was modelled using AlphaFold3 (Uniprot code PLK1 (Q9VVR2 72-165), Aurora-A kinase (Q9VGF9) 151-411 and PLK1 (P52304 21-280)). This model was analysed using PDBe PISA to identify potential interactions between the three proteins, focusing on residues that are not conserved between the human and Drosophila sequences. From this model a potential salt bridge was identified between Drosophila Bora Lys120 and PLK1 Glu93 that would not occur in the human ternary complex given Lys120 is replaced with an asparagine. This could be an alternative (kinase-independent) method for improved Bora-PLK1 interaction. When comparing the Bora:Aurora-A side of the predicted interface and focusing on the short region of Bora in between Aurora-A and PLK1, there were no clear differences seen in the residues predicted to bind to Aurora-A. This modelling has been included in Supplementary Figure S10 C and D.

      12. Given the relevance of the recent publication from Zhu et al. to this study, the authors may want to comment on, or test, the relative importance of PKA and Aurora A as a potential kinase for Bora S59. While those authors argue that PKA phosphorylates Bora on Ser-59, one could easily imagine a model in which either PKA or Aurora A could initially phosphorylate that site followed by a propagation step after initial Aurora A activation, in which Aurora A phosphorylation of Bora Ser-59 is the dominant process.

      Response: A brief discussion of this recent publication has been added to the discussion, highlighting the similarities between the two publications and the importance of pSer59, as well as suggesting that in cellulo this modification could be achieved via more than one pathway. We also include some additional NMR data to show improved Bora–Aurora-A interaction after phosphorylation at Ser59 (Supplementary Figure S12).

      Reviewer 2.

      Minor comments.

      Page 5: '... a K82R PLK1 mutant was used to increase the stability of the protein' - It is not clear how this mutation confers increased stability of the protein. The authors do not show any data to support this. Isn't the PLK1 K82R an ATP-binding-deficient, kinase-inactive mutant?

      Response: Thank you for spotting this. The text has been updated to clarify that this kinase-inactive version of PLK1 was used because PLK1 acts as a substrate in the in vitro assay, and we did not want any PLK1 activity within this assay.

      All panels showing the Alphabridge diagram - it would be helpful if pictorial definitions of the colour codes were provided with corresponding score ranges (in addition to the description in the figure legend).

      Response: The AlphaBridge images have been updated to include the pLDDT score range that each colour refers to.

      Fig 2B - The Fluorescence anisotropy assay curves do not reach a plateau. Though the effect of mutation on binding affinity is pretty clear, if possible, I suggest including more data points at higher concentrations and estimating apparent Kd values.

      Response: The direct binding assay was repeated with a higher concentration of PLK1 in order to try to reach a top plateau. This was successful and has been included in Figure 2B (shown in black). The measured Kd was 24 ± 3 µM.

      The cartoon representation of the structures and molecular interfaces - better to avoid shadows, as they compromise the clarity of the figures, particularly the ones where side chains are shown in stick representation.

      Response: The structural images have been remade to remove the shadows and improve the clarity of the images.

      It is important to discuss how the parallel studies by Verza et al. and Pillan et al. complement this study, highlighting similarities and differences.

      Response: References to these two publications and details on the similarities and differences seen are now included in the discussion.

      Reviewer 3.

      Major comments

      It would be helpful to measure the level of pThr210 PLK1 in some experiments and graph the data. The current presentation in Fig. 2D-E is qualitative rather than quantitative.

      Response: Graphs displaying the levels of pThr210 produced in the assay are now shown in Supplementary Figure S4.

      Have the authors measured the binding affinity of the F/W mutant Bora for PLK1 using the assay in Fig. 2B? Likewise, for Fig. 7 the S59 mutant could be tested to see if it affects PLK1 binding or activation.

      Response: The direct binding assay has been repeated using a FAM-Bora peptide that incorporates the F56A W58A mutation, which shows reduced binding (Figure 2B, shown in blue). A version of the Bora peptide phosphorylated on Ser59 was also tested in the direct binding assay, and this shows a similar affinity for PLK1 to the wild-type sequence (Figure 2B, shown in red compared to the wild-type shown in black).

      It would be helpful if measurements of pThr210 PLK1 for all conditions were shown in the graph Fig. 7F.

      Response: This graph has been updated to include the levels of phosphorylation seen for PLK1 in all of the conditions tested.

      Minor comments

      I found Figure S1B easier to understand than Fig S1A and Fig 1A-B. Some of the supplemental data Fig. S1C-E could be moved to a revised Figure 1, dropping the current Fig. 1A-B. Can the interaction plots (Fig. S1C-D) be rotated to have the same origin at the top and the same order of proteins (i.e. Bora > Aurora A > ± PLK1 depending on the plot)?

      Response: Figures 1 and S1 have been rearranged to make them easier to understand, with all AlphaFold3 models of the full-length sequences kept in the supplementary figure and the focus in 1B just on the truncated model. The AlphaBridge plots have been rotated as suggested.

      Figure 3F. Typo "Strongyl" not "Strongly".

      Response: Thank you for spotting this. It has been corrected in the updated manuscript.

      Figure 3 could be supplemental material.

      Response: Thank you for your suggestion, but we have decided to keep this as a main figure.

      Fig. 7E. Run a positive control reaction +ERK2 on the second gel to allow direct comparison of pThr210 across all the conditions tested.

      Response: These samples have been rerun on the same membrane and the levels of phosphorylation have been quantified and included in Figure 7F.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary.

      Miles and co-workers have carried out a careful and high-quality study of the activation mechanisms of the mitotic kinase PLK1. Multiple proteins have been implicated in PLK1 activation and localisation as cells enter and pass through mitosis. Initial activation of PLK1 is promoted by a complex of Bora with another kinase, Aurora A. Later in mitosis, this activated PLK1 associates with mitotic spindle and centrosome proteins regulating different aspects of mitosis and cytokinesis. In this study, Miles et al. extend previous work on this question by proposing and testing detailed models for Bora/Aurora A-mediated activation of PLK1 to elucidate the mechanism of this reaction.

      Using the latest AlphaFold, they generate a series of models of the PLK1/Bora/Aurora A complex to home in on the key regions mediating interactions of the three proteins. This approach suggests an arrangement where the first ~120 amino acids of Bora wrap Aurora A and create an interaction surface for the N-terminal kinase domain of PLK1. This orients Thr210 in PLK1 towards Aurora A, creating a situation likely favourable for phosphorylation, although, as the authors discuss, there are some caveats to this. A further prediction of the modelling helps explain the requirement for Bora phosphorylation to promote the interaction with Aurora A. These data are presented in Fig. 1 and Fig. S1-S3.

      In the subsequent figures, the details of this model are tested using biochemical assays and structural biology methods to validate key predictions. First, the PLK1 interaction with Bora was shown to require the conserved F/W motif of Bora and a conserved pocket close to R106 on PLK1 (Fig. 2 and 3). In reconstituted PLK1 activation assays, the F/W motif mutant Bora showed greatly attenuated pThr210 phosphorylation. This reaction also required phosphorylation of Bora at S112, presumably due to the interaction with Aurora A. An R106A mutant PLK1 showed reduced binding to Bora and reduced kinase activation. These data are clear and provide compelling support for the model.

      Using NMR, the authors then investigate the interaction between Bora and Aurora A, and more specifically the requirement for Bora phosphorylation at Ser112. The NMR data in Fig. 4 and Fig. 6 provide good support for the AlphaFold model. A helpful comparison with known Aurora A binding proteins is also shown to highlight the way CEP192, TPX2 and TACC3 contact a series of conserved pockets on the surface of Aurora A which are common to the Bora interaction. S59 phosphorylation by Aurora A is also shown to play an important role in contacting PLK1 and is required for pThr210 phosphorylation.

      In summary, the authors have made valuable progress in working out details of the PLK1 activation mechanism, extending previous work in the field.

      Major comments.

      It would be helpful to measure the level of pThr210 PLK1 in some experiments and graph the data. The current presentation in Fig. 2D-E is qualitative rather than quantitative.

      Have the authors measured the binding affinity of the F/W mutant Bora for PLK1 using the assay in Fig. 2B? Likewise, for Fig. 7 the S59 mutant could be tested to see if it affects PLK1 binding or activation.

      It would be helpful if measurements of pThr210 PLK1 for all conditions were shown in the graph Fig. 7F.

      Minor comments.

      I found Figure S1B easier to understand than Fig S1A and Fig 1A-B. Some of the supplemental data Fig. S1C-E could be moved to a revised Figure 1, dropping the current Fig. 1A-B. Can the interaction plots (Fig. S1C-D) be rotated to have the same origin at the top and the same order of proteins (i.e. Bora > Aurora A > ± PLK1 depending on the plot)?

      Figure 3F. Typo "Strongyl" not "Strongly".

      Figure 3 could be supplemental material.

      Fig. 7E. Run a positive control reaction +ERK2 on the second gel to allow direct comparison of pThr210 across all the conditions tested.

      Significance

      Timely and orchestrated activation of multiple mitotic protein kinases is crucial for the alignment and segregation of chromosomes, and for the process of cell division. In this study the authors explore how activation of the mitotic kinase PLK1 is triggered by another mitotic kinase Aurora A, and the role played by a scaffold protein Bora.

      Strengths: Detailed analysis of mechanism using biochemical and structural approaches.

      Limitations: The study is focussed on the biochemical and structural mechanisms rather than the cellular outcomes. Some data would benefit from additional quantitative measurement.

      Relevance: Cancer and cell biology due to the role of Aurora A in many cancers.

      Reviewer expertise: Biochemistry, molecular and cell biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      PLK1 is one of the master regulators of cell division. The activation of PLK1 requires phosphorylation of its activation loop at T210, mediated by Aurora A kinase. However, Aurora A phosphorylation of PLK1 T210 requires Bora, one of the several activators of Aurora A kinase. While the molecular requirement of Aurora A kinase and Bora for PLK1 activation is well established, the mechanistic understanding of how Bora facilitates PLK1 activation by Aurora A has remained an important open question for a long time. Exploiting the latest developments in AI-driven structure prediction, three independent studies provide a structural and mechanistic basis for PLK1 activation by Aurora A and Bora. Here, Miles et al. have generated AlphaFold models, further characterised some of the interfaces using NMR, and validated the contribution of intermolecular interactions at suggested interfaces in vitro using recombinant proteins in kinase assays. Overall, this is a well-executed work providing important new insights into our understanding of the activation of the critical regulator of cell division, PLK1. However, as the authors have highlighted in the discussion section, one limitation of this modelling study is that the models still do not entirely explain how these interactions facilitate the phosphorylation of Thr210, as this residue is oriented too far away from Aurora A's active site for the reaction to take place. Despite this limitation, I believe this is an important work that advances our understanding significantly.

      Comments:

      Experimental data satisfactorily support claims. Hence, most of my comments are minor in nature.

      Points to consider during revision:

      Page 5: '... a K82R PLK1 mutant was used to increase the stability of the protein' - It is not clear how this mutation confers increased stability of the protein. The authors do not show any data to support this. Isn't the PLK1 K82R an ATP-binding-deficient, kinase-inactive mutant?

      All panels showing the Alphabridge diagram - it would be helpful if pictorial definitions of the colour codes were provided with corresponding score ranges (in addition to the description in the figure legend).

      Fig 2B - The Fluorescence anisotropy assay curves do not reach a plateau. Though the effect of mutation on binding affinity is pretty clear, if possible, I suggest including more data points at higher concentrations and estimating apparent Kd values.

      The cartoon representation of the structures and molecular interfaces - better to avoid shadows, as they compromise the clarity of the figures, particularly the ones where side chains are shown in stick representation.

      It is important to discuss how the parallel studies by Verza et al. and Pillan et al. complement this study, highlighting similarities and differences.

      Significance

      As highlighted in the summary, a mechanistic understanding of how PLK1 is activated by Aurora A kinase and its activator Bora has remained a long-standing open question. As PLK1 is one of the major regulators of cell division, which exerts its function (by phosphorylating numerous substrates) during different stages of mitosis, understanding its activation mechanism is of critical interest for those working on the cell cycle in general and cell division in particular. A key limitation of this study is the lack of any cellular functional evaluation of the interaction interfaces.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Miles et al. used a combination of AlphaFold modeling, biochemical assays of mutant constructs and NMR spectroscopy to model the ternary complex of Aurora A, Bora and Plk1, and elucidate how Bora can act as a molecular bridge that facilitates the phosphorylation of the activation loop Thr210 within Plk1 by Aurora A. Their studies identified an interaction between residues 52-73 within Bora and the 'FW' pocket on the N-terminal lobe of Plk1, which binds Phe56 and Trp58 of Bora. Additionally, Ser59 of Bora was identified as a good Aurora A substrate using a Bora peptide array, and pSer59 was predicted to form bridging interactions with Aurora A Arg205 and Plk1 Arg59. This was supported by NMR and biochemical assays. In addition, the authors validate that phosphorylation of Ser-112 on Bora enhances stabilization of the Aurora A-Bora complex. Overall, the model revealed novel details of the interactions within the Aurora A-Bora-Plk1 ternary complex that are supported by the biochemical and NMR data. The work will be of significant interest to basic scientists whose work involves protein kinase signaling, cell division/mitosis, signal transduction, and cancer biology. We recommend publication of this manuscript with the following minor changes and additions.

      1. In the introduction, on page 2, the authors seem a little confused about the Plk1 Polo-box domain - text as written: "...kinase domain linked to tandem Polo-box domains (PBD)", and cite a review paper. Actually, there is only a single Polo-box domain in these kinases, which contains both Polo-boxes and a bit of the upstream linker region. The "PBD" terminology denotes this 2-Polo-box + linker structure. Perhaps it would be better to cite the PBD structure (Elia et al., Cell, 2002) as a primary citation here.
      2. Similarly, the line "...during the G2/M transition following successful DNA damage repair" cites the Seki et al paper, but those findings are shown in the Macurek et al paper, not the Seki et al paper.
      3. Using the model of the ternary complex as shown in Figure 1B, deletion constructs of Bora missing regions within the disordered loops, but still retaining the residues that bind the PBD, FW pocket and Aurora A, can be modeled and tested to see if such deletions can improve the ipTM scores and binding affinity.
      4. On page 5, "S112A" within the sentence "Unexpectedly, the F56A/W58A Bora was less efficiently phosphorylated on S112A (Supplementary Figure S11, F compared to H and Supplementary Table S4)." This should be "S112".
      5. In the assays shown in Figure 2D, the presence of excess F56AW58A Bora that remained unphosphorylated on S112 may complicate the interpretation of the results. Can the authors show that the S112-phosphorylated F56AW58A Bora is predominantly bound to Aurora A in such a mixture, perhaps by NMR using labelled pS112 F56AW58A Bora and unlabeled S112 F56AW58A Bora?
      6. Please expand Figure 3A to better show the FW pocket-forming residues on Plk1.
      7. It would be helpful to label the peaks in the mass spectra in Fig. S11 with the phospho-species that they correspond to.
      8. In the last paragraph on page 7, "see we" in the sentence "As well as a decrease in intensity around pSer112 in Bora, see we an overall effect with decreased intensity across most of the Bora sequence." should be corrected to "we see".
      9. While not required, it would be helpful if binding of Bora to Aurora A after Erk2 phosphorylation could be shown using fluorescence polarization or ITC to lend additional support to the NMR data for S112 and S59 phosphorylation and for CEP192 and TPX2 competition.
      10. The Aurora A phosphorylation motif has been further defined beyond that reported by the Pinna lab in 2005. Notably, the Ser-59 sequence on Bora (F-R-W-S-I) has, in addition to dominant selection for R in the -2 position, favorable -1 (W) and +1 (I) positions based on peptide library measurements (Alexander et al., Science Signaling 2011), further arguing that it may be an excellent Aurora A phosphorylation site.
      11. Have the authors tried to model the Drosophila melanogaster Aurora A-Bora-Polo complex to see if the Asn substitution of Bora Ser59, and the expected loss of the interactions between Bora pSer59 and Plk1 Arg59 and Aurora A Arg205 are compensated by other features?
      12. Given the relevance of the recent publication from Zhu et al. in https://doi.org/10.1038/s41467-025-63352-y to this study, the authors may want to comment on, or test, the relative importance of PKA and Aurora A as a potential kinase for Bora S59. While those authors argue that PKA phosphorylates Bora on Ser-59, one could easily imagine a model in which either PKA or Aurora A could initially phosphorylate that site followed by a propagation step after initial Aurora A activation, in which Aurora A phosphorylation of Bora Ser-59 is the dominant process.

      -Dan Lim and Michael Yaffe

      Significance

      The work is well done and clearly presented.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Major comments:

      (comment #1)- It is interesting that TRF2 loss not only fails to increase γH2AX/53BP1 levels but may even slightly reduce them (e.g., Fig. S2c and the IF images). While the main hypothesis is that TRF2 loss does not trigger telomere dysfunction in NSCs, this observation raises the possibility that TRF2 itself contributes to DDR signaling (ATM-P, γH2AX, 53BP1) in these cells and that in its absence, cells are not able to form those foci. To exclude the possibility that telomere-specific DDR is being missed due to an overall dampened DDR response in the absence of TRF2, it would be informative to induce exogenous DSBs in TRF2-depleted cells and test DDR competence (e.g., IF for γH2AX/53BP1). In other words, are those NSC lacking TRF2 even able to form H2AX/53BP1 foci when damaged? In addition, it would be interesting to perform telomere fusion analysis in TRF2 silenced cells (and TRF1 silenced cells as a positive control).

      We acknowledge a slight reduction; however, this difference is not statistically significant (Fig S2c,e). We will quantify the levels of DDR markers upon TRF2 loss and exogenous DSBs and include it in the subsequent revision.

      (comment #2)- A TRF2 ChIP-seq should be performed in NSC, as this list of genes (named TAN genes in the text) was determined using a ChIP performed in another cell line (HT1080). For the ChIP-qPCR in the various conditions, primers for negative control regions should be included to show the specific binding of TRF2 to the promoter of the genes associated with neuronal differentiation. For example, an intergenic region and/or promoters of genes that are not associated with neuronal differentiation (or don't contain a potential G4). The same comment holds true for the gene expression analysis: a few genes that are not bound by TRF2 should be included as negative controls to exclude a potential global effect of TRF2 loss on gene expression (ideally an RNA-seq would be performed instead).

      We have performed NSC-specific TRF2 ChIP-seq for an upcoming manuscript, which confirms TRF2 occupancy at multiple promoters of differentiation-associated genes. These data are provided solely for confidential evaluation by the designated reviewers.

      Regarding the ChIP-qPCR control experiments: We thank the reviewer for pointing this out. Our PCR assays did include controls: telomeric loci as positive controls and TRF2-nonbinding loci (GAPDH, RPS18, and ACTB, based on HT1080 TRF2 ChIP-seq data) as negative controls. These results were omitted earlier for clarity, given that we were presenting several ChIP-PCR figures; in response to the comment, we have now included them in the revised version (Fig. S3d,e). Gene expression analyses show selective upregulation of the TAN genes upon TRF2 loss (data normalised to GAPDH), whereas negative control genes lacking TRF2 binding (RPS18, ACTB) remain unchanged, ruling out non-specific effects (Fig S3f,g,j,k).

      -(comment #3) A co-IP should be performed between the TRF2 PTM mutant K176R or WT TRF2 and REST and PRC2 components to directly show a defect of interaction between them when TRF2 is mutated (a co-IP with DNase/RNase treatment to exclude nucleic-acid bridging). The TRF2 PTM mutant T188N also seems to lead to an increased differentiation (Fig. S5a). Could the author repeat the measure of gene expression and co-IP with REST upon the overexpression of this mutant too?

      We confirm that DNase/RNase is routinely included in our pull-down experiments to exclude nucleic-acid bridging, with detailed methodology now elaborated in the Methods section. Omitting this from the manuscript Methods was an oversight on our part. Our data demonstrate that only REST directly interacts with TRF2, while TRF2 engages PRC2 indirectly via REST, as also previously shown by us and others (page 6; ref. [62]; Sharma et al., ref. [15]).

      We thank the reviewer for noting the apparent differentiation in Fig. S5a. However, this observation represents a rare spontaneous differentiation event and is not statistically significant (as shown in Fig S5b). Consistently, gene expression analysis of the TRF2-T188N mutant shows no significant change in TRF2-associated neuronal differentiation (TAN) genes. Therefore, co-IP for TRF2-T188N with REST was not done.

      (comment #4) - The authors show that the G4 ligands SMH14.6 and Bis-indole carboxamide upregulate TAN genes and promote neuronal differentiation, but the underlying mechanism remains unclear. Bis-indole carboxamide is generally considered a G4 stabilizer, while SMH14.6 is less characterized and should be better introduced. The authors should clarify how G4 stabilization would interfere with TRF2 binding; it seems that it would likely be by blocking access. A more detailed discussion, and ideally TRF2 ChIP after ligand treatment and/or G4 helicase treatment, would strengthen the model.

      We clarify that Bis-indole carboxamide acts as a G4 stabilizer, while SMH14.6 is also a noted G4-binding ligand that stabilizes G4s (ref. [15]). The exclusion of TRF2 from G4 motifs in gene promoters by G4-binding ligands has also been documented previously (ref. [18]). In line with these findings, ChIP experiments performed following ligand treatment revealed a decreased occupancy of TRF2 at TAN gene promoters, supporting the proposed mechanism (added Fig. 6h).

      Minor comments:

      • Supp Figures related to the scRNA-seq are difficult to read (blurry).

      Corrected

      • Fig S1h: The red box mentioned in the legend is not visible

      Corrected

      • In the text, the Figures 1 f-g are misannotated as Fig 1m and l

      Corrected

      • The symbol γ of γH2AX is missing in the text

      Corrected

      • Fig.3d, please indicate in the legend that it is done in SH-SY5Y.

      Added SH-SY5Y in the legend of Fig. 3d.

      • Fig. S3b: Please consider replotting this panel with an increased y-axis scale. As currently presented, the TRF2 ChIP-seq peaks at several promoters appear truncated by the scaling.

      Corrected

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      1. For most of the data graphs in the manuscript, there is no indication of the number of independent biological replicates carried out (which should ideally be plotted as individual dots overlaying the column graphs), or what the error bars represent, or what statistical test was used.

      All the figure legends and methods have now been updated with the number of biological replicates per experiment, the meaning of the error bars (SD/SEM), and the statistical test used along with p values.

      Figure S1.1a: needs a marker to show that the tissue is dentate gyrus.

      We acknowledge the reviewer's concern that high-magnification images alone make it difficult to verify whether the fields are taken from the correct anatomical location. The dentate gyrus (DG) of the hippocampus is a well-defined structure. In the revised figure (Fig S1.1a), we now include a low-magnification image showing the entire hippocampus, including the CA fields, along with two high-magnification fields specifically from the DG region. Consistent with our claim, the co-immunostaining demonstrates that Sox2-positive neural stem cells in the DG are also positive for TRF2.

      Figure 1c (and all other flow cytometry panels throughout the manuscript): it is not clear if the expression of any of these proteins, except maybe MAP2, are significantly different in the presence or absence of TRF2. These differences need to be presented more quantitatively, with the results compiled from multiple biological replicates and analysed statistically. I am not sure that flow cytometry is the best way to determine differences in protein expression levels for non-surface proteins, because many of the reported differences are not at all convincing.

      To detect intracellular/nuclear proteins by flow cytometry, cells were permeabilized using pre-chilled 0.2% Triton X-100 for 10 minutes, as described in the Methods section.

      We have revised the figures (Fig 1c,e) and now include statistical analysis from three independent biological replicates for these experiments (Fig S1.4h-j, S2e, S6d).

      Fig 1d: has TRF2 been effectively silenced in this experiment? There appears to be just as many TRF2+ nuclei in the "TRF2 silenced" panel vs the control, including in the cells with neurite outgrowths.

      Quantification of nuclear TRF2 levels, showing a decrease upon silencing, has been included in Supplementary Fig S1g.

      Fig 2a-c: these experiments need a positive control, showing increased expression of these proteins in mNSC and SH-SY5Y cells in response to a DNA damaging agent. Again, flow cytometry may not be the best method for this; immunofluorescence combined with telomere FISH would be more convincing.

      We confirm that doxorubicin induces 53BP1 foci (IF-FISH, Sup Fig. S2b) and that TRF1 silencing elevates γH2AX (Sup Fig. S2c), validating DDR sensitivity. Unlike TRF2 loss (Fig. 2a-c), no TIFs appear with IF and telomere probes (Fig. 2d, Sup Fig. 2a), and without TIFs, there is no telomeric fusion. Flow cytometry was performed with Triton X-100 to target nuclear protein. These findings adequately address the concern; therefore, further IF-FISH experiments were not included in the present study.

      To conclude that telomere damage is not occurring, an independent marker of such damage, such as telomere fusions, should also be measured.

      In response to uncapped telomeres, ATM kinase activates the DNA damage response (DDR), recruiting γH2AX and 53BP1 to telomeres, which precedes end-to-end fusions (Takai et al., 2003; Maciejowski & de Lange, 2015; d'Adda di Fagagna et al., 2003; Cesare & Reddel, 2010; Hayashi et al., 2012; Sarek et al., 2015). We observe no DDR activation or foci (Fig. 2; Sup. Fig. 2). This absence of a DDR response and TIFs indicates no telomere uncapping, negating the need for direct telomere fusion analysis.

      Figure S2b is lacking a no-doxorubicin control.

      An untreated control has been included in Fig. S2b.

      Figures 3a and 3b need a positive control (e.g. TRF2 binding to telomeric DNA) and a negative control (e.g. a promoter that did not show any TRF2 binding in the HT1080 ChIP-seq experiment in Fig S3).

      We have included positive (telomere) and negative (GAPDH) controls (based on HT1080 TRF2 ChIP-seq data) for the TRF2 ChIP assay in Supplementary Fig. S3d,e. Additionally, positive and negative controls for all ChIP experiments conducted in this study are presented in Supplementary Figs. S3d, S3e, S3h, S3i, S4c-h, and S5c-e.

      The data in Figure 3 would be more compelling if all experiments were also performed in fibroblasts to confirm the cell-type specificity of the effect.

      Our HT1080 fibrosarcoma ChIP-seq data (ref. [18]; Sup. Fig. 3a,b) show TRF2 binding to TAN gene promoters in a fibroblast-derived model, with enrichment in neurogenesis-related genes (refs. [19,20]). In fibroblasts, TRF2 depletion, as expected, induces telomere dysfunction and DDR (Fig. 2d; Sup. Fig. 2a), and eventually cell-cycle arrest and cell death, as also reported earlier (van Steensel et al., 1998; Smogorzewska & de Lange, 2002). Therefore, the suggested experiments, which would require sustained TRF2 depletion, are not possible to perform in fibroblasts. TRF2 occupancy on the promoters of the genes in question in cells other than NSC was noted in HT1080 cells (ref. [18]; Sup. Fig. 3a,b).

      No references are provided for the TRF2 posttranslational modifications on R17, K176, K190 and T188. What is the evidence for these modifications, and is it known if they participate in the telomeric role of TRF2?

      These lines with references have been included in the manuscript (highlighted in blue).

      R17 methylation enhances telomere stability (66). K176/K190 acetylation stabilizes telomeres and is removed by SIRT6 (67). T188 phosphorylation facilitates telomere repair after DSBs (68). These PTMs primarily support telomeric roles.

      The experiments in Fig 5 should also be performed with WT TRF2, to confirm that effects are not due to the overexpression of TRF2.

      WT TRF2 shows no differentiation phenotype and no change in TAN gene expression (Fig. 1f,g; 3h, Sup Fig. 5a), confirming that the effects are not due to TRF2 overexpression.

      Fig 5c has not been described in the text, and there are multiple technical problems with the TRF2 WT experiment: i) There appears to be significant background binding of REST to the IgG beads, though this blot has such high background it is hard to tell (the REST blot in Fig S4b is also of poor quality), ii) TRF2 is migrating at two different positions in the Input and IP lanes, and the TRF2 band in the K176R blot is at a different position to either, and iii) the relative loading of the Input and IP lanes is not indicated, so it's not clear why K176R appears to be so enriched in the IP.

      We acknowledge the oversight in not citing Fig 5c in the manuscript. This has been corrected and highlighted in blue in the revised manuscript.

      i) Multiple optimization attempts were made for the Co-IP experiments, and the presented figure reflects the best achievable result despite REST blot smearing, a pattern also reported previously (Ref. 65). The TRF2-REST interaction is well established, and a similar background was also observed in the cited study.

      ii) Variable migration patterns of TRF2 were also noted in the cited study (Ref. 65), consistent with our observations. Our primary emphasis, however, is on the TRF2 K176R mutant, which clearly disrupts its interaction with REST.

      iii) The input loading corresponds to 10% of the total lysate. As the experiments were conducted independently, variations in transfection and pull-down efficiencies may account for the observed differences.

      To rule out indirect effects of the G4 ligands on the results in Fig 6g, the binding of BG4 and TRF2 at the promoters of these genes should be measured by ChIP.

      To confirm that G4 ligand effects on TAN gene promoters are direct, TRF2 occupancy was assessed using ChIP. Significantly decreased occupancy of TRF2 was noted at TAN gene promoters (added Fig. 6h). This implies that ligand-induced changes in TRF2 binding are directly linked to promoter-level G4 stabilization.

      Minor comments:

      1. The size of all the size markers in western blots should be added to the figures.

      Size markers have been included in all the western blots.

      2. There are several figure panels that are incorrectly referenced in the text, e.g. Fig S1.1 (e-f) should be Fig S1.1 (e-h); Fig. 1m should be Fig. 1f; Figs 5e and 5f have been swapped.

      Corrected.

      1. Fig S1.4 is not referred to in the text. It is not clear what the purpose of Fig S1.4a is.

      The following line has been included in the manuscript highlighted in blue.

      Neurospheres were characterized using PAX6, a NSC marker (Fig S1.4a).

      Are the experiments in Figs 3e, 4a, 4c and 4e using 4-OHT treatment, or siRNA? If the latter, I don't think a control for the effectiveness of the knockdown in this cell type has been included anywhere in the manuscript.

      These experiments use siRNA; a western blot showing the effectiveness of the knockdown is presented in Supplementary Figure S4c (now S4a).

      The lanes of the western blots in Fig S4c are not labelled.

      Corrected.

      1. Given that the experiments in Fig 5 were carried out on a background of endogenous WT TRF2 expression, presumably the K176R mutant is having a dominant negative effect. To understand the mechanism of this effect (e.g., is it simply due to replacement of endogenous WT TRF2 at its genomic binding sites by a large excess of exogenous K176R, or is dimerisation with WT TRF2 needed?) it would be helpful to know the relative expression levels of endogenous and K176R TRF2.

      To address the query, qRT-PCR with 3′ UTR-specific primers showed no change in endogenous TRF2 mRNA upon K176R expression in SH-SY5Y cells, while primers detecting total TRF2 revealed ~10-fold higher expression of K176R compared to control (Figure below). This indicates the absence of suppression of endogenous TRF2 mRNA. Given that the mutant's DNA binding is intact (Fig. 5f), the dominant-negative effect of K176R likely arises from overexpression of the exogenous mutant.

      For the sentence "...and critical for transcription factor binding including epigenetic functions that are G4 dependent" (bottom of page 3 of the PDF), the authors cite only their own prior papers, but there are examples from others that could be cited.

      We have incorporated citations from other research groups, now included as references 23-26.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript examines the effects of depletion of the telomeric protein TRF2 in mouse neural stem cells, using mice carrying a floxed allele of TRF2 and inducible Cre recombinase under the control of the stem cell-specific Nestin promoter. The results are also backed up in a human neuroblastoma cell line that has progenitor-like properties. There is no apparent induction of telomere damage in either of these cell types, but there is an increase in expression of neurogenesis genes. This is accompanied by an increase in binding of TRF2 to the relevant promoters, and evidence is provided that this binding involves G-quadruplexes in the promoters.

      On the whole, these core findings of this study are interesting, and reasonably robust. However, the study as a whole is marred by a large number of technical issues and missing controls which should be addressed prior to publication:

      1. For most of the data graphs in the manuscript, there is no indication of the number of independent biological replicates carried out (which should ideally be plotted as individual dots overlaying the column graphs), or what the error bars represent, or what statistical test was used.
      2. Figure S1.1a: needs a marker to show that the tissue is dentate gyrus.
      3. Figure 1c (and all other flow cytometry panels throughout the manuscript): it is not clear if the expression of any of these proteins, except maybe MAP2, are significantly different in the presence or absence of TRF2. These differences need to be presented more quantitatively, with the results compiled from multiple biological replicates and analysed statistically. I am not sure that flow cytometry is the best way to determine differences in protein expression levels for non-surface proteins, because many of the reported differences are not at all convincing.
      4. Fig 1d: has TRF2 been effectively silenced in this experiment? There appears to be just as many TRF2+ nuclei in the "TRF2 silenced" panel vs the control, including in the cells with neurite outgrowths.
      5. Fig 2a-c: these experiments need a positive control, showing increased expression of these proteins in mNSC and SH-SY5Y cells in response to a DNA damaging agent. Again, flow cytometry may not be the best method for this; immunofluorescence combined with telomere FISH would be more convincing.
      6. To conclude that telomere damage is not occurring, an independent marker of such damage, such as telomere fusions, should also be measured.
      7. Figure S2b is lacking a no-doxorubicin control.
      8. Figures 3a and 3b need a positive control (e.g. TRF2 binding to telomeric DNA) and a negative control (e.g. a promoter that did not show any TRF2 binding in the HT1080 ChIP-seq experiment in Fig S3).
      9. The data in Figure 3 would be more compelling if all experiments were also performed in fibroblasts to confirm the cell-type specificity of the effect.
      10. No references are provided for the TRF2 posttranslational modifications on R17, K176, K190 and T188. What is the evidence for these modifications, and is it known if they participate in the telomeric role of TRF2?
      11. The experiments in Fig 5 should also be performed with WT TRF2, to confirm that effects are not due to the overexpression of TRF2.
      12. Fig 5c has not been described in the text, and there are multiple technical problems with the TRF2 WT experiment: i) There appears to be significant background binding of REST to the IgG beads, though this blot has such high background it is hard to tell (the REST blot in Fig S4b is also of poor quality), ii) TRF2 is migrating at two different positions in the Input and IP lanes, and the TRF2 band in the K176R blot is at a different position to either, and iii) the relative loading of the Input and IP lanes is not indicated, so it's not clear why K176R appears to be so enriched in the IP.
      13. To rule out indirect effects of the G4 ligands on the results in Fig 6g, the binding of BG4 and TRF2 at the promoters of these genes should be measured by ChIP.

      Minor comments:

      1. The size of all the size markers in western blots should be added to the figures.
      2. There are several figure panels that are incorrectly referenced in the text, e.g. Fig S1.1 (e-f) should be Fig S1.1 (e-h); Fig. 1m should be Fig. 1f; Figs 5e and 5f have been swapped.
      3. Fig S1.4 is not referred to in the text. It is not clear what the purpose of Fig S1.4a is.
      4. Are the experiments in Figs 3e, 4a, 4c and 4e using 4-OHT treatment, or siRNA? If the latter, I don't think a control for the effectiveness of the knockdown in this cell type has been included anywhere in the manuscript.
      5. The lanes of the western blots in Fig S4c are not labelled.
      6. Given that the experiments in Fig 5 were carried out on a background of endogenous WT TRF2 expression, presumably the K176R mutant is having a dominant negative effect. To understand the mechanism of this effect (e.g., is it simply due to replacement of endogenous WT TRF2 at its genomic binding sites by a large excess of exogenous K176R, or is dimerisation with WT TRF2 needed?) it would be helpful to know the relative expression levels of endogenous and K176R TRF2.
      7. For the sentence "...and critical for transcription factor binding including epigenetic functions that are G4 dependent" (bottom of page 3 of the PDF), the authors cite only their own prior papers, but there are examples from others that could be cited.

      Significance

      The protein TRF2 was first identified as one of the core proteins that bind to the double-stranded region of telomeric DNA, and its many-faceted role in telomere protection has been well studied over the last 3 decades. More recent data from several labs indicate that TRF2 has additional roles outside the telomere, including in regulating gene expression, but these roles are so far much less characterised. Also, it has recently been shown that mouse ES cells, unexpectedly, do not require TRF2 for telomere protection (references 3 and 4 in this paper).

      The findings of the current study expand the types of stem cells in which TRF2 is likely to be playing more of a role elsewhere in the genome, and not at telomeres, and hence are likely to be of high interest both to researchers of telomere biology and to those interested in the regulation of stem cell biology and neurogenesis.

      The strengths of the study are its novelty, its use of an inducible system to knock out TRF2 in the mouse neural stem cells of interest, and a thorough analysis of changes in gene expression and promoter occupancy across a range of genes of relevance to neurogenesis. The major weakness of the study, as described above, is the large number of technical problems, missing controls and missing indications of biological reproducibility.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this study, the authors show that TRF2 binds non-telomeric G-quadruplexes in promoters of a set of genes ("TAN" genes for TRF2-associated neuronal differentiation) and recruits REST/chromatin remodelers to repress those genes in neural stem cells, thereby maintaining the NSC state in a telomere-independent manner. They show that the loss of TRF2 derepresses TAN genes and promotes neuronal differentiation.

      However, key experiments are missing to fully support the claims: a genome-wide TRF2 ChIP-seq in NSC to validate binding beyond a restricted set of TAN genes, more robust evidence confirming the absence of telomeric dysfunction, and mechanistic clarification of the effects of G4 ligands on TRF2 binding.

      Major comments:

      • It is interesting that TRF2 loss not only fails to increase γH2AX/53BP1 levels but may even slightly reduce them (e.g., Fig. S2c and the IF images). While the main hypothesis is that TRF2 loss does not trigger telomere dysfunction in NSCs, this observation raises the possibility that TRF2 itself contributes to DDR signaling (ATM-P, γH2AX, 53BP1) in these cells and that in its absence, cells are not able to form those foci. To exclude the possibility that telomere-specific DDR is being missed due to an overall dampened DDR response in the absence of TRF2, it would be informative to induce exogenous DSBs in TRF2-depleted cells and test DDR competence (e.g., IF for γH2AX/53BP1). In other words, are those NSC lacking TRF2 even able to form H2AX/53BP1 foci when damaged? In addition, it would be interesting to perform telomere fusion analysis in TRF2 silenced cells (and TRF1 silenced cells as a positive control).
      • A TRF2 ChIP-seq should be performed in NSC as this list of genes (named TAN genes in the text) was determined using a ChIP performed in another cell line (HT1080). For the ChIP-qPCR in the various conditions, primers for negative control regions should be included to show the specific binding of TRF2 to the promoter of the genes associated with neuronal differentiation. For example, an intergenic region and/or promoters of genes that are not associated with neuronal differentiation (or don't contain a potential G4). The same comment holds true for the gene expression analysis: a few genes that are not bound by TRF2 should be included as negative controls to exclude a potential global effect of TRF2 loss on gene expression (ideally an RNA-seq would be performed instead).
      • A co-IP should be performed between the TRF2 PTM mutant K176R or WT TRF2 and REST and PRC2 components to directly show a defect of interaction between them when TRF2 is mutated (a co-IP with DNase/RNase treatment to exclude nucleic-acid bridging). The TRF2 PTM mutant T188N also seems to lead to an increased differentiation (Fig. S5a). Could the authors repeat the measurement of gene expression and the co-IP with REST upon overexpression of this mutant too?
      • The authors show that the G4 ligands SMH14.6 and Bis-indole carboxamide upregulate TAN genes and promote neuronal differentiation, but the underlying mechanism remains unclear. Bis-indole carboxamide is generally considered a G4 stabilizer, while SMH14.6 is less characterized and should be better introduced. The authors should clarify how G4 stabilization would interfere with TRF2 binding; it seems that it would likely be by blocking access. A more detailed discussion, and ideally TRF2 ChIP after ligand treatment and/or G4 helicase treatment, would strengthen the model.

      Minor comments:

      • Supp Figures related to the scRNA-seq are difficult to read (blurry).
      • Fig S1h: The red box mentioned in the legend is not visible.
      • In the text, the Figures 1 f-g are misannotated as Fig 1m and l
      • The symbol γ of H2AX is missing in the text
      • Fig.3d, please indicate in the legend that it is done in SH-SY5Y.
      • Fig. S3b: Please consider replotting this panel with an increased y-axis scale. As currently presented, the TRF2 ChIP-seq peaks at several promoters appear truncated by the scaling.
      • Fig S4b: the legends should be fixed, the figure shows TRF2 occupancy upon REST silencing and not the other way around.

      Significance

      Non-telomeric roles of TRF2 have been reported before: in repressing neuronal genes and promoting a stem-like state by stabilizing REST (PMID: 18818083), in promoter G4 binding and recruitment of chromatin repressors (previous studies from the same lab), and TRF2 was shown to be dispensable for telomere protection in pluripotent stem cells (ES). The novelty of the current study lies primarily in extending/combining these mechanisms to NSCs.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their thoughtful and constructive feedback, which helped us strengthen the study on both the computational and biological side. In response, we added substantial new analyses and results in a total of 26 new supplementary figures and a new supplementary note. Importantly, we demonstrated that our approach generalizes beyond tissue outcomes by predicting final-timepoint morphology clusters from early frames with good accuracy as new Figure 4C. Furthermore, we completely restructured and expanded the human expert panel: six experts now provided >30,000 annotations across evenly spaced time intervals, allowing us to benchmark human predictions against CNNs and classical models under comparable conditions. We verified that morphometric trajectories are robust: PCA-based reductions and nearest-neighbor checks confirmed that patterns seen in t-SNE/UMAP are genuine, not projection artifacts. To test whether z-stacks are required, we re-did all analyses with sum- and maximum-intensity projections across five slices; results were unchanged, showing that single-slice imaging is sufficient. From a bioinformatics perspective, we performed negative-label baselines, downsampling analyses to quantify dataset needs, and statistical tests confirming CNNs significantly outperform classical models. Biologically, we clarified that each well contains one organoid, further introduced the Latent Determination Horizon concept tied to expert visibility thresholds, and discussed limits in cross-experiment transfer alongside strategies for domain adaptation and adaptive interventions. Finally, we clarified methods, corrected terminology and a scaler leak, and made all code and raw data publicly available.

      Together, these revisions in our opinion provide an even clearer, more reproducible, and stronger case for the utility of predictive modeling in retinal organoid development.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study presents predictive modeling for developmental outcome in retinal organoids based on high-content imaging. Specifically, it compares the predictive performance of an ensemble of deep learning models with classical machine learning based on morphometric image features and predictions from human experts for four different tasks: prediction of RPE presence and lens presence (at the end of development) as well as the respective sizes. It finds that the DL model outperforms the other approaches and is predictive from early timepoints on, strongly indicating a time-frame for important decision steps in the developmental trajectory.

      Response: We thank the reviewer for the constructive and thoughtful feedback. In response to the review as found below, we have made substantial revisions and additions to the manuscript. Specifically, we clarified key aspects of the experimental setup, changed terminology regarding training/validation/test sets, and restructured our human expert baseline analysis by collecting and integrating a substantially larger dataset of expert annotations according to suggestion. We introduced the Latent Determination Horizon concept with clearer rationale and grounding. Most importantly, we significantly expanded our interpretability analyses across three CNN architectures and eight attribution methods, providing comprehensive quantitative evaluations and supplementary figures that extend beyond the initial DenseNet121 examples (new Supplementary Figures S29-S37). We also ensured full reproducibility by making both code and raw data publicly available with documentation. While certain advanced interpretability methods (e.g., Discover) could not be integrated despite considerable effort, we believe the revised manuscript presents a robust, well-documented, and carefully qualified analysis of CNN predictions in retinal organoid development.

      Major comments: I find the paper overall well written and easy to understand. The findings are relevant (see significance statement for details) and well supported. However, I have some remarks on the description and details of the experimental set-up, the data availability and reproducibility / re-usability of the data.

      1. Some details about the experimental set-up are unclear to me. In particular, it seems like there is a single organoid per well, as the manuscript does not mention any need for instance segmentation or tracking to distinguish organoids in the images and associate them over time. Is that correct? If yes, it should be explicitly stated so. Are there any specific steps in the organoid preparation necessary to avoid multiple organoids per well? Having multiple organoids per well would require the aforementioned image analysis steps (instance segmentation and tracking) and potentially add significant complexity to the analysis procedure, so this information is important to estimate the effort for setting up a similar approach in other organoid cultures (for example cancer organoids, where multiple organoids per well are common / may not be preventable in certain experimental settings).

      Response: We thank the reviewer for this question. We agree that these preprocessing steps would add more complexity to our presented preprocessing steps and would definitely be required in some organoid systems. In our experimental setup, there is only one organoid per well which forms spontaneously after cell seeding from (almost) all seeded cells. There are no additional steps necessary in order to ensure this behaviour in our setup. We amended the Methods section to now explicitly state this accordingly (paragraph ‘Organoid timelapse imaging’).

      The terminology used with respect to the test and validation set is contrary to the field, and reporting the results on the test set (should be called validation set), should be avoided since it is used to select models. In more detail: the terms "test set" and "validation set" (introduced in 213-221) are used with the opposite meaning to their typical use in the deep learning literature. Typically, the validation set refers to a separate split that is used to monitor convergence / avoid overfitting during training, and the test set refers to an external set that is used to evaluate the performance of trained models. The study uses these terms in an opposite manner, which becomes apparent from line 624: "best performing model ... judged by the loss of the test set.". Please exchange this terminology, it is confusing to a machine learning domain expert. Furthermore, the performance on the test set (should be called validation set) is typically not reported in graphs, as this data was used for model selection, and thus does not provide an unbiased estimate of model performance. I would remove the respective curves from Figures 3 and 4.

      Response: We are thankful for the reviewer's comments on this matter. Indeed, we were using an opposite terminology compared to what is commonly used within the field. We have adjusted the Results, Discussion and Methods sections as well as the figures accordingly. Further, we added a corresponding disclaimer for the code base in the GitHub repository. However, we prefer to not remove the respective curves from the figures. We think that this information is crucial to interpret the variability in accuracy between organoids from the same experiments and organoids acquired from a different, independent experiment. The results suggest that the accuracy for organoids within the same experiments is still higher, indicating to users the potential accuracy drop resulting from independent experiments. As we think that this is crucial information for the interpretability of our results, we would like to still include it side-by-side with the test data in the figures.
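      The distinction discussed above can be illustrated with a minimal sketch. It assumes that a checkpoint is chosen by its validation loss and that the test score is only read for the chosen checkpoint; the records, field names and numbers below are hypothetical and are not taken from the manuscript.

```python
# Hypothetical per-checkpoint records; all values are made up for illustration.
checkpoints = [
    {"epoch": 1, "val_loss": 0.62, "test_f1": 0.70},
    {"epoch": 2, "val_loss": 0.48, "test_f1": 0.78},
    {"epoch": 3, "val_loss": 0.55, "test_f1": 0.81},
]


def select_checkpoint(records):
    """Model selection: pick the checkpoint with the lowest validation loss."""
    return min(records, key=lambda r: r["val_loss"])


best = select_checkpoint(checkpoints)

# The test score is read once, for the already selected checkpoint.
# Picking epoch 3 because of its higher test F1 would leak test data
# into model selection and bias the reported estimate.
reported_test_f1 = best["test_f1"]
```

      In this toy example the selected checkpoint is not the one with the highest test score, which is exactly why a score computed on the split used for selection is an optimistic estimate.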

      The experimental set-up for the human expert baseline is quite different to the evaluation of the machine learning models. The former is based on the annotation of 4,000 images by seven experts, the latter on cross-validation experiments on a larger dataset. First of all, the details on the human expert labeling procedure are very sparse; I could only find a very short description in the paragraph 136-144, but did not find any further details in the methods section. Please add a methods section paragraph that explains in more detail how the images were chosen, how they were assigned to annotators, and if there was any redundancy in annotation, and if yes how this was resolved / evaluated. Second, the fact that the set-up for human experts and ML models is quite different means that these values are not quite comparable in a statistical sense. Ideally, human estimators would follow the same set-up as in ML (as in, evaluate the same test sets). However, this would likely be prohibitive in the required effort, so I think it's enough to state this fact clearly, for example by adding a comment on this to the captions of Figure 3 and 4.

      Response: We thank the reviewer for this constructive suggestion. We agree that the curves for human evaluations in the original draft were calculated differently compared to the curves for the classification algorithms, mostly stemming from feasibility of data set annotation at the time. In order to still address this suggestion, we went on to repeat and substantially expand the number of images annotated and thus revised the full human expert annotation. Each one of 6 human experts was asked to predict/interpret 6 images of each organoid within the full dataset. In order to select the images, we divided the time course (0-72h) into 6 evenly spaced intervals of 12 hours. For each interval, one image per organoid and human expert was randomly selected and assigned. This resulted in a total of 31,626 classified images (up from 4000 in the original version of the manuscript), from which the assigned images were overlapping between experts for each source interval but not for the individual images. We then changed the calculation of the curves to be the same as for the classification analysis: F1 data were calculated for each experiment over 6 timeframes and all experts, and plotted within the respective figure. We have amended the Methods section accordingly and replaced the respective curves within Figures 3 and 4 and Supplementary Figures S1, S8 and S19.
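      The sampling scheme described in this response (six evenly spaced 12-hour intervals over 0-72 h, one randomly selected image per interval, F1 as the score) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 30-minute imaging cadence, the function names and the handling of the interval boundaries are all hypothetical.

```python
import random


def interval_index(hours, start=0.0, end=72.0, n_intervals=6):
    """Map an acquisition time in hours to one of n evenly spaced intervals."""
    width = (end - start) / n_intervals
    idx = int((hours - start) // width)
    # Clamp so that t == end falls into the last interval.
    return min(max(idx, 0), n_intervals - 1)


def sample_one_per_interval(timepoints, rng, n_intervals=6):
    """Randomly pick one timepoint from each interval that contains images."""
    bins = {}
    for t in timepoints:
        bins.setdefault(interval_index(t, n_intervals=n_intervals), []).append(t)
    return {i: rng.choice(ts) for i, ts in sorted(bins.items())}


def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


rng = random.Random(0)
# Hypothetical imaging cadence: one frame every 30 minutes from 0 to 72 h.
timepoints = [h * 0.5 for h in range(0, 145)]
picked = sample_one_per_interval(timepoints, rng)
```

      Calling `sample_one_per_interval` once per organoid and expert yields six images per organoid and expert, and `f1_score` can then be evaluated per interval on the pooled expert answers.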

      It is unclear to me where the theoretical time window for the Latent Determination Horizon in Figure 5 (also mentioned in line 350) comes from? Please explain this in more detail and provide a citation for it.

      Response: We thank the reviewer for this important point. The Latent Determination Horizon (LDH) is a conceptual framework we introduced in this study to describe the theoretical period during which the eventual presence of a tissue outcome of interest (TOI) is being determined but not yet detectable. It is derived from two main observations in our dataset: (i) the inherent intra- and inter-experimental heterogeneity of organoid outcomes despite standardized protocols, and (ii) the progressive increase in predictive performance of our deep learning models over time, which suggests that informative morphological features only emerge gradually. We have now clarified this rationale in the manuscript (Discussion section) further and explicitly stated that the LDH is a concept we introduce here, rather than a previously described or cited term.

      The time window is defined by the TOI visibility, which is defined empirically as indicated by the results of our human expert panel (compare also Supplementary Figure S1).

      The interpretability analysis (Figure 4, 634-639) based on relevance backpropagation was performed based on DenseNet121 only. Why did you choose this model and not the ResNet / MobileNet? I think it is quite crucial to see if there are any differences between these models, as this would show how much weight can be put on the evidence from this analysis, and I would suggest adding an additional experiment and supplementary figure on this.

      Response: We thank the reviewer for this important comment regarding the interpretability analysis and the choice of model. In the original submission, we restricted the attribution analyses shown in original Figure 4C to DenseNet121, which served as our main reference model throughout the study. This choice was made primarily for clarity and to avoid redundancy in the main figures, as all three convolutional neural network (CNN) architectures (DenseNet121, ResNet50, MobileNetV3_Large) achieved comparable classification performance on our tasks.

      In response to the reviewer’s concern, we have now extended the interpretability analyses to include all three CNN architectures and a total of eight attribution methods (new Supplementary Note 1). Specifically, we generated saliency maps for DenseNet121, ResNet50, and MobileNetV3_Large across multiple time points and evaluated them using a systematic set of metrics: pairwise method agreement within each model (new Supplementary Figure S29), cross-model consistency per method (new Supplementary Figure S34), entropy and diffusion of saliencies over time (new Supplementary Figure S35), regional voting overlap across methods (new Supplementary Figure S36), and spatial drift of saliency centers of mass (new Supplementary Figure S37).

      These pooled analyses consistently showed that attribution methods differ markedly in the regions they prioritize, but that their relative behaviors were mostly stable across the three CNN architectures. For example, Grad-CAM and Guided Grad-CAM exhibited strong internal agreement and progressively focused relevance into smaller regions, while gradient-based methods such as DeepLiftSHAP and Integrated Gradients maintained broader and more diffuse relevance patterns but were the most consistent across models. Perturbation-based methods like Feature Ablation and Kernel SHAP often showed decreasing entropy and higher spatial drift, again similarly across architectures.

      To further address the reviewer’s point, we visualized the organoid depicted in original Figure 4C across all three CNNs and all eight attribution methods (new Supplementary Figures S30-S33). These comparisons confirm and extend analysis of the qualitative patterns described in original Figure 4C and show that they are not specific to DenseNet121, but are representative of the general behavior across architectures.

      In sum, we observed notable differences in how relevance was assigned and how consistently these assignments aligned. Highlighted organoid patterns were not consistent enough across attribution methods for us to be comfortable basing unequivocal biological interpretation on them. Nevertheless, we believe that the analyses in response to the reviewer’s suggestions (new Supplementary Note 1 and new Supplementary Figures S29-S37) add valuable context to what can be expected from machine learning models in an organoid research setting.

      As we did not base further unequivocal biological claims on the relevance backpropagation, we decided to move the analyses to the Supporting Information and now show a new model predicting organoid morphology by morphometrics clustering at the final imaging timepoint in new Figure 4C in line with suggestions by Reviewer #3.
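      Two of the saliency metrics named in this response, entropy of the saliency map and spatial drift of its center of mass, can be sketched in a few lines. The exact definitions used in the supplementary analyses are not given here, so the following is an assumption: the map is normalized to a probability distribution, entropy is Shannon entropy in bits, and drift is the Euclidean distance between the centers of mass of two maps. All function names are hypothetical.

```python
import math


def normalize(saliency):
    """Turn a 2D map of non-negative relevance scores into a probability distribution."""
    total = sum(sum(row) for row in saliency)
    if total <= 0:
        raise ValueError("saliency map must have positive total mass")
    return [[v / total for v in row] for row in saliency]


def entropy(saliency):
    """Shannon entropy in bits; higher values mean more diffuse relevance."""
    p = normalize(saliency)
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


def center_of_mass(saliency):
    """Relevance-weighted mean (row, column) position."""
    p = normalize(saliency)
    r = sum(i * v for i, row in enumerate(p) for v in row)
    c = sum(j * v for row in p for j, v in enumerate(row))
    return r, c


def drift(map_a, map_b):
    """Euclidean distance between the centers of mass of two maps."""
    (r1, c1), (r2, c2) = center_of_mass(map_a), center_of_mass(map_b)
    return math.hypot(r2 - r1, c2 - c1)


uniform = [[1.0, 1.0], [1.0, 1.0]]  # relevance spread over all pixels
focused = [[0.0, 0.0], [0.0, 4.0]]  # relevance concentrated in one pixel
```

      On these toy maps the uniform case has the maximal entropy for four pixels and the focused case has zero entropy, which mirrors the "diffuse" versus "focused" behavior contrasted in the response.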

      The code referenced in the code availability statement is not yet present. Please make it available and ensure a good documentation for reproducibility. Similarly, it is unclear to me what is meant by "The data that supports the findings will be made available on HeiDoc". Does this only refer to the intermediate results used for statistical analysis? I would also recommend to make the image data of this study available. This could for example be done through a dedicated data deposition service such as BioImageArchive or BioStudies, or with less effort via zenodo. This would ensure both reproducibility as well as potential re-use of the data. I think the latter point is quite interesting in this context; as the authors state themselves it is unclear if prediction of the TOIs isn't even possible at an earlier point that could be achieved through model advances, which could be studied by making this data available.

      Response: We thank the reviewer for this comment. We have now made the repository and raw data public on the suggested platform (Zenodo) and apologize for this oversight. The links are contained within the GitHub repository, which is referenced in the manuscript under “Data availability”.

      Minor comments:

      Line 315: Please add a citation for relevance backpropagation here.

      Response: We have included citations for all relevance backpropagation methods used in the paper.

      Line 591: There seems to be a typo: "[...] classification of binary classification [...]"

      Response: Corrected as suggested.

      Line 608: "[...] where the images of individual organoids served as groups [...]" It is unclear to me what this means.

      Response: We wanted to express that organoid images belonging to one organoid were assigned in full to a training/validation set. We have now stated this more clearly in the Methods section.
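
      To make the grouping concrete, the following Python sketch shows one way to assign all images of an organoid to the same set. It is not the code used in the study: the function name `split_by_organoid`, the validation fraction, and the toy identifiers are assumptions for illustration, and scikit-learn's `GroupKFold` or `GroupShuffleSplit` offer the same behavior.

```python
import numpy as np

def split_by_organoid(organoid_ids, val_fraction=0.2, seed=0):
    """Return boolean (train, val) masks so that no organoid spans both sets."""
    organoid_ids = np.asarray(organoid_ids)
    unique_ids = np.unique(organoid_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(unique_ids)  # randomize which organoids go to validation
    n_val = max(1, int(round(val_fraction * len(unique_ids))))
    val_ids = unique_ids[:n_val]
    val_mask = np.isin(organoid_ids, val_ids)
    return ~val_mask, val_mask

# Hypothetical example: 5 organoids with 4 time-lapse images each
ids = np.repeat(np.arange(5), 4)
train_mask, val_mask = split_by_organoid(ids, val_fraction=0.2, seed=0)
```

      Splitting on organoid identifiers rather than on individual images prevents near-identical frames of one organoid from appearing on both sides of the split.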

      Reviewer #1 (Significance (Required)):

      General assessment: This study demonstrates that (retinal) organoid development can be predicted from early timepoints with deep learning, where these cannot be discerned by human experts or simpler machine learning models. This fact is very interesting in itself due to its implication for organoid development, and could provide a valuable tool for molecular analysis of different organoid populations, as outlined by the authors. The contribution could be strengthened by providing a more thorough investigation of what features in the image are predictive at early timepoints, using a more sophisticated approach than relevance backprop, e.g. Discover (https://www.nature.com/articles/s41467-024-51136-9). This could provide further biological insight into the underlying developmental processes and enhance the understanding of retinal organoid development.

      Response: We thank the reviewer for this assessment and suggestion. We agree that identifying image features predictive at early timepoints would add important biological context. We therefore attempted to apply Discover to our dataset. However, we were unable to get the system to run successfully. After considerable effort, we concluded that this approach could not be integrated into our current analysis. Instead, we report our substantially expanded results obtained with relevance backpropagation, which provided the most interpretable and reproducible insights for our study as described above (New Supplementary Note 1, new Supplementary Figures S29-S37).

      Advance: similar studies that predict developmental outcome based on image data, for example cell proliferation or developmental outcome, exist. However, to the best of my knowledge, this study is the first to apply such a methodology to organoids and convincingly shows its efficacy and argues for its potential practical benefits. It thus constitutes a solid technical advance that could be especially impactful if it could be translated to other organoid systems in the future.

      Response: We thank the reviewer for this positive assessment of our work and for highlighting its novelty and potential impact. We are encouraged that the reviewer recognizes the value of applying predictive modeling to organoids and the opportunities this creates for translation to other organoid systems.

      Audience: This research is of interest to a technical audience. It will be of immediate interest to researchers working on retinal organoids, who could adapt and use the proposed system to support experiments by better distinguishing organoids during development. To enable this application, code and data availability should be ensured (see above comments on reproducibility). It is also of interest to researchers in other organoid systems, who may be able to adapt the methodology to different developmental outcome predictions. Finally, it may also be of interest to image analysis / deep learning researchers as a dataset to improve architectures for predictive time series modeling.

      My research background: I am an expert in computer vision and deep learning for biomedical imaging, especially in microscopy. I have some experience developing image analysis for (cancer) organoids. I don't have any experience on the wet lab side of this work.

      Response: We thank the reviewer for this encouraging feedback and for recognizing the broad relevance of our work across retinal organoid research, other organoid systems, and the image analysis community. We are pleased that the potential utility of our dataset and methodology is appreciated by experts in computer vision and biomedical imaging. We have now made the repository and raw data public and apologize for this oversight. The links are provided in the manuscript under “Data availability”.

      Constantin Pape


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Afting et al. present a computational pipeline for analyzing timelapse brightfield images of retinal organoids derived from Medaka fish. Their pipeline processes images along two paths: 1) morphometrics (based on computer vision features from skimage) and 2) deep learning. They discovered, through extensive manual annotation of ground truth, that their deep learning method could predict retinal pigmented epithelium and lens tissue emergence in time points earlier than either morphometrics or expert predictions. Our review is formatted based on the review commons recommendation.

      Response: We thank the reviewer for the detailed and constructive feedback, which has greatly improved the clarity and rigor of our manuscript. In response, we have corrected a potential data leakage issue, re-ran the affected analyses, and confirmed that results remain unchanged. We clarified the use of data augmentation in CNN training, tempered some claims throughout the text, and provided stronger justification for our discretization approach together with new supplementary analyses (New Supplementary Figures S26, S27). We substantially expanded our interpretability analyses across three CNN architectures and eight attribution methods, quantified their consistency and differences (new Supplementary Figures S29, S34-S37, new Supplementary Note 1), and added comprehensive visualizations (New S30-S33). We also addressed technical artifact controls, provided downsampling analyses to support our statement on sample size sufficiency (new Supplementary Figure S28), and included negative-control baselines with shuffled labels in Figures 3 and 4. Furthermore, we improved the clarity of terminology, figures, and methodological descriptions, and we have now made both code and raw data publicly available with documentation. Together, we believe these changes further strengthen the robustness, reproducibility, and interpretability of our study while carefully qualifying the claims.

      Major comments:

      Are the key conclusions convincing?

      Yes, the key conclusion that deep learning outperforms morphometric approaches is convincing. However, several methodological details require clarification. For instance, were the data splitting procedures conducted in the same manner for both approaches? Additionally, the authors note in the methods: "The validation data were scaled to the same range as the training data using the fitted scalers obtained from the training data." This represents a classic case of data leakage, which could artificially inflate performance metrics in traditional machine learning models. It is unclear whether the deep learning model was subject to the same issue. Furthermore, the convolutional neural network was trained with random augmentations, effectively increasing the diversity of the training data. Would the performance advantage still hold if the sample size had not been artificially expanded through augmentation?

      Response: We thank the reviewer for raising these important methodological points. As Reviewer #1 correctly noted, our use of the terms validation and test may have contributed to confusion. To clarify: in the original analysis the scalers were fitted on the training and validation data and then applied to the test data. This indeed constitutes a form of data leakage. We have corrected the respective code, re-ran all analyses that were potentially affected, and did not observe any meaningful change in the reported results. The Methods section has been amended to clarify this important detail.
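
      The leakage-free ordering, in which scaling statistics are estimated on the training portion only and then reused unchanged on held-out data, can be sketched as follows. This is a minimal illustration on synthetic data; the helper names `fit_scaler` and `apply_scaler` are hypothetical and do not correspond to the study's code (scikit-learn's `StandardScaler` with `fit` on the training set and `transform` on the held-out set is the usual equivalent).

```python
import numpy as np

def fit_scaler(X_train):
    """Estimate per-feature mean and standard deviation on training data only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return mean, std

def apply_scaler(X, mean, std):
    """Scale any data set with statistics that were fitted elsewhere."""
    return (X - mean) / std

# Synthetic stand-in for a morphometrics feature table
rng = np.random.default_rng(0)
X_train = rng.normal(5.0, 2.0, size=(200, 3))
X_test = rng.normal(5.0, 2.0, size=(50, 3))

mean, std = fit_scaler(X_train)            # fitted without seeing X_test
X_train_scaled = apply_scaler(X_train, mean, std)
X_test_scaled = apply_scaler(X_test, mean, std)
```

      The held-out data are therefore not exactly zero-centered after scaling, which is expected: their statistics never entered the fit.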

      For the neural networks, each image was normalized independently (per image), without using dataset-level statistics, thereby avoiding any risk of data leakage.

      Regarding data augmentation, the convolutional neural network was indeed trained with augmentations. Early experiments without augmentation led to severe overfitting, confirming that the performance advantage would not hold without artificially increasing the effective sample size. We have added a clarifying statement in the Methods section to make this explicit.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Their claims are currently preliminary, pending increased clarity and additional computational experiments described below.

      Response: We believe our additionally performed computational experiments qualify all the claims we make in the revised version of the manuscript.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • The authors discretize continuous variables into four bins for classification. However, a regression framework may be more appropriate for preserving the full resolution of the data. At a minimum, the authors should provide a stronger justification for this binning strategy and include an analysis of bin performance. For example, do samples near bin boundaries perform comparably to those near the bin centers? This would help determine whether the discretization introduces artifacts or obscures signals.

      Response: We thank the reviewer for this thoughtful suggestion. We agree that regression frameworks can, in principle, preserve the full resolution of continuous outcome variables. However, in our setting we deliberately chose a discretization approach. First, the discretized outcome categories correspond to ranges of tissue sizes that are biologically meaningful and allow direct comparison to expert annotations. In practice, human experts also tend to judge tissue presence and size in categorical rather than strictly continuous terms, which was mirrored by our human expert annotation strategy. As we aimed to compare deep learning with classical machine learning models and with expert annotations across the same prediction tasks, a categorical outcome formulation provided the most consistent and fair framework. Secondly, the underlying outcome variables did not follow a normal distribution, but instead exhibited a skewed and heterogeneous spread. Regression models trained on such distributions often show biases toward the most frequent value ranges, which may obscure less common but biologically important outcomes. Discretization mitigated this issue by balancing the prediction task across defined size categories.

      In line with the reviewer’s request, we have now analyzed the performance in relation to the distance of each sample from the bin center. These results are provided as new Supplementary Figures S26 and S27. Interestingly, for the classical machine learning classifiers, F1 scores tended to be somewhat higher for samples close to bin edges. For the convolutional neural networks, however, F1 scores were more evenly distributed across distances from bin centers. While the reason for this difference remains unclear, the analysis demonstrates that the discretization did not obscure predictive signals in either framework. We have amended the results section accordingly.
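
      For illustration, the distance of a sample from its bin center can be expressed relative to the half-width of the bin, so that 0 denotes the center and 1 an edge. The Python sketch below shows this normalization; the bin edges and values are hypothetical, and the function name `relative_distance_to_bin_center` is an assumption rather than the implementation behind Supplementary Figures S26 and S27.

```python
import numpy as np

def relative_distance_to_bin_center(values, bin_edges):
    """Return (bin index, distance to bin center in units of half bin width)."""
    values = np.asarray(values, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    # np.digitize returns 1-based bin numbers; clip so that a value equal to
    # the right-most edge is assigned to the last bin
    idx = np.clip(np.digitize(values, edges) - 1, 0, len(edges) - 2)
    lower, upper = edges[idx], edges[idx + 1]
    center = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return idx, np.abs(values - center) / half_width

# Hypothetical tissue sizes discretized into four bins
edges = [0, 10, 20, 30, 40]
sizes = [5, 10, 19, 40]
bin_idx, rel_dist = relative_distance_to_bin_center(sizes, edges)
```

      Per-sample scores can then be grouped by `rel_dist` to compare samples near bin edges with samples near bin centers.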

      • The relevance backpropagation interpretation analysis is not convincing. The authors argue that the model's use of pixels across the entire image (rather than just the RPE region) indicates that the deep learning approach captures holistic information. However, only three example images are shown out of hundreds, with no explanation for their selection, limiting the generalizability of the interpretation. Additionally, it is unclear how this interpretability approach would work at all in earlier time points, particularly before the model begins making confident predictions around the 8-hour mark. It is also not specified whether the input used for GradSHAP matches the input used during CNN training. The authors should consider expanding this analysis by quantifying pixel importance inside versus outside annotated regions over time. Lastly, Figure 4C is missing a scale bar, which would aid in interpretability.

      Response: We thank the reviewer for raising these important concerns. In the initial version we showed examples of relevance backpropagation that suggested CNNs rely on visible RPE or lens tissue for their predictions (original Figure 4C). Following the reviewer’s comment, we expanded the analysis extensively across all models and attribution methods (compare new Supplementary Note 1), and quantified agreement, consistency, entropy, regional overlap, and drift (new Supplementary Figures S29 and S34-S37), as well as providing comprehensive visualizations across models and methods (new Supplementary Figures S30-S33).

      This extended analysis showed that attribution methods behave very differently from each other, but consistently so across the three CNN architectures. Each method displayed characteristic patterns, for example in entropy or center-of-mass drift, but the overlap between methods was generally low. While Integrated Gradients and DeepLiftSHAP tended to concentrate on tissue regions, other methods produced broader or shifting relevance patterns, and overall we could not establish robust or interpretable signals from a biological point of view that would support stronger conclusions.

      We have therefore revised the text to focus on descriptive results only, without making claims about early structural information or tissue-specific cues being used by the networks. We also added missing scale bars and clarified methodological details. Together, the revised section now reflects the extensive work performed while remaining cautious about what can and cannot be inferred from saliency methods in this setting.

      • The authors claim that they removed technical artifacts to the best of their ability, but it is unclear if the authors performed any adjustment beyond manual quality checks for contamination. Did the authors observe any illumination artifacts (either within a single image or over time)? Any other artifacts or procedures to adjust?

      Response: We thank the reviewer for this comment. We have not performed any adjustment beyond manual quality control post organoid seeding. The aforementioned removal of technical artifacts included, among others, seeding at the same time of day, seeding and cell processing by the same investigator according to a standardized protocol, usage of reproducible chemicals (same LOT, frozen only once, etc.) and temperature control during image acquisition. We adhered strictly to internal, previously published workflows that were aimed to reduce any variability due to technical variations during cell harvesting, organoid preparation and imaging. We have clarified this important point in the Methods section.

      • In line 434-436 the authors state "In this work, we used 1,000 organoids in total, to achieve the reported prediction accuracies. Yet, we suspect that as little as ~500 organoids are sufficient to reliably recapitulate our findings." It is unclear what evidence the authors use to support this claim? The authors could perform a downsampling analysis to determine tradeoff between performance and sample size.

      Response: We thank the reviewer for this important comment. To clarify, our statement regarding the sufficiency of ~500 organoids was based on a downsampling-style analysis we had already performed. In this analysis, we systematically reduced the number of experiments used for training and assessed predictive performance for both CNN- and classifier-based approaches (former Supplementary Figure S11, new Supplementary Figure S28). For CNNs, performance curves plateaued at approximately six experiments (corresponding to ~500 organoids), suggesting that increasing the sample size further only marginally improved prediction accuracy. In contrast, we did not observe a clear plateau for the machine learning classifiers, indicating that these models can achieve comparable performance with fewer training experiments. We have revised the manuscript text to clarify that this conclusion is derived from these analyses, and continue to include Supplementary Figure S11 as new Supplementary Figure S28 for transparency (compare Supplementary Note 1).

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Yes, we believe all experiments are realistic in terms of time and resources. We estimate all experiments could be completed in 3-6 months.

      Response: We confirm that the suggested experiments are realistic in terms of time and resources and have been able to complete them within 6 months.

      Are the data and the methods presented in such a way that they can be reproduced? No, the code is not currently available. We were not able to review the source code.

      Response: We have now made the repository public. We apologize for this initial oversight. The links are provided in the revised version of the manuscript under “Data availability”.

      Are the experiments adequately replicated and statistical analysis adequate?

      • The experiments are adequately replicated.

      • The statistical analysis (deep learning) is lacking a negative control baseline, which would be helpful to observe if performance is inflated.

      Response: We thank the reviewer for this comment. We have calculated the respective curves with neural networks and machine learning classifiers that were trained on data with shuffled labels and have included these results as a separate curve in the respective Figures 3 and 4. We have also amended the Methods section accordingly.

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      Yes.

      Are the text and figures clear and accurate?

      The authors must improve clarity on terminology. For example, they should define a comprehensive dataset, significant, and provide clarity on their morphometrics feature space. They should elaborate on what they mean by "confounding factor of heterogeneity".

      Response: We thank the reviewer for highlighting the need to clarify terminology. We have revised the manuscript accordingly. Specifically, we now explicitly define comprehensive dataset as longitudinal brightfield imaging of ~1,000 organoids from 11 independent experiments, imaged every 30 minutes over several days, covering a wide range of developmental outcomes at high temporal resolution. Furthermore, we replaced the term significantly with wording that avoids implying statistical significance, where appropriate. We have clarified the morphometrics feature space in the Methods section in a more detailed fashion, describing the custom parameters that we used to enhance the regionprops_table function of skimage.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • Figure 2C describes a distance between what? The y axis is likely too simple. Same confusion over Figure 2D. Was distance computed based on tsne coordinates?

      Response: We thank the reviewer for pointing out this potential source of confusion. The distances shown in original Figures 2C and 2D were not calculated in tSNE space. Instead, morphometrics features were first Z-scaled, and then dimensionality reduction by PCA was applied, with the first 20 principal components retaining ~93% of the variance. Euclidean distances were subsequently computed in this 20-dimensional PC space. For inter-organoid distances (Figure 2C), we calculated mean pairwise Euclidean distances between all organoids at each imaging time point, capturing the global divergence of organoid morphologies over time in an experiment-specific manner. For intra-organoid distances (Figure 2D), we calculated Euclidean distances between consecutive time points (n vs. n+1) for each individual organoid, thereby quantifying the extent of morphological change within organoids over time. We have revised the Figure legend and Methods section to make these definitions clearer.
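
      The computation described above (Z-scaling, PCA, then Euclidean distances in PC space) can be sketched in Python as follows. This is an illustrative reconstruction on random data, not the study's code: the function names, the small array sizes, and the use of 3 instead of 20 components are assumptions.

```python
import numpy as np

def pca_project(features, n_components):
    """Z-scale features, then project onto the leading principal components."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # SVD of the centered, scaled matrix; rows of vt are the principal axes
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return z @ vt[:n_components].T, explained[:n_components].sum()

def mean_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs of rows."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)
    return dist[upper].mean()

def consecutive_distances(trajectory):
    """Euclidean distance between time point n and n+1 for one organoid."""
    return np.linalg.norm(np.diff(trajectory, axis=0), axis=1)

# Hypothetical data: 6 organoids, 4 time points, 10 morphometric features
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4, 10))
pcs, explained = pca_project(features.reshape(-1, 10), n_components=3)
pcs = pcs.reshape(6, 4, 3)

inter = [mean_pairwise_distance(pcs[:, t]) for t in range(4)]  # per time point
intra = consecutive_distances(pcs[0])                          # one organoid
```

      Here `inter` corresponds to the inter-organoid distance per time point and `intra` to the step-to-step change of a single organoid.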

      • The authors perform a Herculean analysis comparing dozens of different machine learning classifiers. They select two, but they should provide justification for this decision.

      Response: We thank the reviewer for this comment. In our initial machine learning analyses, we systematically benchmarked a broad set of classifiers on the morphometrics feature space, using cross-validation and hyperparameter tuning where appropriate. The classifiers that we ultimately focused on were those that consistently achieved the best performance in these comparisons. This process is described in the Methods and summarized in the Supplementary Figures S4 and S15 (for sum- and maximum-intensity z-projections new Supplementary Figures S5/6 and S16/17), which show the results of the benchmarking. We have clarified the text to state that the selected classifiers were chosen on the basis of their superior performance in these evaluations.

      • It would be good to get a sense for how these retinal organoids grow - are they moving all over the place? They are in Matrigel so maybe not, but are they rotating?

      Can the author's approach predict an entire non-emergence experiment? The authors tried to standardize protocol, but ultimately if it's deriving this much heterogeneity, then how well it will actually generalize to a different lab is a limitation.

      Response: We thank the reviewer for these thoughtful questions. The retinal organoids in our study were embedded in low concentrations of Matrigel and remained relatively stable in position throughout imaging. We did not observe substantial displacement or lateral movement of organoids, and no systematic rotation could be detected in our dataset. Small morphological rearrangements within organoids were observed, but the gross positioning of organoids within the wells remained consistent across time-lapse recordings.

      Regarding generalization across laboratories, we agree with the reviewer that this is an important limitation. While we minimized technical variability by adhering to a highly standardized, published protocol (see Methods), considerable heterogeneity remained at both intra- and inter-experimental levels. This variability likely reflects inherent properties of the system, similar to reports in the literature across organoid systems, rather than technical artifacts, and poses a potential challenge for applying our models to independently generated datasets. We therefore highlight the need for future work to test the robustness of our models across laboratories, which will be essential to determine the true generalizability of our approach. We have amended the Discussion accordingly.

      • The authors should dampen claims throughout. For example, in the abstract they state, "by combining expert annotations with advanced image analysis". The image analysis pipelines use common approaches.

      Response: We thank the reviewer for this comment. We agree that the individual image analysis steps we used, such as morphometric feature extraction, are based on well-established algorithms. By referring to “advanced image analysis,” we intended to highlight not the novelty of each single algorithm, but rather the way in which we systematically combined a large number of quantitative parameters and leveraged them through machine learning models to generate predictive insights into organoid development.

      • The authors state: "the presence of RPE and lenses were disagreed upon by the two independently annotating experts in a considerable fraction of organoids (3.9 % for RPE, 2.9% for lenses).", but it is unclear why there were two independently annotating experts. The supplements say images were split between nine experts for annotation.

      Response: We thank the reviewer for pointing out this ambiguity. To clarify, the ground truth definition at the final time point was established by two experts who annotated all organoids. These two annotators were part of the larger group of six experts who contributed to the earlier human expert annotation tasks. Thus, while six experts provided annotations for subsets of images during the expert prediction experiments, the final annotation for every single organoid at its last time frame was consistently performed by the same two experts to ensure a uniform ground truth. We have amended this in the revised manuscript to make this distinction clear.

      • Details on the image analysis pipeline would be helpful to clarify. For example, why did they choose to measure these 165 morphology features? Which descriptors were used to quantify blur? Did the authors apply blur metrics per FOV or per segmented organoid?

      Response: We thank the reviewer for this comment. To clarify, we extracted 165 morphometric features per segmented organoid, combining standard scikit-image region properties with custom implementations (e.g., blur quantified as the variance of the Laplace filter response within the organoid mask). All metrics, including blur, were calculated per segmented organoid rather than per full field of view. This broad feature space was deliberately chosen to capture size, shape, and intensity distributions in a comprehensive and unbiased manner. A more detailed description of the preprocessing steps, the full feature list, and the exact code implementations are now provided in the Methods section (“Large-scale time-lapse Image analysis”) of the revised manuscript as well as in the source code GitHub repository.
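
      As an illustration of the blur metric mentioned above, the following Python sketch computes the variance of a Laplace filter response restricted to a mask. A discrete 4-neighbour Laplacian is assumed here; the kernel and border handling in the actual implementation may differ, and the function name `laplacian_variance` and the synthetic images are hypothetical.

```python
import numpy as np

def laplacian_variance(image, mask):
    """Variance of the 4-neighbour Laplacian response inside a boolean mask."""
    img = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    lap = np.zeros_like(img)
    # Sum of the four direct neighbours minus four times the center pixel
    lap[1:-1, 1:-1] = (
        img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
        - 4.0 * img[1:-1, 1:-1]
    )
    # Ignore the one-pixel border, where the Laplacian is not defined
    interior = np.zeros_like(mask)
    interior[1:-1, 1:-1] = True
    return lap[mask & interior].var()

# Synthetic example: a noisy "sharp" image and a smoothed copy of it
rng = np.random.default_rng(0)
sharp = rng.random((32, 32))
blurred = (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, -1, 0)
           + np.roll(sharp, 1, 1) + np.roll(sharp, -1, 1)) / 5.0

# Circular stand-in for a segmented organoid mask
yy, xx = np.ogrid[:32, :32]
organoid_mask = (yy - 16) ** 2 + (xx - 16) ** 2 <= 10 ** 2

sharp_score = laplacian_variance(sharp, organoid_mask)
blurred_score = laplacian_variance(blurred, organoid_mask)
```

      Lower values indicate fewer sharp intensity transitions inside the mask, which is why such a score can serve as a per-organoid focus indicator.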

      • The description of the number of images is confusing and distracts from the number of organoids. The number of organoids and number of timepoints used would provide a better description of the data with more value. For example, does this image count include all five z slices?

      Response: We thank the reviewer for this comment. The reported image count includes slice 3 only, which we based our models on. The five z-slices that we used to create the MAX- and SUM-intensity z-projections would increase this number 5-fold. While we agree that the number of organoids and time points are highly informative metrics and have provided these details in the manuscript, we also believe that reporting the image count is valuable, as it directly reflects the size of the dataset processed by our analysis pipelines. For this reason, we prefer to keep the current description.

      • The authors should consider applying a maximum projection across the five z slices (rather than the middle z) as this is a common procedure in image analysis. Why not analyze three-dimensional morphometrics or deep learning features? Might this improve performance further?

      Response: We thank the reviewer for this valuable suggestion. To address this point, we repeated all analyses using both sum- and maximum-intensity z-projections and have included the results as new Supplementary Figures S8-S10, S13/S14 for TOI emergence and new Supplementary Figures S19-S21, S24/S25 for TOI sizes (classifier benchmarking and hyperparameter tuning in new Supplementary Figures S5/S6 and S16/S17). These additional analyses did not reveal a noticeable improvement in performance, suggesting that projections incorporating all slices are not strictly necessary in our setting. An analysis that included all five z-slices separately for classification would indeed be of interest, but was not feasible within the scope of this study, as it would substantially increase the computational demands beyond the available resources and timeframe.
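
      For readers unfamiliar with the three image variants compared here, the following Python sketch shows how the middle slice, the maximum-intensity projection, and the sum-intensity projection are derived from a z-stack. The array layout (z, rows, columns) and the function name `z_projections` are assumptions for illustration.

```python
import numpy as np

def z_projections(stack):
    """Collapse a (z, rows, cols) stack into three 2D representations."""
    stack = np.asarray(stack)
    return {
        "middle": stack[stack.shape[0] // 2],  # central slice only
        "max": stack.max(axis=0),              # brightest value along z
        "sum": stack.sum(axis=0),              # summed intensity along z
    }

# Hypothetical stack of five z-slices, each 2 x 3 pixels
stack = np.arange(5 * 2 * 3).reshape(5, 2, 3)
projections = z_projections(stack)
```

      All three outputs have the same 2D shape, so the downstream models can be applied to any of them without architectural changes.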

      • There is a lot of manual annotation performed in this work, the authors could speculate how this could be streamlined for future studies. How does the approach presented enable streamlining?

      Response: We thank the reviewer for raising this important point. The current study relied on expert visual review, which is time-intensive, but our findings suggest several ways to streamline future work. For instance, model-assisted prelabeling could be used to automatically accept high-confidence cases while routing only uncertain cases to experts. Active sampling strategies, focusing expert review on boundary cases or rare classes, as well as programmatic checks from morphometrics (e.g., blur or contrast to flag low-quality frames), could further reduce effort. Consensus annotation could be reserved only for cases where the model and expert disagree or confidence is low. Finally, new experiments could be bootstrapped with a small seed set of annotated organoids for fine-tuning before switching to such a model-assisted workflow. These possibilities are enabled by our approach, where organoids are imaged individually, morphometrics provide automated quality indicators, and the CNN achieves reliable performance at early developmental stages, making model-in-the-loop annotation a feasible and efficient strategy for future studies. We have added a clarifying paragraph to the Discussion accordingly.
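
      The prelabeling idea outlined above can be sketched in Python as a simple confidence-based routing step. The threshold of 0.95, the function name `route_for_annotation`, and the example probabilities are hypothetical; a suitable threshold would need to be calibrated on annotated data.

```python
import numpy as np

def route_for_annotation(probabilities, accept_threshold=0.95):
    """Split model predictions into auto-accepted and expert-review cases."""
    probs = np.asarray(probabilities, dtype=float)
    predicted = probs.argmax(axis=1)   # most likely class per organoid
    confidence = probs.max(axis=1)     # probability of that class
    auto_accept = confidence >= accept_threshold
    return predicted, auto_accept

# Hypothetical class probabilities for three organoids (absent, present)
probs = [[0.98, 0.02], [0.60, 0.40], [0.03, 0.97]]
predicted, auto_accept = route_for_annotation(probs)
needs_expert = np.flatnonzero(~auto_accept)  # indices sent to human review
```

      In this toy example only the second organoid would be forwarded to an expert, while the other two labels would be accepted automatically.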

      Reviewer #2 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The paper's advance is technical (providing new methods for organoid quality control) and conceptual (providing proof of concept that earlier time points contain information to predict specific future outcomes in retinal organoids)

      Place the work in the context of the existing literature (provide references, where appropriate).

      • The authors do a good job of placing their work in context in the introduction.
      • The work presents a simple image analysis pipeline (using only the middle z slice) to process timelapse organoid images. So not a 4D pipeline (time and space), just 3D (time). It is likely that more and more of these approaches will be developed over time, and this article is one of the early attempts.

      • The work uses standard convolutional neural networks.

      Response: We thank the reviewer for this assessment. We agree that our work represents one of the early attempts in this direction, applying a straightforward pipeline with standard convolutional neural networks, and we appreciate the reviewer’s acknowledgment of how the study has been placed in context within the Introduction.

      State what audience might be interested in and influenced by the reported findings.

      • Data scientists performing image-based profiling for time lapse imaging of organoids.

      • Retinal organoid biologists

      • Other organoid biologists who may have long growth times with indeterminate outcomes.

      Response: We thank the reviewer for outlining the relevant audiences. We agree that the reported findings will be of interest to data scientists working on image-based profiling, retinal organoid biologists, and more broadly to organoid researchers facing long culture times with uncertain developmental outcomes.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      • Image-based profiling/morphometrics

      • Organoid image analysis

      • Computational biology

      • Cell biology

      • Data science/machine learning

      • Software

      This is a signed review:

      Gregory P. Way, PhD

      Erik Serrano

      Jenna Tomkinson

      Michael J. Lippincott

      Cameron Mattson

      Department of Biomedical Informatics, University of Colorado


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript by Afting et al. addresses the challenge of heterogeneity in retinal organoid development by using deep learning to predict eventual tissue outcomes from early-stage images. The central hypothesis is that deep learning can forecast which tissues an organoid will form (specifically retinal pigmented epithelium, RPE, and lens) well before those tissues become visibly apparent. To test this, the authors assembled a large-scale time-lapse imaging dataset of ~1,000 retinal organoids (~100,000 images) with expert annotations of tissue outcomes. They characterized the variability in organoid morphology and tissue formation over time, focusing on two tissues: RPE (which requires induction) and lens (which appears spontaneously). The core finding is that a deep learning model can accurately predict the emergence and size of RPE and lens in individual organoids at very early developmental stages. Notably, a convolutional neural network (CNN) ensemble achieved high predictive performance (F1-scores ~0.85-0.9) hours before the tissues were visible, significantly outperforming human experts and classical image-analysis-based classifiers. This approach effectively bypasses the issue of stochastic developmental heterogeneity and defines an early "determination window" for fate decisions. Overall, the study demonstrates a proof-of-concept that artificial intelligence can forecast organoid differentiation outcomes non-invasively, which could revolutionize how organoid experiments are analyzed and interpreted.

      Recommendation:

      While this manuscript addresses an important and timely scientific question using innovative deep learning methodologies, it cannot be recommended for acceptance in its present form. The authors must thoroughly address several critical limitations highlighted in this report. In particular, significant issues remain regarding the generalizability of the predictive models across different experimental conditions, the interpretability of deep learning predictions, and the use of Euclidean distance metrics in high-dimensional morphometric spaces, potentially leading to distorted interpretations of organoid heterogeneity. These revisions are essential for validating the general applicability of their approach and enhancing biological interpretability. After thoroughly addressing these concerns, the manuscript may become suitable for future consideration.

      Response: We thank the reviewer for the thoughtful and constructive comments. In response, we expanded our analyses in several key ways. We clarified limitations regarding external datasets. Interpretability analyses were greatly extended across three CNN architectures and eight attribution methods (new Supplementary Figures S29-S37, new Supplementary Note 1), showing consistent but method-specific behaviors; as no reproducible biologically interpretable signals emerged, we now present these results descriptively and clearly state their limitations. We further demonstrated the flexibility of our framework by predicting morphometric clusters in addition to tissue outcomes (new Figure 4C), confirmed robustness of the morphometrics space using PCA and nearest-neighbor analyses (new Supplementary Figure S3), and added statistical tests confirming CNNs significantly outperform classical classifiers (Supplementary File 1). Finally, we made all code and raw data publicly available, clarified species context, and added forward-looking discussion on adaptive interventions. We believe these revisions now further improve the rigor and clarity of our work.

      Major Issues (with Suggestions):

      1. Generalization to Other Batches or Protocols: The drop in performance on independent validation experiments suggests the model may partially overfit to specific experimental conditions. A major concern is how well this approach would work on organoids from a different batch or produced by a slightly different differentiation protocol. Suggestion: The authors should clarify the extent of variability between their "independent experiment" and training data (e.g., were these done months apart, with different cell lines or minor protocol tweaks?). To strengthen confidence in the model's robustness, I recommend testing the trained model on one or more truly external datasets, if available (for instance, organoids generated in a separate lab or under a modified protocol). Even a modest analysis showing the model can be adapted (via transfer learning or re-training) to another dataset would be valuable. If new data cannot be added, the authors should explicitly discuss this limitation and perhaps propose strategies (like domain adaptation techniques or more robust training with diverse conditions) to handle batch effects in future applications.

      Response: We thank the reviewer for this important comment. We fully agree that an external dataset would be a valuable addition to the manuscript. Unfortunately, we are not able to obtain such a dataset. Although retinal organoid systems exist and are widely used across different species, to the best of our knowledge our laboratory is currently the only one raising retinal organoids from primary embryonic pluripotent stem cells of Oryzias latipes, and there is only one known (and published) differentiation protocol that allows the successful generation of these organoids. We note that our datasets were collected over the course of nine months, which already introduces variability across time and thus partially addresses concerns regarding batch effects. As we did not have access to truly external datasets (e.g., from other laboratories), we have clarified this limitation in the revised manuscript, as suggested, and outlined strategies such as domain adaptation and training on more diverse conditions as promising future directions to improve robustness.

      Biological Interpretation of Early Predictive Features: The study currently concludes that the CNN picks up on complex, non-intuitive features that neither human experts nor conventional analysis could identify. However, from a biological perspective, it would be highly insightful to know what these features are (e.g., subtle texture, cell distribution patterns, etc.). Suggestion: I encourage the authors to delve deeper into interpretability. They might try complementary explainability techniques (for example, occlusion tests where parts of the image are masked to see if predictions change, or activation visualization to see what patterns neurons detect) beyond GradientSHAP. Additionally, analyzing false predictions might provide clues: if the model is confident but wrong for certain organoids, what visual traits did those have? If possible, correlating the model's prediction confidence with measured morphometrics or known markers (if any early marker data exist) could hint at what the network sees. Even if definitive features remain unidentified, providing the reader with any hypothesis (for instance, "the network may be sensing a subtle rim of pigmentation or differences in tissue opacity") would add value. This would connect the AI predictions back to biology more strongly.

      Response: We thank the reviewer for this thoughtful suggestion. We agree that linking CNN predictions to specific biological features would be highly valuable. In response, we expanded our interpretability analyses beyond GradientSHAP to a broad set of attribution methods and quantified their behavior across models and timepoints (new Supplementary Figures S29-S37, new Supplementary Note 1). While some methods (e.g., Integrated Gradients, DeepLiftSHAP) occasionally highlighted visible tissue regions, others produced diffuse or shifting relevance, and overall overlap was low. Therefore, our results did not yield reproducible, interpretable biological signals.

      Given these results, we have refrained from speculating about specific early image features and now present the interpretability analyses descriptively. We agree that future studies integrating imaging with molecular markers will be required to directly link early predictive cues to defined biological processes.

      Expansion to Other Outcomes or Multi-Outcome Prediction: The focus on RPE and lens is well-justified, but these are two outcomes within retinal organoids. A major question is whether the approach could be extended to predict other cell types or structures (e.g., presence of certain retinal neurons, or malformations) or even multiple outcomes at once. Suggestion: The authors should discuss the generality of their approach. Could the same pipeline be trained to predict, say, photoreceptor layer formation or other features if annotated? Are there limitations (like needing binary outcomes vs. multi-class)? Even if outside the scope of this study, a brief discussion would reassure readers that the method is not intrinsically limited to these two tissues. If data were available, it would be interesting to see a multi-label classification (predict both RPE and lens presence simultaneously) or an extension to other organoid systems in future. Including such commentary would highlight the broad applicability of this platform.

      Response: We thank the reviewer for this helpful and important suggestion. While our study focused on RPE and lens as the most readily accessible tissues of interest in retinal organoids, our new analyses demonstrate that the pipeline is not limited to these outcomes. In addition to tissue-specific predictions, we trained both a convolutional neural network (on image data) and a decision tree classifier (on morphometrics features) to predict more abstract morphological clusters defined at the final timepoint using the morphometrics features, showing that both approaches could successfully capture non-tissue features from early frames (new Figure 4C). This illustrates that the framework can be extended beyond binary tissue outcomes to multi-class problems, and predict relevant outcomes like the overall organoid morphology. Given appropriate annotations, the framework could in principle be trained to detect additional structures such as photoreceptor layers or malformations. Furthermore, the CNN architecture we employed and the morphometrics feature space are compatible with multi-label classification, meaning simultaneous prediction of several outcomes would also be feasible. We have clarified this point in the discussion to highlight the methodological flexibility and potential generality of our approach and are excited to share this very interesting, additional model with the readership.

      Curse of high dimensionality: Using Euclidean distance in a 165-dimensional morphometric space likely suffers from the curse of dimensionality, which diminishes the meaning of distances as dimensionality increases. In such high-dimensional settings, the range of pairwise distances tends to collapse, undermining the ability to discern meaningful intra- vs. inter-organoid differences. Suggestion: To address this, I would encourage the authors to apply principal component analysis (PCA) in place of (or prior to) tSNE. PCA would reduce the data to a few dominant axes of variation that capture most of the morphometric variance, directly revealing which features drive differences between organoids. These principal components are linear combinations of the original 165 parameters, so one can examine their loadings to identify which morphometric traits carry the most information, yielding interpretable axes of biological variation (e.g., organoid size, shape complexity, etc.). In addition, I would like to mention an important cautionary remark regarding tSNE embeddings. tSNE does not preserve the global geometry of the data. Distances and cluster separations in a tSNE map are therefore not faithful to the original high-dimensional distances and should be interpreted with caution. See Chari T, Pachter L (2023), The specious art of single-cell genomics, PLoS Comput Biol 19(8): e1011288, for an enlightening discussion in the context of single cell genomics. Chari and Pachter show that extreme dimensionality reduction to 2D can introduce significant distortions in the data's structure, meaning the apparent proximity or separation of points in a tSNE plot may be an artifact of the algorithm rather than a true reflection of morphometric similarity. Implementing PCA would mitigate high-dimensional distance issues by focusing on the most informative dimensions, while also providing clear, quantitative axes that summarize organoid heterogeneity. This change would strengthen the analysis by making the results more robust (avoiding distance artifacts) and biologically interpretable, as each principal component can be traced back to specific morphometric features of interest.
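
      To make the PCA suggestion concrete, the following is a minimal sketch of PCA on z-scored features with inspection of the loadings. It uses synthetic stand-in data with four made-up feature names rather than the 165 morphometric parameters, and the helper `pca_with_loadings` is a hypothetical name introduced here for illustration only.

```python
import numpy as np


def pca_with_loadings(features, n_components=2):
    """PCA via SVD on z-scored features.

    Returns the scores (samples x components), the loadings
    (components x features), and the explained variance ratio.
    """
    X = np.asarray(features, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    Z = (X - X.mean(axis=0)) / std
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    scores = U[:, :n_components] * S[:n_components]
    return scores, Vt[:n_components], explained[:n_components]


# Synthetic stand-in data: two strongly correlated "size" features
# and two independent features (all names are made up).
rng = np.random.default_rng(0)
size = rng.normal(size=200)
X = np.column_stack([
    size + 0.05 * rng.normal(size=200),  # "area"
    size + 0.05 * rng.normal(size=200),  # "perimeter"
    rng.normal(size=200),                # "eccentricity"
    rng.normal(size=200),                # "texture"
])
names = ["area", "perimeter", "eccentricity", "texture"]

scores, loadings, explained = pca_with_loadings(X, n_components=2)

# Features with the largest absolute loading on the first component.
top = sorted(zip(names, loadings[0]), key=lambda kv: abs(kv[1]), reverse=True)[:2]
```

      Here the first component is dominated by the two correlated size-like features, which is the kind of interpretable axis the loadings are meant to expose.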

      Response: We thank the reviewer for raising this point. Indeed, high dimensionality and dimensionality reduction can lead to false interpretations. We approached this issue as follows. First, we recalculated the TSNE projections and distances using the first 20 PCs and supplied these data as the new Figure 2 and new Supplementary Figure 2. While the scale of the data shifted slightly, there were no differences in the data distribution that would contradict our prior conclusions.

      To confirm these findings and further support the validity of our dimensionality reduction, we calculated the intersection of the 30 nearest neighbors in raw data space (or PCA space) and the 30 nearest neighbors in reduced space (TSNE or UMAP; we included UMAP to show that the result is not specific to TSNE projections and also holds for a dimensionality reduction that is better known for preserving global rather than local structure). As shown in the new Supplementary Figure S3 (A-D), the high Jaccard index confirmed that our projections accurately reflect the data’s structure obtained from raw distance measurements. Moreover, the Jaccard index generally increased over time, which is best explained by a stronger morphological similarity of organoids at timepoint 0 and reflected by the dense point cloud in the TSNE projections at that timepoint. The described effects were independent of the usage of data derived from 20 PCs versus data derived from all 165 dimensions.
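
      The neighborhood-overlap check described above can be sketched as follows. This is an illustrative, brute-force version with toy coordinates and hypothetical function names; it is not the code used for Supplementary Figure S3, and k is set to 2 only because the toy dataset is tiny.

```python
import math
from typing import List, Sequence, Set


def knn_indices(points: Sequence[Sequence[float]], k: int) -> List[Set[int]]:
    """Indices of the k nearest neighbors (Euclidean) of each point, excluding itself."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def mean_neighbor_jaccard(high_dim, low_dim, k: int = 30) -> float:
    """Mean Jaccard index between k-NN sets in the original and the reduced space."""
    nn_high = knn_indices(high_dim, k)
    nn_low = knn_indices(low_dim, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(nn_high, nn_low)]
    return sum(scores) / len(scores)


# Toy data: six points on a line in 3D.
xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
high = [(x, 2 * x, -x) for x in xs]

# An embedding that preserves the neighborhood structure ...
faithful = [(x,) for x in xs]
# ... and one that assigns the coordinates in reversed order.
distorted = [(x,) for x in reversed(xs)]

jaccard_faithful = mean_neighbor_jaccard(high, faithful, k=2)
jaccard_distorted = mean_neighbor_jaccard(high, distorted, k=2)
```

      A value close to 1 indicates that local neighborhoods survive the projection; the distorted embedding yields a lower value.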

      We next wanted to confirm the conclusion that data points obtained from the same organoid at later timepoints were more closely related to each other than data points from different organoids. We therefore identified the 30 nearest neighbor data points, showing that at later timepoints these 30 nearest neighbor data points were almost all attributable to the same organoid (new Supplementary Figure S3 E/F). The only exceptions were experiments that lacked in-between timepoints (E007 and E002), which misaligned the organoids in the reduced space and confounded the nearest neighbor analysis.
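
      The same-organoid neighbor analysis can be illustrated in the same spirit. The sketch below assumes each data point carries an organoid ID; the function name, coordinates, and IDs are invented for illustration and do not come from the manuscript.

```python
import math
from typing import List, Sequence


def same_organoid_fraction(
    points: Sequence[Sequence[float]],
    organoid_ids: Sequence[str],
    k: int = 30,
) -> List[float]:
    """For each data point, the fraction of its k nearest neighbors
    that belong to the same organoid."""
    fractions = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        nearest = order[:k]
        same = sum(1 for j in nearest if organoid_ids[j] == organoid_ids[i])
        fractions.append(same / len(nearest))
    return fractions


# Toy data: two organoids, three timepoints each, far apart in feature space.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.2)]
organoid_ids = ["A", "A", "A", "B", "B", "B"]

fractions = same_organoid_fraction(points, organoid_ids, k=2)
```

      Fractions near 1 mean that a point's closest neighbors are other timepoints of the same organoid.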

      We have included the respective new Figures and new Supplementary Figures and linked them in the main manuscript.

      Statistical Reporting and Significance: The manuscript focuses on F1-score as the metric to report accuracy over time, which is appropriate. However, it's not explicitly stated whether any statistical significance tests were performed on the differences between methods (e.g., CNN vs human, CNN vs classical ML). Suggestion: The authors could report statistical significance of the performance differences, perhaps using a permutation test or McNemar's test on predictions. For example, is the improvement of the CNN ensemble over the Random Forest/QDA classifier statistically significant across experiments? Given the n of organoids, this should be assessable. Demonstrating significance would add rigor to the analysis.

      Response: We thank the reviewer for this helpful suggestion. Following the recommendation, we quantified per-experiment differences in predictive performance by calculating the area under the F1-score curves (AUC) for each classifier and experiment. We then compared methods using paired Wilcoxon signed-rank tests across experiments, with Holm-Bonferroni correction for multiple comparisons. This analysis confirmed that the CNN consistently and significantly outperformed the baseline models and classical machine learning classifiers in validation and test organoids. For RPE area and lens sizes in test organoids, the CNNs performed notably better than the machine learning classifiers, but this difference was not statistically significant. In summary, these tests add the requested statistical rigor to our analysis. The results are now provided in the Supplementary Material as Supplementary File 1.
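
      As a worked illustration of this testing procedure, the sketch below computes the area under an F1-over-time curve, an exact two-sided Wilcoxon signed-rank p-value, and Holm-adjusted p-values using only the standard library. All numbers are synthetic and all function names are hypothetical; in practice one would more likely rely on `scipy.stats.wilcoxon` and `statsmodels.stats.multitest.multipletests`, and the authors' own implementation may differ.

```python
from itertools import product
from typing import Dict, Sequence


def f1_curve_auc(timepoints: Sequence[float], f1_scores: Sequence[float]) -> float:
    """Trapezoidal area under an F1-over-time curve."""
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for t0, t1, y0, y1 in zip(timepoints, timepoints[1:], f1_scores, f1_scores[1:])
    )


def wilcoxon_signed_rank_exact(x: Sequence[float], y: Sequence[float]) -> float:
    """Exact two-sided Wilcoxon signed-rank p-value for paired samples.

    Zero differences are dropped and tied absolute differences get average
    ranks. All sign assignments are enumerated, so use only for few pairs.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2.0 + 1.0  # average rank of the tie group
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    count = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            count += 1
    return count / 2 ** n


def holm_bonferroni(p_values: Dict[str, float]) -> Dict[str, float]:
    """Holm step-down adjusted p-values, capped at 1 and kept monotone."""
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    adjusted: Dict[str, float] = {}
    running_max = 0.0
    for rank, (name, p) in enumerate(ordered):
        running_max = max(running_max, min(1.0, (m - rank) * p))
        adjusted[name] = running_max
    return adjusted


# Synthetic per-experiment AUCs for two classifiers (six experiments).
auc_cnn = [8.1, 7.9, 8.4, 8.0, 7.7, 8.3]
auc_classical = [7.2, 7.5, 7.6, 7.1, 7.4, 7.0]

example_auc = f1_curve_auc([0, 1, 2], [0.5, 0.7, 0.9])
p_value = wilcoxon_signed_rank_exact(auc_cnn, auc_classical)
adjusted = holm_bonferroni({"a": 0.01, "b": 0.04, "c": 0.03})
```

      With six experiments in which one method is always better, the smallest achievable two-sided p-value is 2/64, which shows how the number of experiments limits the attainable significance.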

      Minor Issues (with Suggestions):

      1. Data Availability: Given the resource-intensive nature of the work, the value to the community will be highest if the data is made publicly available. I understand that this is of course at the behest of the authors and they do mention that they will make the data available upon publication of the manuscript. For the time being, the authors can consider sharing at least a representative subset of the data or the trained model weights. This will allow others to build on their work and test the method in other contexts, amplifying the impact of the study.

      Response: We have now made the repository and raw data public and apologize for this oversight. The link to the GitHub repository is now provided in the manuscript under “Data availability”, while the links to the datasets are contained within the GitHub repository.

      Discussion - Future Directions: The Discussion does a good job of highlighting applications (like guiding molecular analysis). One minor addition could be speculation on using this approach to actively intervene: for example, could one imagine altering culture conditions mid-course for organoids predicted not to form RPE, to see if their fate can be changed? The authors touch on reducing variability by focusing on the window of determination; extending that thought to an experimental test (though not done here) would inspire readers. This is entirely optional, but a sentence or two envisioning how predictive models enable dynamic experimental designs (not just passive prediction) would be a forward-looking note to end on.

      Response: We thank the reviewer for this constructive suggestion. We have expanded the discussion to briefly address how predictive modeling could go beyond passive observation. Specifically, we now discuss that predictive models may enable dynamic interventions, such as altering culture conditions mid-course for organoids predicted not to form RPE, to test whether their developmental trajectory can be redirected. While outside the scope of the present work, this forward-looking perspective emphasizes how predictive modeling could inspire adaptive experimental strategies in future studies.

      I believe with the above clarifications and enhancements - especially regarding generalizability and interpretability - the paper will be suitable for broad readership. The work represents an exciting intersection of developmental biology and AI, and I commend the authors for this contribution.

      Response: We thank the reviewer for the positive assessment and their encouraging remarks regarding the contribution of our work to these fields.

      Novelty and Impact:

      This work fills an important gap in organoid biology and imaging. Previous studies have used deep learning to link imaging with molecular profiles or spatial patterns in organoids, but there remained a "notable gap" in predicting whether and to what extent specific tissues will form in organoids. The authors' approach is novel in applying deep learning to prospectively predict organoid tissue outcomes (RPE and lens) on a per-organoid basis, something not previously demonstrated in retinal organoids. Conceptually, this is a significant advance: it shows that fate decisions in a complex 3D culture model can be predicted well in advance, suggesting the existence of subtle early morphogenetic cues that only a sophisticated model can discern. The findings will be of broad interest to researchers in organoid technology, developmental biology, and biomedical AI.

      Response: We thank the reviewer for this thoughtful and encouraging assessment. We agree that our study addresses an important gap by prospectively predicting tissue outcomes at the single-organoid level, and we appreciate the recognition that this represents a conceptual advance with relevance not only for retinal organoids but also for broader applications in organoid biology, developmental biology, and biomedical AI.

      Methodological Rigor and Technical Quality:

      The study is methodologically solid and carefully executed. The authors gathered a uniquely large dataset under consistent conditions, which lends statistical power to their analyses. They employ rigorous controls: an expert panel provided human predictions as a baseline, and a classical machine learning pipeline using quantitative image-derived features was implemented for comparison. The deep learning approach is well-chosen and technically sound. They use an ensemble of CNN architectures (DenseNet121, ResNet50, and MobileNetV3) pre-trained on large image databases, fine-tuning them on organoid images. The use of image segmentation (DeepLabV3) to isolate the organoid from background is appropriate to ensure the models focus on the relevant morphology. Model training procedures (data augmentation, cross-entropy loss with class balancing, learning rate scheduling, and cross-validation) are thorough and follow best practices. The evaluation metrics (primarily F1-score) are suitable for the imbalanced outcomes and emphasize prediction accuracy in a biologically relevant way. Importantly, the authors separate training, test, and validation sets in a meaningful manner: images of each organoid are grouped to avoid information leakage, and an independent experiment serves as a validation to test generalization. The observation that performance is slightly lower on independent validation experiments underscores both the realism of their evaluation and the inherent heterogeneity between experimental batches. In addition, the study integrates interpretability (using GradientSHAP-based relevance backpropagation) to probe what image features the network uses. Although the relevance maps did not reveal obvious human-interpretable features, the attempt reflects a commendable thoroughness in analysis. Overall, the experimental design, data analysis, and reporting are of high quality, supporting the credibility of the conclusions.
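
      The grouping of images by organoid mentioned above, which prevents frames of one organoid from appearing in both training and evaluation data, can be sketched as follows. The function name, split fraction, and IDs are hypothetical; libraries such as scikit-learn offer equivalent utilities (e.g. `GroupShuffleSplit`), and the authors' actual splitting code may differ.

```python
import random
from typing import Dict, List, Tuple


def split_by_organoid(
    image_to_organoid: Dict[str, str],
    test_fraction: float = 0.2,  # hypothetical value
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Split image IDs so that all images of one organoid land in the same set."""
    organoids = sorted(set(image_to_organoid.values()))
    rng = random.Random(seed)
    rng.shuffle(organoids)
    n_test = max(1, round(test_fraction * len(organoids)))
    test_organoids = set(organoids[:n_test])
    train = [img for img, org in image_to_organoid.items() if org not in test_organoids]
    test = [img for img, org in image_to_organoid.items() if org in test_organoids]
    return train, test


# Toy dataset: five organoids with three timepoints each.
image_to_organoid = {
    f"org{o}_t{t}": f"org{o}" for o in range(5) for t in range(3)
}
train_images, test_images = split_by_organoid(image_to_organoid)

train_organoids = {image_to_organoid[i] for i in train_images}
test_organoids = {image_to_organoid[i] for i in test_images}
```

      Splitting at the image level instead would place near-identical consecutive frames on both sides of the split and inflate the reported scores.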

      Response: We thank the reviewer for their very positive and detailed assessment. We appreciate the recognition of our efforts to ensure methodological rigor and reproducibility, and we agree that interpretability remains an important but challenging area for future work.

      Reviewer #3 (Significance (Required)):

      Scientific Significance and Conceptual Advances:

      Biologically, the ability to predict organoid outcomes early is quite significant. It means researchers can potentially identify when and which organoids will form a given tissue, allowing them to harvest samples at the right moment for molecular assays or to exclude organoids that will not form the desired structure. The manuscript's results indicate that RPE and lens fate decisions in retinal organoids are made much earlier than visible differentiation, with predictive signals detectable as early as ~11 hours for RPE and ~4-5 hours for lens. This suggests a surprising synchronization or early commitment in organoid development that was not previously appreciated. The authors' introduction of deep learning-derived determination windows refines the concept of a developmental "point of no return" for cell fate in organoids. Focusing on these windows could help in pinpointing the molecular triggers of these fate decisions. Another conceptual advance is demonstrating that non-invasive imaging data can serve a predictive role akin to (or better than) destructive molecular assays. The study highlights that classical morphology metrics and even expert eyes capture mainly recognition of emerging tissues, whereas the CNN detects subtler, non-intuitive features predictive of future development. This underlines the power of deep learning to uncover complex phenotypic patterns that elude human analysis, a concept that could be extended to other organoid systems and developmental biology contexts. In sum, the work not only provides a tool for prediction but also contributes conceptual insights into the timing of cell fate determination in organoids.

      Response: We thank the reviewer for this thoughtful and positive assessment. We agree that the determination windows provide a valuable framework to study early fate decisions in organoids, and we have emphasized this point in the discussion to highlight the biological significance of our findings.

      Strengths:

      The combination of high-resolution time-lapse imaging with advanced deep learning is innovative. The authors effectively leverage AI to solve a biological uncertainty problem, moving beyond qualitative observations to quantitative predictions. The study uses a remarkably large dataset (1,000 organoids, >100k images), which is a strength as it captures variability and provides robust training data. This scale lends confidence that the model isn't overfit to a small sample. By comparing deep learning with classical machine learning and human predictions, the authors provide context for the model's performance. The CNN ensemble consistently outperforms both the classical algorithms and human experts, highlighting the value added by the new method. The deep learning model achieves high accuracy (F1 > 0.85) at impressively early time points. The fact that it can predict lens formation just ~4.5 hours into development with confidence is striking. Performance remained strong and exceeded human capability at all assessed times. Key experimental and analytical steps (segmentation, cross-validation between experiments, model calibration, use of appropriate metrics) are executed carefully. The manuscript is transparent about training procedures and even provides source code references, enhancing reproducibility. The manuscript is generally well-written with a logical flow from the problem (organoid heterogeneity) to the solution (predictive modeling) and clear figures referenced.

      Response: We thank the reviewer for this very positive and encouraging assessment of our study, particularly regarding the scale of our dataset, the methodological rigor, and the reproducibility of our approach.

      Weaknesses and Limitations:

      Generalizability Across Batches/Conditions: One limitation is the variability in model performance on organoids from independent experiments. The CNN did slightly worse on a validation set from a separate experiment, indicating that differences in the experimental batch (e.g., slight protocol or environmental variations) can affect accuracy. This raises the question of how well the model would generalize to organoids generated under different protocols or by other labs. While the authors do employ an experiment-wise cross-validation, true external validation (on a totally independent dataset or a different organoid system) would further strengthen the claim of general applicability.

      Response: We thank the reviewer for this important point. We agree that generalizability across batches and experimental conditions is a key consideration. We have carefully revised the discussion to explicitly address this limitation and to highlight the variability observed between independent experiments.

      Interpretability of the Predictions: Despite using relevance backpropagation, the authors were unable to pinpoint clear human-interpretable image features that drive the predictions. In other words, the deep learning model remains somewhat of a "black box" in terms of what subtle cues it uses at early time points. This limits the biological insight that can be directly extracted regarding early morphological indicators of RPE or lens fate. It would be ideal if the study could highlight specific morphological differences (even if minor) correlated with fate outcomes, but currently those remain elusive.

      Response: We thank the reviewer for raising this important point. Indeed, while our models achieved robust predictive performance, the underlying morphological cues remained difficult to interpret using relevance backpropagation. We believe this limitation reflects both the subtlety of the early predictive signals and the complexity of the features captured by deep learning models, which may not correspond to human-intuitive descriptors. We have clarified this limitation in the Discussion and Supplementary Note 1 and emphasize that further methodological advances in interpretability, or integration with complementary molecular readouts, will be essential to uncover the precise morphological correlates of fate determination.

      Scope of Outcomes: The study focuses on two particular tissues (RPE and lens) as the outcomes of interest. These were well-chosen as examples (one induced, one spontaneous), but they do not encompass the full range of retinal organoid fates (e.g., neural retina layers). It's not a flaw per se, but it means the platform as presented is specialized. The method might need adaptation to predict more complex or multiple tissue outcomes simultaneously.

      Response: We agree with the reviewer that our study focuses on two specific tissues, RPE and lens, which served as proof-of-concept outcomes representing both induced and spontaneous differentiation events. While this scope is necessarily limited, we believe it demonstrates the general feasibility of our approach. We now provide additional experiments showing that even abstract morphological classes can be predicted well. We have also clarified in the Discussion that the same framework could, in principle, be extended to additional retinal fates such as neural retina layers, or even to multi-label prediction tasks, provided appropriate annotations are available. Such extensions will be an important next step to broaden the applicability of our platform.

      Requirement of Large Data and Annotations: Practically, the approach required a very large imaging dataset and extensive manual annotation: each organoid's RPE and lens outcome, plus manual masking for training the segmentation model. This is a substantial effort that may be challenging to reproduce widely. The authors suggest that perhaps ~500 organoids might suffice to achieve similar results, but the data requirement is still high. Smaller labs or studies with fewer organoids might not immediately reap the full benefits of this approach without access to such imaging throughput.

      Response: We thank the reviewer for highlighting this important point. We agree that the generation of a large imaging dataset and the associated annotations represent a substantial investment of time and resources. At the same time, we consider this effort highly relevant, as it reflects the intrinsic heterogeneity of organoid systems rather than technical artifacts, and therefore ensures robust model training. We have clarified this limitation in the discussion. While our full dataset included ~1,000 organoids, our downsampling analysis suggests that as few as ~500 organoids may already be sufficient to reproduce the key findings, which we believe makes the approach feasible for many organoid systems (compare new Supplementary Note 1). Moreover, as we outline in the Discussion, future refinements such as combining image- and tabular-based features or incorporating fluorescence data could further enhance predictive power and reduce annotation effort.

      Medaka Fish vs. Other Systems: The retinal organoids in this study appear to be from medaka fish, whereas much organoid research uses human iPSC-derived organoids. It is not fully clear from the manuscript how the findings translate to mammalian or human organoids. If there are species-specific differences, the applicability to human retinal organoids (which are important for disease modeling) might need discussion. This is a minor point if the biology is conserved, but worth noting as a potential limitation.

      Response: We thank the reviewer for pointing out this important consideration. We have now explicitly clarified in the Discussion that our proof-of-concept study was performed in medaka organoids, which offer high reproducibility and rapid development. While species-specific differences may exist, the predictive framework is not inherently restricted to medaka and should, in principle, be transferable to mammalian or human iPSC/ESC-derived organoids, provided sufficiently annotated datasets are available. We have amended the Discussion accordingly.

      Predicting Tissue Size is Harder: The model's accuracy in predicting how much tissue (relative area) an organoid will form, while good, is notably lower than for simply predicting presence/absence. Final F1 scores for size classes (~0.7) indicate moderate success. This implies that quantitatively predicting organoid phenotypic severity or extent is more challenging, perhaps due to more continuous variation in size. The authors do acknowledge the lower accuracy for size and treat it carefully.

      Response: We thank the reviewer for this observation and agree with their interpretation. We have already acknowledged in the manuscript that predicting tissue size is more challenging than predicting tissue presence/absence, and we believe we have treated these results with appropriate caution in the revised version of the manuscript.

      Latency vs. Determination: While the authors narrow down the time window of fate determination, it remains somewhat unclear whether the times at which the model reaches high confidence truly correspond to the biological "decision point" or are just the earliest detection of its consequences. The manuscript discusses this caveat, but it's an inherent limitation that the predictive time point might lag the actual internal commitment event. Further work might be needed to link these predictions to molecular events of commitment.

      Response: We agree with the reviewer. As noted in the Discussion, the time points identified by our models likely reflect the earliest detectable morphological consequences of fate determination, rather than the exact molecular commitment events themselves. Establishing a direct link between predictive signals and underlying molecular mechanisms will require future experimental work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This manuscript by Afting et al. addresses the challenge of heterogeneity in retinal organoid development by using deep learning to predict eventual tissue outcomes from early-stage images. The central hypothesis is that deep learning can forecast which tissues an organoid will form (specifically retinal pigmented epithelium, RPE, and lens) well before those tissues become visibly apparent. To test this, the authors assembled a large-scale time-lapse imaging dataset of ~1,000 retinal organoids (~100,000 images) with expert annotations of tissue outcomes. They characterized the variability in organoid morphology and tissue formation over time, focusing on two tissues: RPE (which requires induction) and lens (which appears spontaneously). The core finding is that a deep learning model can accurately predict the emergence and size of RPE and lens in individual organoids at very early developmental stages. Notably, a convolutional neural network (CNN) ensemble achieved high predictive performance (F1-scores ~0.85-0.9) hours before the tissues were visible, significantly outperforming human experts and classical image-analysis-based classifiers. This approach effectively bypasses the issue of stochastic developmental heterogeneity and defines an early "determination window" for fate decisions. Overall, the study demonstrates a proof-of-concept that artificial intelligence can forecast organoid differentiation outcomes non-invasively, which could revolutionize how organoid experiments are analyzed and interpreted.

      Recommendation:

      While this manuscript addresses an important and timely scientific question using innovative deep learning methodologies, it cannot be recommended for acceptance in its present form. The authors must thoroughly address several critical limitations highlighted in this report. In particular, significant issues remain regarding the generalizability of the predictive models across different experimental conditions, the interpretability of deep learning predictions, and the use of Euclidean distance metrics in high-dimensional morphometric spaces, which potentially leads to distorted interpretations of organoid heterogeneity. These revisions are essential for validating the general applicability of their approach and enhancing biological interpretability. After thoroughly addressing these concerns, the manuscript may become suitable for future consideration.

      Major Issues (with Suggestions):

      1. Generalization to Other Batches or Protocols: The drop in performance on independent validation experiments suggests the model may partially overfit to specific experimental conditions. A major concern is how well this approach would work on organoids from a different batch or produced by a slightly different differentiation protocol. Suggestion: The authors should clarify the extent of variability between their "independent experiment" and training data (e.g., were these done months apart, with different cell lines or minor protocol tweaks?). To strengthen confidence in the model's robustness, I recommend testing the trained model on one or more truly external datasets, if available (for instance, organoids generated in a separate lab or under a modified protocol). Even a modest analysis showing the model can be adapted (via transfer learning or re-training) to another dataset would be valuable. If new data cannot be added, the authors should explicitly discuss this limitation and perhaps propose strategies (like domain adaptation techniques or more robust training with diverse conditions) to handle batch effects in future applications.
      2. Biological Interpretation of Early Predictive Features: The study currently concludes that the CNN picks up on complex, non-intuitive features that neither human experts nor conventional analysis could identify. However, from a biological perspective, it would be highly insightful to know what these features are (e.g., subtle texture, cell distribution patterns, etc.). Suggestion: I encourage the authors to delve deeper into interpretability. They might try complementary explainability techniques (for example, occlusion tests where parts of the image are masked to see if predictions change, or activation visualization to see what patterns neurons detect) beyond GradientSHAP. Additionally, analyzing false predictions might provide clues: if the model is confident but wrong for certain organoids, what visual traits did those have? If possible, correlating the model's prediction confidence with measured morphometrics or known markers (if any early marker data exist) could hint at what the network sees. Even if definitive features remain unidentified, providing the reader with any hypothesis (for instance, "the network may be sensing a subtle rim of pigmentation or differences in tissue opacity") would add value. This would connect the AI predictions back to biology more strongly.
      3. Expansion to Other Outcomes or Multi-Outcome Prediction: The focus on RPE and lens is well-justified, but these are two outcomes within retinal organoids. A major question is whether the approach could be extended to predict other cell types or structures (e.g., presence of certain retinal neurons, or malformations) or even multiple outcomes at once. Suggestion: The authors should discuss the generality of their approach. Could the same pipeline be trained to predict, say, photoreceptor layer formation or other features if annotated? Are there limitations (like needing binary outcomes vs. multi-class)? Even if outside the scope of this study, a brief discussion would reassure readers that the method is not intrinsically limited to these two tissues. If data were available, it would be interesting to see a multi-label classification (predict both RPE and lens presence simultaneously) or an extension to other organoid systems in future. Including such commentary would highlight the broad applicability of this platform.
      4. Curse of high dimensionality: Using Euclidean distance in a 165-dimensional morphometric space likely suffers from the curse of dimensionality, which diminishes the meaning of distances as dimensionality increases. In such high-dimensional settings, the range of pairwise distances tends to collapse, undermining the ability to discern meaningful intra- vs. inter-organoid differences. Suggestion: To address this, I would encourage the authors to apply principal component analysis (PCA) in place of (or prior to) tSNE. PCA would reduce the data to a few dominant axes of variation that capture most of the morphometric variance, directly revealing which features drive differences between organoids. These principal components are linear combinations of the original 165 parameters, so one can examine their loadings to identify which morphometric traits carry the most information, yielding interpretable axes of biological variation (e.g., organoid size or shape complexity). In addition, I would like to mention an important cautionary remark regarding tSNE embeddings. tSNE does not preserve global geometry of the data. Distances and cluster separations in a tSNE map are therefore not faithful to the original high-dimensional distances and should be interpreted with caution. See Chari T, Pachter L (2023), The specious art of single-cell genomics, PLoS Comput Biol 19(8): e1011288, for an enlightening discussion in the context of single cell genomics. Chari and Pachter show that extreme dimensionality reduction to 2D can introduce significant distortions in the data's structure, meaning the apparent proximity or separation of points in a tSNE plot may be an artifact of the algorithm rather than a true reflection of morphometric similarity. Implementing PCA would mitigate high-dimensional distance issues by focusing on the most informative dimensions, while also providing clear, quantitative axes that summarize organoid heterogeneity. This change would strengthen the analysis by making the results more robust (avoiding distance artifacts) and biologically interpretable, as each principal component can be traced back to specific morphometric features of interest.
      5. Statistical Reporting and Significance: The manuscript focuses on F1-score as the metric to report accuracy over time, which is appropriate. However, it's not explicitly stated whether any statistical significance tests were performed on the differences between methods (e.g., CNN vs human, CNN vs classical ML). Suggestion: The authors could report statistical significance of the performance differences, perhaps using a permutation test or McNemar's test on predictions. For example, is the improvement of the CNN ensemble over the Random Forest/QDA classifier statistically significant across experiments? Given the n of organoids, this should be assessable. Demonstrating significance would add rigor to the analysis.
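
      As an illustration of the paired test suggested in point 5, a minimal sketch of an exact two-sided McNemar test follows. It assumes one prediction per organoid at a single fixed time point (so that samples are independent), and the function name and toy labels are hypothetical rather than taken from the manuscript.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Returns (b, c, p_value): b counts samples only classifier A got right,
    c counts samples only classifier B got right.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        # No discordant pairs: the two classifiers are indistinguishable.
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy labels: classifier A is right on all 10 organoids,
# classifier B on only the first 2.
y_true = [1] * 10
pred_a = [1] * 10
pred_b = [1] * 2 + [0] * 8
b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
```

      Only the discordant pairs enter the test, which is why it suits a comparison of two classifiers evaluated on the same organoids. Pooling images across time points would violate the independence assumption.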

      Minor Issues (with Suggestions):

      1. Data Availability: Given the resource-intensive nature of the work, the value to the community will be highest if the data is made publicly available. I understand that this is of course at the discretion of the authors, and they do mention that they will make the data available upon publication of the manuscript. For the time being, the authors can consider sharing at least a representative subset of the data or the trained model weights. This will allow others to build on their work and test the method in other contexts, amplifying the impact of the study.
      2. Discussion - Future Directions: The Discussion does a good job of highlighting applications (like guiding molecular analysis). One minor addition could be speculation on using this approach to actively intervene: for example, could one imagine altering culture conditions mid-course for organoids predicted not to form RPE, to see if their fate can be changed? The authors touch on reducing variability by focusing on the window of determination; extending that thought to an experimental test (though not done here) would inspire readers. This is entirely optional, but a sentence or two envisioning how predictive models enable dynamic experimental designs (not just passive prediction) would be a forward-looking note to end on.

      I believe with the above clarifications and enhancements - especially regarding generalizability and interpretability - the paper will be suitable for broad readership. The work represents an exciting intersection of developmental biology and AI, and I commend the authors for this contribution.

      Novelty and Impact:

      This work fills an important gap in organoid biology and imaging. Previous studies have used deep learning to link imaging with molecular profiles or spatial patterns in organoids, but there remained a "notable gap" in predicting whether and to what extent specific tissues will form in organoids. The authors' approach is novel in applying deep learning to prospectively predict organoid tissue outcomes (RPE and lens) on a per-organoid basis, something not previously demonstrated in retinal organoids. Conceptually, this is a significant advance: it shows that fate decisions in a complex 3D culture model can be predicted well in advance, suggesting the existence of subtle early morphogenetic cues that only a sophisticated model can discern. The findings will be of broad interest to researchers in organoid technology, developmental biology, and biomedical AI.

      Methodological Rigor and Technical Quality:

      The study is methodologically solid and carefully executed. The authors gathered a uniquely large dataset under consistent conditions, which lends statistical power to their analyses. They employ rigorous controls: an expert panel provided human predictions as a baseline, and a classical machine learning pipeline using quantitative image-derived features was implemented for comparison. The deep learning approach is well-chosen and technically sound. They use an ensemble of CNN architectures (DenseNet121, ResNet50, and MobileNetV3) pre-trained on large image databases, fine-tuning them on organoid images. The use of image segmentation (DeepLabV3) to isolate the organoid from background is appropriate to ensure the models focus on the relevant morphology. Model training procedures (data augmentation, cross-entropy loss with class balancing, learning rate scheduling, and cross-validation) are thorough and follow best practices. The evaluation metrics (primarily F1-score) are suitable for the imbalanced outcomes and emphasize prediction accuracy in a biologically relevant way. Importantly, the authors separate training, test, and validation sets in a meaningful manner: images of each organoid are grouped to avoid information leakage, and an independent experiment serves as a validation to test generalization. The observation that performance is slightly lower on independent validation experiments underscores both the realism of their evaluation and the inherent heterogeneity between experimental batches. In addition, the study integrates interpretability (using GradientSHAP-based relevance backpropagation) to probe what image features the network uses. Although the relevance maps did not reveal obvious human-interpretable features, the attempt reflects a commendable thoroughness in analysis. Overall, the experimental design, data analysis, and reporting are of high quality, supporting the credibility of the conclusions.

      Significance

      Scientific Significance and Conceptual Advances:

      Biologically, the ability to predict organoid outcomes early is quite significant. It means researchers can potentially identify when and which organoids will form a given tissue, allowing them to harvest samples at the right moment for molecular assays or to exclude organoids that will not form the desired structure. The manuscript's results indicate that RPE and lens fate decisions in retinal organoids are made much earlier than visible differentiation, with predictive signals detectable as early as ~11 hours for RPE and ~4-5 hours for lens. This suggests a surprising synchronization or early commitment in organoid development that was not previously appreciated. The authors' introduction of deep learning-derived determination windows refines the concept of a developmental "point of no return" for cell fate in organoids. Focusing on these windows could help in pinpointing the molecular triggers of these fate decisions. Another conceptual advance is demonstrating that non-invasive imaging data can serve a predictive role akin to (or better than) destructive molecular assays. The study highlights that classical morphology metrics and even expert eyes capture mainly recognition of emerging tissues, whereas the CNN detects subtler, non-intuitive features predictive of future development. This underlines the power of deep learning to uncover complex phenotypic patterns that elude human analysis, a concept that could be extended to other organoid systems and developmental biology contexts. In sum, the work not only provides a tool for prediction but also contributes conceptual insights into the timing of cell fate determination in organoids.

      Strengths:

      The combination of high-resolution time-lapse imaging with advanced deep learning is innovative. The authors effectively leverage AI to solve a biological uncertainty problem, moving beyond qualitative observations to quantitative predictions. The study uses a remarkably large dataset (1,000 organoids, >100k images), which is a strength as it captures variability and provides robust training data. This scale lends confidence that the model isn't overfit to a small sample. By comparing deep learning with classical machine learning and human predictions, the authors provide context for the model's performance. The CNN ensemble consistently outperforms both the classical algorithms and human experts, highlighting the value added by the new method. The deep learning model achieves high accuracy (F1 > 0.85) at impressively early time points. The fact that it can predict lens formation just ~4.5 hours into development with confidence is striking. Performance remained strong and exceeded human capability at all assessed times. Key experimental and analytical steps (segmentation, cross-validation between experiments, model calibration, use of appropriate metrics) are executed carefully. The manuscript is transparent about training procedures and even provides source code references, enhancing reproducibility. The manuscript is generally well-written with a logical flow from the problem (organoid heterogeneity) to the solution (predictive modeling) and clear figures referenced.

      Weaknesses and Limitations:

      Generalizability Across Batches/Conditions: One limitation is the variability in model performance on organoids from independent experiments. The CNN did slightly worse on a validation set from a separate experiment, indicating that differences in the experimental batch (e.g., slight protocol or environmental variations) can affect accuracy. This raises the question of how well the model would generalize to organoids generated under different protocols or by other labs. While the authors do employ an experiment-wise cross-validation, true external validation (on a totally independent dataset or a different organoid system) would further strengthen the claim of general applicability.

      Interpretability of the Predictions: Despite using relevance backpropagation, the authors were unable to pinpoint clear human-interpretable image features that drive the predictions. In other words, the deep learning model remains somewhat of a "black box" in terms of what subtle cues it uses at early time points. This limits the biological insight that can be directly extracted regarding early morphological indicators of RPE or lens fate. It would be ideal if the study could highlight specific morphological differences (even if minor) correlated with fate outcomes, but currently those remain elusive.

      Scope of Outcomes: The study focuses on two particular tissues (RPE and lens) as the outcomes of interest. These were well-chosen as examples (one induced, one spontaneous), but they do not encompass the full range of retinal organoid fates (e.g., neural retina layers). It's not a flaw per se, but it means the platform as presented is specialized. The method might need adaptation to predict more complex or multiple tissue outcomes simultaneously.

      Requirement of Large Data and Annotations: Practically, the approach required a very large imaging dataset and extensive manual annotation: each organoid's RPE and lens outcome, plus manual masking for training the segmentation model. This is a substantial effort that may be challenging to reproduce widely. The authors suggest that perhaps ~500 organoids might suffice to achieve similar results, but the data requirement is still high. Smaller labs or studies with fewer organoids might not immediately reap the full benefits of this approach without access to such imaging throughput.

      Medaka Fish vs. Other Systems: The retinal organoids in this study appear to be from medaka fish, whereas much organoid research uses human iPSC-derived organoids. It is not fully clear from the manuscript how the findings translate to mammalian or human organoids. If there are species-specific differences, the applicability to human retinal organoids (which are important for disease modeling) might need discussion. This is a minor point if the biology is conserved, but worth noting as a potential limitation.

      Predicting Tissue Size is Harder: The model's accuracy in predicting how much tissue (relative area) an organoid will form, while good, is notably lower than for simply predicting presence/absence. Final F1 scores for size classes (~0.7) indicate moderate success. This implies that quantitatively predicting organoid phenotypic severity or extent is more challenging, perhaps due to more continuous variation in size. The authors do acknowledge the lower accuracy for size and treat it carefully.

      Latency vs. Determination: While the authors narrow down the time window of fate determination, it remains somewhat unclear whether the times at which the model reaches high confidence truly correspond to the biological "decision point" or are just the earliest detection of its consequences. The manuscript discusses this caveat, but it's an inherent limitation that the predictive time point might lag the actual internal commitment event. Further work might be needed to link these predictions to molecular events of commitment.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary: Afting et al. present a computational pipeline for analyzing timelapse brightfield images of retinal organoids derived from medaka fish. Their pipeline processes images along two paths: 1) morphometrics (based on computer vision features from skimage) and 2) deep learning. They discovered, through extensive manual annotation of ground truth, that their deep learning method could predict retinal pigmented epithelium and lens tissue emergence at earlier time points than either morphometrics or expert predictions. Our review is formatted according to the Review Commons recommendations.

      Major comments:

      Are the key conclusions convincing?

      Yes, the key conclusion that deep learning outperforms morphometric approaches is convincing. However, several methodological details require clarification. For instance, were the data splitting procedures conducted in the same manner for both approaches? Additionally, the authors note in the methods: "The validation data were scaled to the same range as the training data using the fitted scalers obtained from the training data." This represents a classic case of data leakage, which could artificially inflate performance metrics in traditional machine learning models. It is unclear whether the deep learning model was subject to the same issue. Furthermore, the convolutional neural network was trained with random augmentations, effectively increasing the diversity of the training data. Would the performance advantage still hold if the sample size had not been artificially expanded through augmentation?
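
      The scaling question can be made concrete with a minimal sketch. It assumes simple min-max scaling (the quoted Methods sentence does not name the scaler) and uses hypothetical function names and toy values. In this arrangement the scaling parameters are estimated from the training rows only and then reused unchanged on the validation rows; validation information would enter the fit only if the parameters were instead estimated on the pooled data.

```python
def fit_min_max(train_rows):
    """Estimate per-feature (min, max) from the training rows only."""
    columns = list(zip(*train_rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, params):
    """Scale rows with previously fitted parameters.

    Constant features (max == min) are mapped to 0.0 to avoid division by zero.
    """
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, (lo, hi) in zip(row, params)
        ])
    return scaled


# Hypothetical toy feature table: two features per organoid.
train = [[0.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
valid = [[1.0, 40.0]]

params = fit_min_max(train)                  # validation rows are never seen here
train_scaled = apply_min_max(train, params)
valid_scaled = apply_min_max(valid, params)  # may fall outside [0, 1]
```

      Validation values outside the training range are expected with this procedure and are not clipped in the sketch.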

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Their claims are currently preliminary, pending increased clarity and additional computational experiments described below.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • The authors discretize continuous variables into four bins for classification. However, a regression framework may be more appropriate for preserving the full resolution of the data. At a minimum, the authors should provide a stronger justification for this binning strategy and include an analysis of bin performance. For example, do samples near bin boundaries perform comparably to those near the bin centers? This would help determine whether the discretization introduces artifacts or obscures signals.
      • The relevance backpropagation interpretation analysis is not convincing. The authors argue that the model's use of pixels across the entire image (rather than just the RPE region) indicates that the deep learning approach captures holistic information. However, only three example images are shown out of hundreds, with no explanation for their selection, limiting the generalizability of the interpretation. Additionally, it is unclear how this interpretability approach would work at all in earlier time points, particularly before the model begins making confident predictions around the 8-hour mark. It is also not specified whether the input used for GradSHAP matches the input used during CNN training. The authors should consider expanding this analysis by quantifying pixel importance inside versus outside annotated regions over time. Lastly, Figure 4C is missing a scale bar, which would aid in interpretability.
      • The authors claim that they removed technical artifacts to the best of their ability, but it is unclear if the authors performed any adjustment beyond manual quality checks for contamination. Did the authors observe any illumination artifacts (either within a single image or over time)? Any other artifacts or procedures to adjust?
      • In lines 434-436 the authors state "In this work, we used 1,000 organoids in total, to achieve the reported prediction accuracies. Yet, we suspect that as little as ~500 organoids are sufficient to reliably recapitulate our findings." It is unclear what evidence the authors use to support this claim. The authors could perform a downsampling analysis to determine the tradeoff between performance and sample size.
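
      A downsampling analysis of the kind requested in the last point could be organized as sketched below. This is an illustration under stated assumptions, not the authors' pipeline: a toy nearest-centroid classifier and synthetic two-feature data stand in for the CNN and the organoid images, and all names are hypothetical. In a real analysis the subsampling unit should be the organoid, so that all images of one organoid enter or leave the training set together.

```python
import random
from statistics import mean


def fit_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*xs)] for y, xs in by_label.items()}


def predict(centroids, x):
    """Return the label of the centroid closest to x (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def accuracy(centroids, samples):
    return mean(1.0 if predict(centroids, x) == y else 0.0 for x, y in samples)


def downsampling_curve(train, test, sizes, repeats=20, seed=0):
    """Mean test accuracy for random training subsets of each size."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        scores = []
        for _ in range(repeats):
            subset = rng.sample(train, n)  # sampling without replacement
            scores.append(accuracy(fit_centroids(subset), test))
        curve[n] = mean(scores)
    return curve


def make_toy_data(n, rng):
    """Synthetic stand-in data: two Gaussian classes in two dimensions."""
    data = []
    for i in range(n):
        label = i % 2
        center = 2.0 * label
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return data


rng = random.Random(1)
train = make_toy_data(1000, rng)
test = make_toy_data(500, rng)
curve = downsampling_curve(train, test, sizes=[10, 50, 100, 500, 1000])
```

      Plotting `curve` against the subset size shows where performance saturates, which is the evidence the ~500-organoid claim would need.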

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes, we believe all experiments are realistic in terms of time and resources. We estimate all experiments could be completed in 3-6 months.

      Are the data and the methods presented in such a way that they can be reproduced?

      No, the code is not currently available. We were not able to review the source code.

      Are the experiments adequately replicated and statistical analysis adequate?

      • The experiments are adequately replicated.
      • The statistical analysis (deep learning) is lacking a negative control baseline, which would be helpful to observe if performance is inflated.
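
      One possible form of such a negative control is a label-permutation baseline, sketched here under stated assumptions: binary labels, F1 as the metric, and hypothetical function names and toy labels. The sketch permutes labels at evaluation time only; a stricter control would retrain the model on shuffled labels, which is not shown.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def permutation_baseline(y_true, y_pred, n_permutations=1000, seed=0):
    """Compare the observed F1 with F1 values obtained against shuffled labels.

    Returns (observed_f1, mean_null_f1, p_value).
    """
    rng = random.Random(seed)
    observed = f1_score(y_true, y_pred)
    shuffled = list(y_true)
    null_scores = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any link between labels and predictions
        null_scores.append(f1_score(shuffled, y_pred))
    exceed = sum(1 for s in null_scores if s >= observed)
    # The +1 correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, sum(null_scores) / n_permutations, p_value


# Hypothetical toy labels: 40 organoids, predictions wrong on two of them.
y_true = [1] * 20 + [0] * 20
y_pred = list(y_true)
y_pred[0] = 0
y_pred[20] = 1
observed, null_mean, p_value = permutation_baseline(y_true, y_pred)
```

      The gap between the observed score and the null mean indicates how much of the reported performance exceeds what class balance alone would produce.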

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      Yes.

      Are the text and figures clear and accurate?

      The authors must improve clarity on terminology. For example, they should define "comprehensive dataset" and "significant", and clarify their morphometric feature space. They should also elaborate on what they mean by "confounding factor of heterogeneity".

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • Figure 2C shows a distance, but between what? The y-axis label is likely too terse. The same confusion applies to Figure 2D. Was the distance computed from tSNE coordinates?
      • The authors perform a Herculean analysis comparing dozens of different machine learning classifiers. They select two, but they should provide justification for this decision.
      • It would be good to get a sense for how these retinal organoids grow - are they moving all over the place? They are in Matrigel so maybe not, but are they rotating? Can the authors' approach predict an entire non-emergence experiment? The authors tried to standardize the protocol, but if it still yields this much heterogeneity, then how well the approach will generalize to a different lab is a limitation.
      • The authors should dampen claims throughout. For example, in the abstract they state, "by combining expert annotations with advanced image analysis". The image analysis pipelines use common approaches.
      • The authors state: "the presence of RPE and lenses were disagreed upon by the two independently annotating experts in a considerable fraction of organoids (3.9 % for RPE, 2.9% for lenses).", but it is unclear why there were two independently annotating experts. The supplements say images were split between nine experts for annotation.
      • Details on the image analysis pipeline would be helpful to clarify. For example, why did they choose to measure these 165 morphology features? Which descriptors were used to quantify blur? Did the authors apply blur metrics per FOV or per segmented organoid?
      • The description of the number of images is confusing and distracts from the number of organoids. The number of organoids and number of timepoints used would provide a better description of the data with more value. For example, does this image count include all five z slices?
      • The authors should consider applying a maximum projection across the five z slices (rather than the middle z) as this is a common procedure in image analysis. Why not analyze three-dimensional morphometrics or deep learning features? Might this improve performance further?
      • There is a lot of manual annotation performed in this work, the authors could speculate how this could be streamlined for future studies. How does the approach presented enable streamlining?
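      The difference between the middle-slice and maximum-projection approaches mentioned above can be sketched as follows. This is a minimal illustration on a synthetic five-slice stack, assuming a `(z, y, x)` array layout in which the structures of interest are brighter than the background; the function names are hypothetical and not taken from the manuscript.

```python
import numpy as np


def middle_slice(stack):
    """Return the central z slice of a (z, y, x) image stack."""
    return stack[stack.shape[0] // 2]


def max_projection(stack):
    """Collapse a (z, y, x) stack to 2D by taking the per-pixel maximum over z."""
    return stack.max(axis=0)


# Synthetic stack with five z slices (assumed layout: z, y, x).
stack = np.zeros((5, 4, 4), dtype=np.uint8)
stack[0, 1, 1] = 200  # structure visible only in the first slice
stack[2, 2, 2] = 100  # structure visible in the middle slice

mid = middle_slice(stack)
mip = max_projection(stack)
print(mid[1, 1], mip[1, 1])
```

      The projection retains signal from every slice, whereas the middle slice discards anything outside the central focal plane. For images in which the objects are darker than the background, a minimum projection or a focus-stacking method may be more appropriate.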

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper's advance is technical (providing new methods for organoid quality control) and conceptual (providing proof of concept that earlier time points contain information to predict specific future outcomes in retinal organoids)

      Place the work in the context of the existing literature (provide references, where appropriate).

      • The authors do a good job of placing their work in context in the introduction.
      • The work presents a simple image analysis pipeline (using only the middle z slice) to process timelapse organoid images. So not a 4D pipeline (time and space), just 3D (time). It is likely that more and more of these approaches will be developed over time, and this article is one of the early attempts.
      • The work uses standard convolutional neural networks.

      State what audience might be interested in and influenced by the reported findings.

      • Data scientists performing image-based profiling for time lapse imaging of organoids.
      • Retinal organoid biologists
      • Other organoid biologists who may have long growth times with indeterminate outcomes.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      • Image-based profiling/morphometrics
      • Organoid image analysis
      • Computational biology
      • Cell biology
      • Data science/machine learning
      • Software

      This is a signed review: Gregory P. Way, PhD; Erik Serrano; Jenna Tomkinson; Michael J. Lippincott; Cameron Mattson. Department of Biomedical Informatics, University of Colorado

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study presents predictive modeling for developmental outcome in retinal organoids based on high-content imaging. Specifically, it compares the predictive performance of an ensemble of deep learning models with classical machine learning based on morphometric image features and predictions from human experts for four different tasks: prediction of RPE presence and lens presence (at the end of development) as well as the respective sizes. It finds that the DL model outperforms the other approaches and is predictive from early timepoints on, strongly indicating a time-frame for important decision steps in the developmental trajectory.

      Major comments: I find the paper over-all well written and easy to understand. The findings are relevant (see significance statement for details) and well supported. However, I have some remarks on the description and details of the experimental set-up, the data availability and reproducibility / re-usability of the data.

      1. Some details about the experimental set-up are unclear to me. In particular, it seems like there is a single organoid per well, as the manuscript does not mention any need for instance segmentation or tracking to distinguish organoids in the images and associate them over time. Is that correct? If yes, it should be explicitly stated so. Are there any specific steps in the organoid preparation necessary to avoid multiple organoids per well? Having multiple organoids per well would require the aforementioned image analysis steps (instance segmentation and tracking) and potentially add significant complexity to the analysis procedure, so this information is important to estimate the effort for setting up a similar approach in other organoid cultures (for example cancer organoids, where multiple organoids per well are common / may not be preventable in certain experimental settings).
      2. The terminology used with respect to the test and validation set is contrary to the field, and reporting the results on the test set (should be called validation set), should be avoided since it is used to select models. In more detail: the terms "test set" and "validation set" (introduced in 213-221) are used with the opposite meaning to their typical use in the deep learning literature. Typically, the validation set refers to a separate split that is used to monitor convergence / avoid overfitting during training, and the test set refers to an external set that is used to evaluate the performance of trained models. The study uses these terms in an opposite manner, which becomes apparent from line 624: "best performing model ... judged by the loss of the test set.". Please exchange this terminology, it is confusing to a machine learning domain expert. Furthermore, the performance on the test set (should be called validation set) is typically not reported in graphs, as this data was used for model selection, and thus does not provide an unbiased estimate of model performance. I would remove the respective curves from Figures 3 and 4.
      3. The experimental set-up for the human expert baseline is quite different to the evaluation of the machine learning models. The former is based on the annotation of 4,000 images by seven experts, the latter on cross-validation experiments on a larger dataset. First of all, the details on the human expert labeling procedure are very sparse; I could only find a very short description in the paragraph 136-144, but did not find any further details in the methods section. Please add a methods section paragraph that explains in more detail how the images were chosen, how they were assigned to annotators, and if there was any redundancy in annotation, and if yes how this was resolved / evaluated. Second, the fact that the set-up for human experts and ML models is quite different means that these values are not quite comparable in a statistical sense. Ideally, human estimators would follow the same set-up as in ML (as in, evaluate the same test sets). However, this would likely be prohibitive in the required effort, so I think it's enough to state this fact clearly, for example by adding a comment on this to the captions of Figure 3 and 4.
      4. It is unclear to me where the theoretical time window for the Latent Determination Horizon in Figure 5 (also mentioned in line 350) comes from? Please explain this in more detail and provide a citation for it.
      5. The interpretability analysis (Figure 4, 634-639) based on relevance backpropagation was performed based on DenseNet121 only. Why did you choose this model and not the ResNet / MobileNet? I think it is quite crucial to see if there are any differences between these models, as this would show how much weight can be put on the evidence from this analysis, and I would suggest adding an additional experiment and supplementary figure on this.
      6. The code referenced in the code availability statement is not yet present. Please make it available and ensure a good documentation for reproducibility. Similarly, it is unclear to me what is meant by "The data that supports the findings will be made available on HeiDoc". Does this only refer to the intermediate results used for statistical analysis? I would also recommend to make the image data of this study available. This could for example be done through a dedicated data deposition service such as BioImageArchive or BioStudies, or with less effort via zenodo. This would ensure both reproducibility as well as potential re-use of the data. I think the latter point is quite interesting in this context; as the authors state themselves it is unclear if prediction of the TOIs isn't even possible at an earlier point that could be achieved through model advances, which could be studied by making this data available.
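      The convention described in point 2 (a validation set for monitoring training and selecting models, and a held-out test set used only for reporting) can be combined with grouping by organoid, so that all images of one organoid fall into the same split. The following is a minimal sketch under those assumptions; the function name, split fractions, and image identifiers are hypothetical and do not describe the authors' implementation.

```python
import random
from collections import defaultdict


def split_by_organoid(organoid_of, fractions=(0.7, 0.15), seed=0):
    """Assign whole organoids to train / validation / test splits.

    organoid_of maps an image id to its organoid id. fractions gives the share of
    organoids used for training and validation; the remainder becomes the test set.
    """
    organoids = sorted(set(organoid_of.values()))
    random.Random(seed).shuffle(organoids)
    n_train = round(fractions[0] * len(organoids))
    n_val = round(fractions[1] * len(organoids))

    split_of = {}
    for k, organoid in enumerate(organoids):
        if k < n_train:
            split_of[organoid] = "train"
        elif k < n_train + n_val:
            split_of[organoid] = "validation"  # used for model selection
        else:
            split_of[organoid] = "test"  # used only for final reporting

    splits = defaultdict(list)
    for image_id, organoid in organoid_of.items():
        splits[split_of[organoid]].append(image_id)
    return dict(splits)


# Hypothetical data: 20 organoids imaged at 3 timepoints each.
organoid_of = {f"org{o:02d}_t{t}": f"org{o:02d}" for o in range(20) for t in range(3)}
splits = split_by_organoid(organoid_of)
print({name: len(ids) for name, ids in splits.items()})
```

      Because the split is made at the organoid level, images of the same organoid at different timepoints cannot leak between the data used for model selection and the data used for reporting.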

      Minor comments:

      Line 315: Please add a citation for relevance backpropagation here.

      Line 591: There seems to be a typo: "[...] classification of binary classification [...]"

      Line 608: "[...] where the images of individual organoids served as groups [...]" It is unclear to me what this means.

      Significance

      General assessment: This study demonstrates that (retinal) organoid development can be predicted from early timepoints with deep learning, where these cannot be discerned by human experts or simpler machine learning models. This fact is very interesting in itself due to its implication for organoid development, and could provide a valuable tool for molecular analysis of different organoid populations, as outlined by the authors. The contribution could be strengthened by providing a more thorough investigation of what features in the image are predictive at early timepoints, using a more sophisticated approach than relevance backprop, e.g. Discover (https://www.nature.com/articles/s41467-024-51136-9). This could provide further biological insight into the underlying developmental processes and enhance the understanding of retinal organoid development.

      Advance: similar studies that predict developmental outcome based on image data, for example cell proliferation or developmental outcome, exist. However, to the best of my knowledge, this study is the first to apply such a methodology to organoids, and it convincingly shows its efficacy and argues its potential practical benefits. It thus constitutes a solid technical advance that could be especially impactful if it could be translated to other organoid systems in the future.

      Audience: This research is of interest to a technical audience. It will be of immediate interest to researchers working on retinal organoids, who could adapt and use the proposed system to support experiments by better distinguishing organoids during development. To enable this application, code and data availability should be ensured (see above comments on reproducibility). It is also of interest to researchers in other organoid systems, who may be able to adapt the methodology to different developmental outcome predictions. Finally, it may also be of interest to image analysis / deep learning researchers as a dataset to improve architectures for predictive time series modeling.

      My research background: I am an expert in computer vision and deep learning for biomedical imaging, especially in microscopy. I have some experience developing image analysis for (cancer) organoids. I don't have any experience on the wet lab side of this work.

      Constantin Pape

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02830

      Corresponding author(s): Julien, Sage

      1. General Statements

      We thank the Reviewers for a fair review of our work and helpful suggestions. We have significantly revised the manuscript in response to these suggestions. We provide a point-by-point response to the Reviewers below but wanted to highlight in our response a recurring concern related to the strong cell cycle arrest observed upon the acute FAM53C knock-down being different than the limited phenotypes in other contexts, including the knockout mice and DepMap data.

      First, we now show that we can recapitulate the strong G1 arrest resulting from the FAM53C knock-down using two independent siRNAs in RPE-1 cells, supporting the specificity of the effects.

      Second, the G1 arrest that results from the FAM53C knock-down is also observed in cells with inactive p53, suggesting it is not due to a non-specific stress response due to “toxic” siRNAs. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype.

      Third, we have performed experiments in other human cells, including cancer cell lines. As would be expected for cancer cells, the G1 arrest is less pronounced but is still significant, indicating that the G1 arrest is not unique to RPE-1 cells.

      Fourth, it is not unexpected that compensatory mechanisms would be activated upon loss of FAM53C during development or in cancer – which may explain the lack of phenotypes in vivo or upon long-term knockout. This has been true for many cell cycle regulators, either because of compensation by other family members that have overlapping functions, or by a larger scale rewiring of signaling pathways.

      2. Point-by-point description of the revisions

      __Reviewer #1 (Evidence, reproducibility and clarity (Required)): __

      Summary:

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle. They did so by screening publicly available data from the Cancer Dependency Map, and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays they then show that FAM53 interacts with the DYRK1A kinase to inhibit its function. DYRK1A in turn is known to induce degradation of cyclin D, leading the authors to propose a model in which DYRK1A-dependent cyclin D degradation is inhibited by FAM53C to permit S-phase entry. Finally the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      Major comments:

      The authors show convincing evidence that FAM53C loss can reduce S-phase entry in cell cultures, and that it can bind to DYRK1A. However, FAM53 has multiple other binding partners and I am not entirely convinced that negative regulation of DYRK1A is the predominant mechanism to explain its effects on S-phase entry. Some of the claims that are made based on the biochemical assays, and on the physiological effects of FAM53C are overstated. In addition, some choices made in methodology and data representation need further attention.

      1. The authors do note that P21 levels increase upon FAM53C knock-down. They show convincing evidence that this is not a P53-dependent response. But the claim that "p21 upregulation alone cannot explain the G1 arrest in FAM53C-deficient cells" (line 138-139) is misleading. A p53-independent p21 response could still be highly relevant. The authors could test if FAM53C knockdown inhibits proliferation after p21 knockdown or p21 deletion in RPE1 cells.

      The Reviewer raises a great point. Our initial statement needed to be clarified and also needed more experimental support. We have performed experiments where we knocked down FAM53C and p21 individually, as well as in combination, in RPE-1 cells. These experiments show that p21 knock-down is not sufficient to negate the cell cycle arrest resulting from the FAM53C knock-down in RPE-1 cells (Figure 4B,C and Figure S4C,D).

      We now extended these experiments to conditions where we inhibited DYRK1A, and we also compared these data to experiments in p53-null RPE-1 cells. Altogether, these experiments point to activation of p53 downstream of DYRK1A activation upon FAM53C knock-down, and indicate that p21 is not the only critical p53 target in the cell cycle arrest observed in FAM53C knock-down cells (Figure 4 and Figure S4).

      The authors do not convincingly show that FAM53C acts as a DYRK1A inhibitor in cells. Figures 4B+C and S4B+C show extremely faint P-CycD1 bands, and tiny differences in ratios. The P values are hovering around 0.05, so n=3 is clearly underpowered here. Total CycD1 levels also correlate with FAM53C levels, which seems to affect the ratios more than the tiny pCycD1 bands. Why is there still a pCycD1 band visible in 4B in the GFP + BTZ + DYRK1Ai condition? And if I look at the data points I honestly don't understand how the authors can conclude from S4C that FAM53C knockdown causes (DYRK1A-dependent) increases in pCycD1 (relative to total CycD1). In figure 5C, no blot scans are even shown, and again the differences look tiny. So the authors should either find a way to make these assays more robust, or alter their claims appropriately.

      We appreciate these comments from the Reviewer and have significantly revised the manuscript to address them.

      The analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knock-down, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We removed previous panel 4B from the revised manuscript. For panels 4E and S4B (now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their web site https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      The representative Western blot images for 5C-D (now 5F-G) in the original submission are shown in Figure 5E, we apologize if this was not clear. The differences are small, which we acknowledge in the revised manuscript. Note that several factors can affect Cyclin D levels in cells, including the growth rate and the stage of the cell cycle. Our FACS analysis shows that normal organoids have ~63% of cells in G1 and ~13% in S phase; the overall lower proportion of S-phase cells in organoids may make the immunoblot difference appear smaller, with fewer cycling cells resulting in decreased Cyclin D phosphorylation.

      Nevertheless, the Reviewer brings up a good point and comments from this Reviewer and the others made us re-think how to best interpret our results. As discussed above, we re-read carefully the Meyer paper and think that FAM53C’s role and DYRK1A activity in cells may be understood when considering levels of both CycD and p21 at the same time in a continuum. While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is likely that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      The experiments to test if DYRK1A inhibition could rescue the G1 arrest observed upon FAM53C knockdown are not entirely convincing either. It would be much more convincing if they also perform cell counting experiments as they have done in Figures 1F and 1G, to complement the flow cytometry assays. I suggest that the authors do these cell counting experiments in RPE1 +/- P53 cells as well as HCT116 cells. In addition, did the authors test if P21 is induced by DYRK1Ai in HCT116 cells?

      We repeated the experiments with the DYRK1A inhibitor and counted the cells. In p53-null RPE-1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide. Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells.

      The data in Figure 5C and 5D are identical, although they are supposed to represent either pCycD1 ratios or p21 levels. This is a problem because at least one of the two cannot be true. Please provide the proper data and show (representative) images of both data types.

      We apologize for these duplicated panels in the original submission. We now replaced the wrong panel with the correct data (Fig. 5F,G).

      Line 246: "Fam53c knockout mice display developmental and behavioral defects." I don't agree with this claim. The mutant mice are born at almost the expected Mendelian ratios, and the body weight development is not consistently altered. But more importantly, no differences in adult survival or microscopic pathology were seen. The authors put strong emphasis on the IMPC behavioral analysis, but they should be more cautious. The IMPC mouse cohorts are tested for many other phenotypes related to behavior and neurological symptoms and apparently none of these other traits were changed in the IMPC Fam53c-/- cohort. Thus, the decreased exploration in a new environment could very well be a chance finding. The authors need to take away claims about developmental and behavioral defects from the abstract, results and discussion sections; the data are just too weak to justify this.

      We agree with the Reviewer that, although we observed significant p-values, this original statement may not be appropriate in the biological sense. We made sure in the revised manuscript to carefully present these data.

      Minor comments:

      Can the authors provide a rationale for each of the proteins they chose to generate the list of the 38 proteins in the DepMap analysis? I looked at the list and it seems to me that they do not all have described functions in the G1/S transition. The analysis may thus be biased.

      To address this point, we updated Table S1 (2nd tab) to provide a better rationale for the 38 factors chosen. Our focus was on the canonical RB pathway and we included RB binding proteins whose function had suggested they may also be playing a role in the G1/S transition. We do agree that there is some bias in this selection (e.g., there are more RB binding factors described) but we hope the Reviewer will agree with us that this list and the subsequent analysis identified expected factors, including FAM53C. Future studies using this approach and others will certainly identify new regulators of cell cycle progression.

      Figure 1B is confusing to me. Are these just some (arbitrarily) chosen examples? Consider leaving this heatmap out altogether, or explain in more detail.

      We agree with the Reviewer that this panel was not necessarily useful and possibly in the wrong place, and we removed it from the manuscript. We replaced it with a cartoon of top hits in the screen.

      The y-axes in Figures 2C, 2D, 2E, and 4D are misleading because they do not start at 0. Please let the axis start at 0, or make axis breaks.

      We re-graphed these panels.

      Line 229: " Consequences ... brain development." This subheader is misleading, because the in vitro cortical organoid system is a rather simplistic model for brain development, and far away from physiological brain development. Please alter the header.

      We changed the header to “Consequences of FAM53C inactivation in human cortical organoids in culture”.

      Figure S5F: the gating strategy is not clear to me. In particular, how do the authors know the difference between subG1 and G1 DAPI signals? Do they interpret the subG1 as apoptotic cells? If yes, why are there so many? Are the culturing or harvesting conditions of these organoids suboptimal? Perhaps the authors could consider doing IF stainings on EdU or BrdU on paraffin sections of organoids to obtain cleaner data?

      Thank you for your feedback. The subG1 population in the original Figure S5F represents cells that died during the dissociation step of the organoids for FACS analysis. To address this point, we performed live & dead staining to exclude dead cells and provide clearer data. We refined the gating strategy for better clarity in the new S5F panel.

      Figure S6A; the labeling seems incorrect. I would think that red is heterozygous here, and grey mutant.

      We fixed this mistake, thank you.

      __Reviewer #1 (Significance (Required)): __

      The finding that the poorly studied gene FAM53C controls the G1/S transition in cell lines is novel and interesting for the cell cycle field. However, the lack of phenotypes in Fam53c-/- mice makes this finding less interesting for a broader audience. Furthermore, the mechanisms are incompletely dissected. The importance of a p53-independent induction of p21 is not ruled out. And while the direct inhibitory interaction between FAM53C and DYRK1A is convincing (and also reported by others; PMID: 37802655), the authors do not (yet) convincingly show that DYRK1A inhibition can rescue a cell proliferation defect in FAM53C-deficient cells.

      Altogether, this study can be of interest to basic researchers in the cell cycle field.

      I am a cell biologist studying cell cycle fate decisions, and adaptation of cancer cells & stem cells to (drug-induced) stress. My technical expertise aligns well with the work presented throughout this paper, although I am not familiar with biolayer interferometry.

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)): __

      Summary

      In this study Hammond et al. investigated the role of Dual-specificity Tyrosine Phosphorylation regulated Kinase 1A (DYRK1) in G1/S transition. By exploiting the Dependency Map portal, they identified a previously unexplored protein FAM53C as potential regulator of G1/S transition. Using RNAi, they confirmed that depletion of FAM53C suppressed proliferation of human RPE1 cells and that this phenotype was dependent on the presence of the RB protein. In addition, they noted increased level of CDKN1A transcript and p21 protein that could explain G1 arrest of FAM53C-depleted cells but surprisingly, they did not observe activation of other p53 target genes. Proteomic analysis identified DYRK1 as one of the main interactors of FAM53C and the interaction was confirmed in vitro. Further, they showed that purified FAM53C blocked the ability of DYRK1 to phosphorylate cyclin D in vitro although the activity of DYRK1 was likely not inhibited (judging from the modification of FAM53C itself). Instead, it seems more likely that FAM53C competes with cyclin D in this assay. Authors claim that the G1 arrest caused by depletion of FAM53C was rescued by inhibition of DYRK1 but this was true only in cells lacking functional p53. This is quite confusing as DYRK1 inhibition reduced the fraction of G1 cells in p53 wild type cells as well as in p53 knock-outs, suggesting that FAM53C may not be required for regulation of DYRK1 function. Instead of focusing on the impact of FAM53C on cell cycle progression, authors moved towards investigating its potential (and perhaps more complex) roles in differentiation of IPSCs into cortical organoids and in mice. They observed a lower level of proliferating cells in the organoids but if that reflects an increased activity of DYRK1 or if it is just an off target effect of the genetic manipulation remains unclear. Even less clear is the phenotype in FAM53C knock-out mice. Authors did not observe any significant changes in survival nor in organ development but they noted some behavioral differences. Whether and how these are connected to the rate of cellular proliferation was not explored. In summary, the study identified previously unknown role of FAM53C in proliferation but failed to explain the mechanism and its physiological relevance at the level of tissues and organism. Although some of the data might be of interest, in its current form the data is too preliminary to justify publication.

      Major points

      1. The whole study is based on one siRNA to Fam53C and its specificity was not validated. Level of the knock down was shown only in the first figure and not in the other experiments. The observed phenotypes in the cell cycle progression may be affected by variable knock-down efficiency and/or potential off target effects.

      We thank the Reviewer for raising this important point. First, we need to clarify that our experiments were performed with a pool of siRNAs (not one siRNA). Second, commercial antibodies against FAM53C are not of the best quality and it has been challenging to detect FAM53C using these antibodies in our hands – the results are often variable. In addition, to better address the Reviewer’s point and control for the phenotypes we have observed, we performed two additional series of experiments: first, we have confirmed G1 arrest in RPE-1 cells with individual siRNAs, providing more confidence for the specificity of this arrest (Fig. S1B); second, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (Fig. S1E,F and Fig. 4F).

      Experiments focusing on the cell cycle progression were done in a single cell line RPE1 that showed a strong sensitivity to FAM53C depletion. In contrast, phenotypes in IPSCs and in mice were only mild suggesting that there might be large differences across various cell types in the expression and function of FAM53C. Therefore, it is important to reproduce the observations in other cell types.

      As mentioned above, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (three cancer cell lines) (Fig. S1E,F and Fig. 4F).

      Authors state that FAM53C is a direct inhibitor of DYRK1A kinase activity (Line 203), however this model is not supported by the data in Fig 4A. FAM53C seems to be a good substrate of DYRK1 even at high concentrations when phosphorylation of cyclin D is reduced. It rather suggests that DYRK1 is not inhibited by FAM53C but perhaps FAM53C competes with cyclin D. Further, authors should address if the phosphorylation of cyclin D is responsible for the observed cell cycle phenotype. Is this Cyclin D-Thr286 phosphorylation, or are there other sites involved?

      We revised the text of the manuscript to include the possibility that FAM53C could act as a competitive substrate and/or an inhibitor.

      We removed most of the Cyclin D phosphorylation/stability data from the revised manuscript. As the Reviewers pointed out, some of these data were statistically significant but the biological effects were small. As discussed above in our response to Reviewer #1, the analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knock-down, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We note, however, that we used specific Thr286 phospho-antibodies, which have been used extensively in the field. Our data in Figure 1 with palbociclib place FAM53C upstream of Cyclin D/CDK4,6. We performed Cyclin D overexpression experiments but RPE-1 cells did not tolerate high expression of Cyclin D1 (T286A mutant) and we have not been able to conduct more ‘genetic’ studies.

      At many places, information on statistical tests is missing and SDs are not shown in the plots. For instance, what statistics was used in Fig 4C? Impact of FAM53C on cyclin D phosphorylation does not seem to be significant. In the same experiment, does DYRK1 inhibitor prevent modification of cyclin D?

      As discussed above, we removed some of these data and re-focused the manuscript on p53-p21 as a second pathway activated by loss of FAM53C.

      Validation of SM13797 compound in terms of specificity to DYRK1 was not performed.

      This is an important point. We had cited an abstract from the company (Biosplice) but we agree that providing data is critical. We have now revised the manuscript with a new analysis of the compound’s specificity using kinase assays. These data are shown in Fig. S3F-H.

      A fraction of cells in G1 is a very easy readout but it does not measure progression through the G1 phase. Extension of the S phase or G2 delay would indirectly also result in reduction of the G1 fraction. Instead, authors could measure the dynamics of entry to S phase in cells released from a G1 block or from mitotic shake off.

      The Reviewer made a good point. As discussed in our response to Reviewer #1, with p53-null RPE-1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide. Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells. These data indicate that G1 entry by flow cytometry will not always translate into proliferation.

      Other points:

      Fig. 2C, 2D, 2E graphs should begin with 0

      We remade these graphs.

      Fig. 5D shows that the difference in p21 levels is not significant in FAM53C-KO cells but difference is mentioned in the text.

      We replaced the panel by the correct panel; we apologize for this error.

      Fig. 6D comparison of datasets of extremely different sizes does not seem to be appropriate

      We agree and revised the text. We hope that the Reviewer will agree with us that it is worth showing these data, which are clearly preliminary but provide evidence of a possible role for FAM53C in the brain.

      Could there be alternative splicing in mice generating a partially functional protein without exon 4? Did authors confirm that the animal model does not express FAM53C?

      We performed RNA sequencing of mouse embryonic fibroblasts derived from control and mutant mice. We clearly identified fewer reads in exon 4 in the knockout cells, and no other obvious change in the transcript (data not shown). However, immunoblot with mouse cells for FAM53C never worked well in our hands. We made sure to add this caveat to the revised manuscript.

      Reviewer #2 (Significance (Required)):

      Main problem of this study is that the advanced experimental models in IPSCs and mice did not confirm the observations in the cell lines and thus the whole manuscript does not hold together. Although I acknowledge the effort the authors invested in these experiments, the data do not contribute to the main conclusion of the paper that FAM53C/DYRK1 regulates G1/S transition.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper identifies FAM53C as a novel regulator of cell cycle progression, particularly at the G1/S transition, by inhibiting DYRK1A. Using data from the Cancer Dependency Map, the authors suggest that FAM53C acts upstream of the Cyclin D-CDK4/6-RB axis by inhibiting DYRK1A.

      Specifically, their experiments suggest that FAM53C Knockdown induces G1 arrest in cells, reducing proliferation without triggering apoptosis. DYRK1A Inhibition rescues G1 arrest in P53KO cells, suggesting FAM53C normally suppresses DYRK1A activity. Mass Spectrometry and biochemical assays confirm that FAM53C directly interacts with and inhibits DYRK1A. FAM53C Knockout in Human Cortical Organoids and Mice leads to cell cycle defects, growth impairments, and behavioral changes, reinforcing its biological importance.

      Strength of the paper:

      The study introduces a novel cell cycle control signalling module upstream of CDK4/6 in G1/S regulation which could have significant impact. The identification of FAM53C using a depmap correlation analysis is a nice example of the power of this dataset. The experiments are carried out mostly in a convincing manner and support the conclusions of the manuscript.

      Critique:

      1) The experiments rely heavily on siRNA transfections without the appropriate controls. There are so many cases of off-target effects of siRNA in the literature, and specifically for a strong phenotype on S-phase as described here, I would expect to see solid results by additional experiments. This is especially important since the ko mice do not show any significant developmental cell cycle phenotypes. Moreover, FAM53C does not show a strong fitness effect in the depmap dataset, suggesting that it is largely non-essential in most cancer cell lines. For this paper to reach publication in a high-standard journal, I would expect that the authors show a rescue of the S-phase phenotype using an siRNA-resistant cDNA, and show similar S-phase defects using an acute knock out approach with lentiviral gRNA/Cas9 delivery.

      We thank the Reviewer for this comment. Please refer to the initial response to the three Reviewers, where we discuss our use of single siRNAs and our results in multiple cell lines. Briefly, we can recapitulate the G1 arrest upon FAM53C knock-down using two independent siRNAs in RPE-1 cells. We also observe the same G1 arrest in p53 knockout cells, suggesting it is not due to a non-specific stress response. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype. Human cancer cell lines also arrest in G1 upon FAM53C knock-down, not just RPE-1 cells. Finally, we hope the Reviewer will agree with us that compensatory mechanisms are very common in the cell cycle – which may explain the lack of phenotypes in vivo or upon long-term knockout of FAM53C.

      2) The S-phase phenotype following FAM53C should be demonstrated in a larger variety of TP53WT and mutant cell lines. Given that this paper introduces a new G1/S control element, I think this is important for credibility. Ideally, this should be done with acute gRNA/Cas9 gene deletion using a lentiviral delivery system; but if the siRNA rescue experiments work and validate an on-target effect, siRNA would be an appropriate alternative.

      We now show data with three cancer cell lines (U2OS, A549, and HCT-116 – Fig. S1E,F and Fig. 4F), in addition to our results in RPE-1 cells and in human cortical organoids. We note that the knock-down experiments are complemented by overexpression data (Fig. 1G-I), by genetic data (our original DepMap screen), and our biochemical data (showing direct binding of FAM53C to DYRK1A).

      3) The western blot images shown in the MS appear heavily over-processed and saturated (See for example S4B, 4A, B, and E). Perhaps the authors should provide the original un-processed data of the entire gels?

      For several of our panels (e.g., 4E and S4B, now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their web site https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      Data in 4A are also not a western blot but a radiograph.

      For immunoblots, we will provide all the source data with uncropped blots with the final submission.

      4) A critical experiment for the proposed mechanism is the rescue of the FAM53C S-phase reduction using DYRK1A inhibition shown in Figure 4. The legend here states that the data were extracted from BrdU incorporation assays, but in Figure S4D only the PI histograms are shown, and the S-phase population is not quantified. The authors should show the BrdU scatterplot and quantify the phenotype using the S-phase population in these plots. G1 measurements from PI histograms are not precise enough to allow for conclusions. Also, why are the intensities of the PI peaks so variable in these plots? Compare, for example, the HCT116 upper and lower panels where the siRNA appears to have caused an increase in ploidy.

      We apologize for the confusion and have fixed these errors. For most of the analyses, we used PI to measure G1 and S-phase entry. We added relevant flow cytometry plots to supplemental figures (Fig. S1G, H, I, as well as Fig. S4E and S4K, and Fig. S5F).

      5) There's an apparent contradiction in how RB deletion rescues the G1 arrest (Figure 2) while p21 seems to maintain the arrest even when DYRK1A is inhibited. Is p21 not induced when FAM53C is depleted in RB ko cells? This should be measured and discussed.

      This comment and comments from the two other Reviewers made us reconsider our model. We carefully re-read the Meyer paper and think that DYRK1A activity may be understood when considering levels of both CycD and p21 at the same time in a continuum (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is obvious that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      Reviewer #3 (Significance (Required)):

      In conclusion, I believe that this MS could potentially be important for the cell cycle field and also provide a new target pathway that could be relevant for cancer therapy. However, the paper has quite a few gaps and inconsistencies that need to be addressed with further experiments. My main worry is that the acute depletion phenotypes appear so strong, while the gene is non-essential in mice and shows only a minor fitness effect in the depmap screens. More convincing controls are necessary to rule out experimental artefacts that misguide the interpretation of the results.

      We appreciate this comment and hope that the Reviewer will agree it is still important to share our data with the field, even if the phenotypes in mice are modest.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This paper identifies FAM53C as a novel regulator of cell cycle progression, particularly at the G1/S transition, by inhibiting DYRK1A. Using data from the Cancer Dependency Map, the authors suggest that FAM53C acts upstream of the Cyclin D-CDK4/6-RB axis by inhibiting DYRK1A.

      Specifically, their experiments suggest that FAM53C Knockdown induces G1 arrest in cells, reducing proliferation without triggering apoptosis. DYRK1A Inhibition rescues G1 arrest in P53KO cells, suggesting FAM53C normally suppresses DYRK1A activity. Mass Spectrometry and biochemical assays confirm that FAM53C directly interacts with and inhibits DYRK1A. FAM53C Knockout in Human Cortical Organoids and Mice leads to cell cycle defects, growth impairments, and behavioral changes, reinforcing its biological importance.

      Strength of the paper:

      The study introduces a novel cell cycle control signalling module upstream of CDK4/6 in G1/S regulation which could have significant impact. The identification of FAM53C using a depmap correlation analysis is a nice example of the power of this dataset. The experiments are carried out mostly in a convincing manner and support the conclusions of the manuscript.

      Critique:

      1. The experiments rely heavily on siRNA transfections without the appropriate controls. There are so many cases of off-target effects of siRNA in the literature, and specifically for a strong phenotype on S-phase as described here, I would expect to see solid results by additional experiments. This is especially important since the ko mice do not show any significant developmental cell cycle phenotypes. Moreover, FAM53C does not show a strong fitness effect in the depmap dataset, suggesting that it is largely non-essential in most cancer cell lines. For this paper to reach publication in a high-standard journal, I would expect that the authors show a rescue of the S-phase phenotype using an siRNA-resistant cDNA, and show similar S-phase defects using an acute knock out approach with lentiviral gRNA/Cas9 delivery.
      2. The S-phase phenotype following FAM53C should be demonstrated in a larger variety of TP53WT and mutant cell lines. Given that this paper introduces a new G1/S control element, I think this is important for credibility. Ideally, this should be done with acute gRNA/Cas9 gene deletion using a lentiviral delivery system; but if the siRNA rescue experiments work and validate an on-target effect, siRNA would be an appropriate alternative.
      3. The western blot images shown in the MS appear heavily over-processed and saturated (See for example S4B, 4A, B, and E). Perhaps the authors should provide the original un-processed data of the entire gels?
      4. A critical experiment for the proposed mechanism is the rescue of the FAM53C S-phase reduction using DYRK1A inhibition shown in Figure 4. The legend here states that the data were extracted from BrdU incorporation assays, but in Figure S4D only the PI histograms are shown, and the S-phase population is not quantified. The authors should show the BrdU scatterplot and quantify the phenotype using the S-phase population in these plots. G1 measurements from PI histograms are not precise enough to allow for conclusions. Also, why are the intensities of the PI peaks so variable in these plots? Compare, for example, the HCT116 upper and lower panels where the siRNA appears to have caused an increase in ploidy.
      5. There's an apparent contradiction in how RB deletion rescues the G1 arrest (Figure 2) while p21 seems to maintain the arrest even when DYRK1A is inhibited. Is p21 not induced when FAM53C is depleted in RB ko cells? This should be measured and discussed.

      Significance

      In conclusion, I believe that this MS could potentially be important for the cell cycle field and also provide a new target pathway that could be relevant for cancer therapy. However, the paper has quite a few gaps and inconsistencies that need to be addressed with further experiments. My main worry is that the acute depletion phenotypes appear so strong, while the gene is non-essential in mice and shows only a minor fitness effect in the depmap screens. More convincing controls are necessary to rule out experimental artefacts that misguide the interpretation of the results.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this study Hammond et al. investigated the role of Dual-specificity Tyrosine Phosphorylation regulated Kinase 1A (DYRK1) in G1/S transition. By exploiting the Dependency Map portal, they identified a previously unexplored protein FAM53C as a potential regulator of G1/S transition. Using RNAi, they confirmed that depletion of FAM53C suppressed proliferation of human RPE1 cells and that this phenotype was dependent on the presence of the RB protein. In addition, they noted an increased level of CDKN1A transcript and p21 protein that could explain G1 arrest of FAM53C-depleted cells but surprisingly, they did not observe activation of other p53 target genes. Proteomic analysis identified DYRK1 as one of the main interactors of FAM53C and the interaction was confirmed in vitro. Further, they showed that purified FAM53C blocked the ability of DYRK1 to phosphorylate cyclin D in vitro although the activity of DYRK1 was likely not inhibited (judging from the modification of FAM53C itself). Instead, it seems more likely that FAM53C competes with cyclin D in this assay. Authors claim that the G1 arrest caused by depletion of FAM53C was rescued by inhibition of DYRK1 but this was true only in cells lacking functional p53. This is quite confusing as DYRK1 inhibition reduced the fraction of G1 cells in p53 wild type cells as well as in p53 knock-outs, suggesting that FAM53C may not be required for regulation of DYRK1 function. Instead of focusing on the impact of FAM53C on cell cycle progression, authors moved towards investigating its potential (and perhaps more complex) roles in differentiation of IPSCs into cortical organoids and in mice. They observed a lower level of proliferating cells in the organoids but whether that reflects an increased activity of DYRK1 or is just an off target effect of the genetic manipulation remains unclear. Even less clear is the phenotype in FAM53C knock-out mice. Authors did not observe any significant changes in survival nor in organ development but they noted some behavioral differences. Whether and how these are connected to the rate of cellular proliferation was not explored. In summary, the study identified a previously unknown role of FAM53C in proliferation but failed to explain the mechanism and its physiological relevance at the level of tissues and organism. Although some of the data might be of interest, in its current form the data is too preliminary to justify publication.

      Major points

      1. Whole study is based on one siRNA to Fam53C and its specificity was not validated. Level of the knock down was shown only in the first figure and not in the other experiments. The observed phenotypes in the cell cycle progression may be affected by variable knock-down efficiency and/or potential off target effects.
      2. Experiments focusing on the cell cycle progression were done in a single cell line RPE1 that showed a strong sensitivity to FAM53C depletion. In contrast, phenotypes in IPSCs and in mice were only mild suggesting that there might be large differences across various cell types in the expression and function of FAM53C. Therefore, it is important to reproduce the observations in other cell types.
      3. Authors state that FAM53C is a direct inhibitor of DYRK1A kinase activity (Line 203), however this model is not supported by the data in Fig 4A. FAM53C seems to be a good substrate of DYRK1 even at high concentrations when phosphorylations of cyclin D is reduced. It rather suggests that DYRK1 is not inhibited by FAM53C but perhaps FAM53C competes with cyclin D. Further, authors should address if the phosphorylation of cyclin D is responsible for the observed cell cycle phenotype. Is this Cyclin D-Thr286 phosphorylation, or are there other sites involved?
      4. At many places, information on statistical tests is missing and SDs are not shown in the plots. For instance, what statistics was used in Fig 4C? Impact of FAM53C on cyclin D phosphorylation does not seem to be significant. In the same experiment, does DYRK1 inhibitor prevent modification of cyclin D?
      5. Validation of SM13797 compound in terms of specificity to DYRK1 was not performed.
      6. A fraction of cells in G1 is a very easy readout but it does not measure progression through the G1 phase. Extension of the S phase or G2 delay would indirectly also result in reduction of the G1 fraction. Instead, authors could measure the dynamics of entry to S phase in cells released from a G1 block or from mitotic shake off.

      Other points

      1. Fig. 2C, 2D, 2E graphs should begin with 0
      2. Fig. 5D shows that the difference in p21 levels is not significant in FAM53C-KO cells but difference is mentioned in the text.
      3. Fig. 6D comparison of datasets of extremely different sizes does not seem to be appropriate
      4. Could there be alternative splicing in mice generating a partially functional protein without exon 4? Did authors confirm that the animal model does not express FAM53C?

      Significance

      Main problem of this study is that the advanced experimental models in IPSCs and mice did not confirm the observations in the cell lines and thus the whole manuscript does not hold together. Although I acknowledge the effort the authors invested in these experiments, the data do not contribute to the main conclusion of the paper that FAM53C/DYRK1 regulates G1/S transition.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle. They did so by screening publicly available data from the Cancer Dependency Map, and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays they then show that FAM53C interacts with the DYRK1A kinase to inhibit its function. DYRK1A in turn is known to induce degradation of cyclin D, leading the authors to propose a model in which DYRK1A-dependent cyclin D degradation is inhibited by FAM53C to permit S-phase entry. Finally the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      Major comments:

      The authors show convincing evidence that FAM53C loss can reduce S-phase entry in cell cultures, and that it can bind to DYRK1A. However, FAM53C has multiple other binding partners and I am not entirely convinced that negative regulation of DYRK1A is the predominant mechanism to explain its effects on S-phase entry. Some of the claims that are made based on the biochemical assays, and on the physiological effects of FAM53C are overstated. In addition, some choices made in methodology and data representation need further attention.

      1. The authors do note that P21 levels increase upon FAM53C loss. They show convincing evidence that this is not a P53-dependent response. But the claim that "p21 upregulation alone cannot explain the G1 arrest in FAM53C-deficient cells" (line 138-139) is misleading. A p53-independent p21 response could still be highly relevant. The authors could test if FAM53C knockdown inhibits proliferation after p21 knockdown or p21 deletion in RPE1 cells.
      2. The authors do not convincingly show that FAM53C acts as a DYRK1A inhibitor in cells. Figures 4B+C and S4B+C show extremely faint P-CycD1 bands, and tiny differences in ratios. The P values are hovering around 0.05, so n=3 is clearly underpowered here. Total CycD1 levels also correlate with FAM53C levels, which seems to affect the ratios more than the tiny pCycD1 bands. Why is there still a pCycD1 band visible in 4B in the GFP + BTZ + DYRK1Ai condition? And if I look at the data points I honestly don't understand how the authors can conclude from S4C that knockdown of FAM53C causes (DYRK1A-dependent) increases in pCycD1 (relative to total CycD1). In figure 5C, no blot scans are even shown, and again the differences look tiny. So the authors should either find a way to make these assays more robust, or alter their claims appropriately.
      3. The experiments to test if DYRK1A inhibition could rescue the G1 arrest observed upon FAM53C knockdown are not entirely convincing either. It would be much more convincing if they also perform cell counting experiments as they have done in Figures 1F and 1G, to complement the flow cytometry assays. I suggest that the authors do these cell counting experiments in RPE1 +/- P53 cells as well as HCT116 cells. In addition, did the authors test if P21 is induced by DYRK1Ai in HCT116 cells?
      4. The data in Figure 5C and 5D are identical, although they are supposed to represent either pCycD1 ratios or p21 levels. This is a problem because at least one of the two cannot be true. Please provide the proper data and show (representative) images of both data types.
      5. Line 246: "Fam53c knockout mice display developmental and behavioral defects." I don't agree with this claim. The mutant mice are born at almost the expected Mendelian ratios, and the body weight development is not consistently altered. But more importantly, no differences in adult survival or microscopic pathology were seen. The authors put strong emphasis on the IMPC behavioral analysis, but they should be more cautious. The IMPC mouse cohorts are tested for many other phenotypes related to behavior and neurological symptoms and apparently none of these other traits were changed in the IMPC Fam53c-/- cohort. Thus, the decreased exploration in a new environment could very well be a chance finding. The authors need to take away claims about developmental and behavioral defects from the abstract, results and discussion sections; the data are just too weak to justify this.

      Minor comments:

      1. Can the authors provide a rationale for each of the proteins they chose to generate the list of the 38 proteins in the DepMap analysis? I looked at the list and it seems to me that they do not all have described functions in the G1/S transition. The analysis may thus be biased.
      2. Figure 1B is confusing to me. Are these just some (arbitrarily) chosen examples? Consider leaving this heatmap out altogether, or explain in more detail.
      3. The y-axes in Figures 2C, 2D, 2E, and 4D are misleading because they do not start at 0. Please let the axis start at 0, or make axis breaks.
      4. Line 229: " Consequences ... brain development." This subheader is misleading, because the in vitro cortical organoid system is a rather simplistic model for brain development, and far away from physiological brain development. Please alter the header.
      5. Figure S5F: the gating strategy is not clear to me. In particular, how do the authors know the difference between subG1 and G1 DAPI signals? Do they interpret the subG1 as apoptotic cells? If yes, why are there so many? Are the culturing or harvesting conditions of these organoids suboptimal? Perhaps the authors could consider doing IF stainings on EdU or BrdU on paraffin sections of organoids to obtain cleaner data?
      6. Figure S6A; the labeling seems incorrect. I would think that red is heterozygous here, and grey mutant.

      Significance

      The finding that the poorly studied gene FAM53C controls the G1/S transition in cell lines is novel and interesting for the cell cycle field. However, the lack of phenotypes in Fam53c-/- mice makes this finding less interesting for a broader audience. Furthermore, the mechanisms are incompletely dissected. The importance of a p53-independent induction of p21 is not ruled out. And while the direct inhibitory interaction between FAM53C and DYRK1A is convincing (and also reported by others; PMID: 37802655), the authors do not (yet) convincingly show that DYRK1A inhibition can rescue a cell proliferation defect in FAM53C-deficient cells.

      Altogether, this study can be of interest to basic researchers in the cell cycle field.

      I am a cell biologist studying cell cycle fate decisions, and adaptation of cancer cells & stem cells to (drug-induced) stress. My technical expertise aligns well with the work presented throughout this paper, although I am not familiar with biolayer interferometry.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Since we are at the stage of simply proposing a Revision Plan to an affiliate journal, there is not a revised version of the manuscript yet. But we sincerely thank the three reviewers for their important input, which we are taking into consideration very seriously.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Major Comments:

      It is an interesting case study, but the main problem with the study is the use of an unsuitable tardigrade model species. It was shown in the past that Hypsibius exemplaris is not a good model species to test tardigrade survival under extreme stress. Of course, results for Hypsibius exemplaris can be published, but all general comments that tardigrades react in this or that way need to be removed from the entire manuscript. This is characteristic only of Hypsibius exemplaris, which is a poor model for studies focused on environmental stress. To present general conclusions, a few different tardigrade species, or at least an appropriate tardigrade species with confirmed high resilience to different kinds of stress, such as Milnesium, Ramazzottius, Paramacrobiotus or similar, must be tested. Based on the present study I can only propose to publish this manuscript as a case study for one poorly stress resistant eutardigrade species, without any general conclusions about other tardigrades. See: Poprawa, I., Bartylak, T., Kulpla, A., Erdmann, W., Roszkowska, M., Chajec, Ł., Kaczmarek, Ł., Karachitos, A. & Kmita, H. (2022) Verification of Hypsibius exemplaris Gąsiorek et al., 2018 (Eutardigrada; Hypsibiidae) application in anhydrobiosis research. PLoS ONE 17(3): e0261485.

      Minor comments:

      1. General comment on the entire manuscript. Please do not start sentences with abbreviations, i.e. The DNA instead of DNA, Caenorhabditis instead of C. etc. In the bibliography many DOI numbers for publications are lacking, and you have different styles of citations. Do not use capital letters for words inside the article title, e.g. "Tardigrades as a Potential Model Organism in Space Research."; change it to "Tardigrades as a potential model organism in space research." Or use capital letters in all citations. Use italics for Latin names of the species and genera. In the figures, please try to arrange the specimens so that they are situated horizontally and in the middle of the figure.
      2. Introduction, Lines 80-96: I do not understand why this section is in the Introduction. This description of the results of the study could be minimal, and details could be moved to the proper chapters.
      3. Results: Results and methods are mixed in this section. Please put all parts in the correct chapters.
      4. Line 227 and 235: Based on what you interpreted: "fully-grown adults" and "juveniles" that they were adult and fully grown? Please explain in the text.
      5. Line 315: You wrote "These findings demonstrate that even a transient exposure to zeocin causes irreversible DNA damage, leading to delayed mortality." but not to all specimens as you marked above.
      6. Line 461-462: You wrote: "In this study, we probed why tardigrades-despite their impressive DNA repair capacity and extremotolerance-still succumb to genotoxic stress." But only one tardigrade species with poor resilience to stress conditions has been tested in this study. What if more repair mechanisms are activated in tardigrades when tardigrades leaving the state of anhydrobiosis? Authors tested only active animals and in such mechanisms maybe not activated or are activated on lower level. What is even more problematic, and what I marked this in one of the first comments, the species used in study is incorrect because is not very resilient to extreme conditions. This species is also a poor anhydrobiotic species with almost zero ability to anhydrobiosis (during which repair mechanisms are activated).
      7. Line 609: "..actively searching for food.." - How you know that they were looking for food? What was a difference between normal crawling around and looking for food?
      8. Line 635: "In sum, tardigrades illustrate that..." - Only in case of Hypsibius. This is not characteristic for tardigrades. See my previous comments. This conclusion is too strong without adequate proof.
      9. Lines 666-667: "Adults measured {greater than or equal to}240 μm in length, while juveniles ranged between 120-180 μm." - Why such measurements? It was connected with something or is it arbitrary? Please explain.
      10. Lines: 673-677: "For each timepoint, fertility was calculated by dividing the total number of eggs laid by the number of live animals at that time (using the last recorded number of live animals when all animals had died). In Fig. 5A-B, fertility is presented as the mean cumulative number of eggs laid per animal over time; in Fig. S9, it is shown as the mean number of eggs laid per animal at each timepoint." - This method of calculating fertility may be valid only if you know that all the females laid the same number of eggs. It is obvious that some females produced less and some others more eggs. Hence, fertility can not be accurately calculated in this way.
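      The calculation quoted in comment 10 can be sketched as follows. This is a minimal illustration that assumes the quoted Methods text is the complete rule; the function names and the counts are hypothetical and are not taken from the manuscript.

```python
from typing import List


def per_capita_fertility(eggs: List[int], alive: List[int]) -> List[float]:
    """Eggs laid at each timepoint divided by the number of live animals.

    When no animals are alive, the last recorded non-zero count is reused,
    as described in the quoted Methods text.
    """
    rates = []
    last_alive = None
    for laid, living in zip(eggs, alive):
        if living > 0:
            last_alive = living
        if last_alive is None:
            raise ValueError("no live animals recorded up to this timepoint")
        rates.append(laid / last_alive)
    return rates


def cumulative(values: List[float]) -> List[float]:
    """Running total, corresponding to the cumulative presentation."""
    total = 0.0
    out = []
    for value in values:
        total += value
        out.append(total)
    return out


# Hypothetical counts for one treatment group (not data from the study).
eggs_per_timepoint = [0, 12, 20, 6, 0]
alive_per_timepoint = [20, 20, 16, 12, 0]

rates = per_capita_fertility(eggs_per_timepoint, alive_per_timepoint)
print(rates)
print(cumulative(rates))
```

      The result is a population mean per timepoint. It carries no information about how egg production was distributed among individual females, which is the limitation raised in the comment.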

      Significance

      The studies described in the manuscript are very interesting for many potential readers; however, the manuscript needs to be reworked as a case study of one tardigrade species, without generalizing the results to all tardigrades. It is very important not to suggest that all tardigrades react in the same way, especially since the species used is not a good candidate for this type of study (see my major comments).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript studies the effects of genotoxic stress using zeocin, a bleomycin-family drug, in the tardigrade species H. exemplaris. In a first experimental set, the authors evaluate the survival of the organisms as well as the levels of DNA damage.

      A RT-qPCR analysis of a set of DNA repair genes identified in a previous study by another group (Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol. 34, Issue 9, 1819-1830.e6) and a comet assay reveal the damage observed during treatment.

      Experiments on fasting animals show variations in animal size that overlap with those seen in groups of animals treated with the genotoxic drug. Physiological variations are also observed, such as lipid loss and cuticle alteration.

      In a subsequent experimental set, the authors indicate that the genotoxic drug blocks DNA replication and activates DNA repair systems in various tissues, particularly the digestive tissue, which appears to be specifically targeted in terms of its replicative capacity following DNA damage caused by the drug. A sensitivity study of tardigrade embryo development then shows that their proliferative capacity, which is highly dependent on replication, mobilizes different sets of DNA repair genes that may be more closely associated with replication than in adults.

      Finally, a comparative study of the development of two organisms (C. elegans and planarian) also shows sensitivity to drugs that disrupt the replication process during development.

      The authors conclude from all of this work that the cells of the animals' intestines are the main target of the genotoxic stress induced by the drug. The effects of disruption of the normal replication process in intestinal cells are thought to be the cause of the observed loss of tissue homeostasis (loss of lipids and tissue renewal capacity).

      Major comments:

      1. Zeocin is a drug derived from bleomycin but has not yet been extensively studied. Could you give examples of the use/validation of zeocin as a radiomimetic in other biological systems?

      2. Similarities in transcriptional responses between UV and dehydration genotoxic stresses have already been observed (Yoshida et al., 2022; BMC Genomics 23, 405) in a tardigrade species closely related to H. exemplaris (R. varieornatus). However, no correlation in transcriptional responses could be observed after treating H. exemplaris with genotoxic stresses such as desiccation and 500 Gy gamma ray irradiation (Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol 34, Issue 9, 1819 - 1830.e6). These results indicate that, depending on the type of genotoxic stress, transcriptomic responses can appear to be very different and sometimes uncorrelated, particularly in the species H. exemplaris. Bleomycin has been studied in previous reports (refs Yoshida Y, et al. Proc Jpn Acad Ser B Phys Biol Sci. 2024 100(7):414-428; Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol 34, Issue 9, 1819 - 1830.e6; Marwan Anoud et al., 2024, eLife 13:RP92621), which used a transcriptomic study to confirm that it behaves as a radiomimetic for the species H. exemplaris.

      On the other hand, since zeocin is a bleomycin-family drug, it is possible that its effects may differ slightly from those of bleomycin, exhibiting specific effects as observed by comparison of chemical radiomimetic and radiation treatments.

      A control experiment comparing the effects of bleomycin and zeocin using RNAseq would validate that their use is equivalent.

      3. A major conclusion of the manuscript is that DNA damage induced by the genotoxic drug disrupts replication mechanisms and leads to the observed effects. Is the induction of the subset of repair genes measured by RT-qPCR caused solely by the drug itself, or by its indirect effect on replication?

      It would be interesting to block replication in embryos and assess whether the same sets of DNA repair genes are induced as with zeocin treatment alone. Additionally, it would be interesting to repeat the same DNA replication block experiments with additional zeocin treatment, to compare the induced sets of DNA repair genes. This would help to identify the effects that are directly attributable to zeocin.

      Minor comments:

      The data are well presented, and the experiments are well described for general understanding. Previous studies in this field have been well referenced. However, the link between DNA damage caused by the drug and its impact on replication needs to be better explained.

      Finally, the use of the drug zeocin should be validated in this system by comparison with bleomycin.

      Significance

      This study evaluates the resistance of a species of tardigrades to genotoxic stress. Several previous studies have conducted this type of experiment using the same species, with consistent results, and using the same type of genotoxic chemical drug: bleomycin. In this study, a new genotoxic drug is evaluated for its effects on DNA damage as well as on the survival of organisms and their embryonic development. Definitive validation experiments for this new genotoxic chemical tool are necessary to determine its similarities with drugs whose effects are already known from the literature.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript concerns tardigrade sensitivity to genotoxic stress. Using the radiomimetic drug Zeocin to induce DNA breaks, the authors show that continuous exposure progressively kills tardigrades, accompanied by striking body shrinkage and lipid depletion. The authors show that germ cells and embryos, with their high proliferation rates, show heightened sensitivity. In summary, their findings pinpoint DNA replication as an Achilles' heel of organismal survival under genotoxic stress.

      Major comments:

      The claims and conclusions in this article are not sufficiently supported by the data. They require additional experiments or analyses.

      The fundamental problem with this paper is the use of a single molecule, Zeocin, as a radiomimetic. It is absolutely essential to compare the results obtained with radiation. In the bibliography, researchers compare a drug with radiation. Bob Goldstein, for example, in his 2024 Current Biology paper uses radiation and bleomycin. The same is true for Concordet in his 2024 eLife paper. Zeocin has been used very little on tardigrades. It cannot be used alone to draw conclusions from this study.

      Additionally, at the beginning of the paper, the authors tested different concentrations of Zeocin. They showed results at two concentrations: 100 µg/ml and 1 mg/ml. In the remainder of the paper, only the latter concentration is used. This is not sufficient. The analyses should have been conducted in parallel at several concentrations in order to compare and analyze a potential dose-dependent effect.

      Finally, the authors focused on two types of cells that have the particularity of replicating themselves: gut cells and storage cells. It would have been necessary to work on other cell types to compare the results.

      Carrying out these additional experiments is entirely realistic.

      The data and methods are presented in a reproducible manner. But experiments sometimes lack independent replicates and need to be reproduced.

      The legend to Figure 1, for example, indicates that the experiment was conducted with 3 to 7 biological replicates and 60 to 120 animals. These are still very different numbers. And this can lead to significant bias.

      For the other figures, no biological replicates were indicated and the sample sizes (n) are sometimes very different, as in Figure 4 with n=107 and n=166. More uniform sample sizes would make the results more robust. And biological replicates are essential.

      Sometimes there are some unclear elements in the figures. In Figure 3, if I understand correctly, A and B show the gut cells (adult) and C and D the storage cells (juvenile). The size difference is not very clear in this image. How old is the juvenile compared to this adult?

      Significance

      This study, if confirmed by additional experiments that are absolutely essential to validate these conclusions, will be interesting for the community of researchers working on tardigrades, even if the effects of genotoxic stress on tardigrades are already widely studied.

      This study is relatively complete on only one molecule, Zeocin, at a concentration of 1 mg/ml. To be relevant, another genotoxic stress should be included in the study. And the study should also be conducted at the concentration of 100 µg/ml, which did show effects but was abandoned for the rest of the study. Similarly, only storage cells and gut cells were studied given their replication capacity. Other cell types should have been included in the study for comparison.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank all the reviewers for their valuable comments and criticisms. We have thoroughly revised the manuscript and the resource to address all the points raised by the reviewers. Below, we provide a point-by-point response for the sake of clarity.

      Reviewer #1

      Evidence, reproducibility and clarity

      Summary: This manuscript, "MAVISp: A Modular Structure-Based Framework for Protein Variant Effects," presents a significant new resource for the scientific community, particularly in the interpretation and characterization of genomic variants. The authors have developed a comprehensive and modular computational framework that integrates various structural and biophysical analyses, alongside existing pathogenicity predictors, to provide crucial mechanistic insights into how variants affect protein structure and function. Importantly, MAVISp is open-source and designed to be extensible, facilitating reuse and adaptation by the broader community.

      Major comments: - While the manuscript is formally well-structured (with clear Introduction, Results, Conclusions, and Methods sections), I found it challenging to follow in some parts. In particular, the Introduction is relatively short and lacks a deeper discussion of the state-of-the-art in protein variant effect prediction. Several methods are cited but not sufficiently described, as if prior knowledge were assumed. OPTIONAL: Extend the Introduction to better contextualize existing approaches (e.g., AlphaMissense, EVE, ESM-based predictors) and clarify what MAVISp adds compared to each.

      We have expanded the introduction on the state-of-the-art of protein variant effects predictors, explaining how MAVISp departs from them.

      - The workflow is summarized in Figure 1(b), which is visually informative. However, the narrative description of the pipeline is somewhat fragmented. It would be helpful to describe in more detail the available modules in MAVISp, and which of them are used in the examples provided. Since different use cases highlight different aspects of the pipeline, it would be useful to emphasize what is done step-by-step in each.

      We have added a concise, narrative description of the data flow for MAVISp, as well as improved the description of modules in the main text. We will integrate the results section with a more comprehensive description of the available modules, and then clarify in the case studies which modules were applied to achieve specific results.

      OPTIONAL: Consider adding a table or a supplementary figure mapping each use case to the corresponding pipeline steps and modules used.

      We have added a supplementary table (Table S2) to guide the reader on the modules and workflows applied for each case study.

      We also added Table S1 to map the toolkit used by MAVISp to collect the data that are imported and aggregated in the webserver for further guidance.

      - The text contains numerous acronyms, some of which are not defined upon first use or are only mentioned in passing. This affects readability. OPTIONAL: Define acronyms upon first appearance, and consider moving less critical technical details (e.g., database names or data formats) to the Methods or Supplementary Information. This would greatly enhance readability.

      We revised the usage of acronyms following the reviewer’s directions, defining them at first appearance.

      • The code and trained models are publicly available, which is excellent. The modular design and use of widely adopted frameworks (PyTorch and PyTorch Geometric) are also strong points. However, the Methods section could benefit from additional detail regarding feature extraction and preprocessing steps, especially the structural features derived from AlphaFold2 models. OPTIONAL: Include a schematic or a table summarizing all feature types, their dimensionality, and how they are computed.

      We thank the reviewer for noticing and praising the availability of the tools of MAVISp. Our MAVISp framework utilizes methods and scores that incorporate machine learning features (such as EVE or RaSP), but does not employ machine learning itself. Specifically, we do not use PyTorch and do not utilize features in a machine learning sense. We do extract some information from the AlphaFold2 models that we use (such as the pLDDT score and their secondary structure content, as calculated by DSSP), and those are available in the MAVISp aggregated csv files for each protein entry and detailed in the Documentation section of the MAVISp website.

      • The section on transcription factors is relatively underdeveloped compared to other use cases and lacks sufficient depth or demonstration of its practical utility. OPTIONAL: Consider either expanding this section with additional validation or removing/postponing it to a future manuscript, as it currently seems preliminary.

      We have removed this section and included a mention in the conclusions as part of the future directions.

      Minor comments: - Most relevant recent works are cited, including EVE, ESM-1v, and AlphaFold-based predictors. However, recent methods like AlphaMissense (Cheng et al., 2023) could be discussed more thoroughly in the comparison.

      We have revised the introduction to accommodate the proper space for this comparison.

      • Figures are generally clear, though some (e.g., performance barplots) are quite dense. Consider enlarging font sizes and annotating key results directly on the plots.

      We have revised Figure 2 and presented only one case study to simplify its readability. We have also changed Figure 3, while retaining the other figures, since they seemed less problematic.

      • Minor typographic errors are present. A careful proofreading is highly recommended. Below are some of the issues I identified:
        - Page 3, line 46: "MAVISp perform" -> "MAVISp performs"
        - Page 3, line 56: "automatically as embedded" -> "automatically embedded"
        - Page 3, line 57: "along with to enhance" -> unclear; please revise
        - Page 4, line 96: "web app interfaces with the database and present" -> "presents"
        - Page 6, line 210: "to investigate wheatear" -> "whether"
        - Page 6, lines 215-216: "We have in queue for processing with MAVISp proteins from datasets relevant to the benchmark of the PTM module." -> unclear sentence; please clarify
        - Page 15, line 446: "Both the approaches" -> "Both approaches"
        - Page 20, line 704: "advantage of multi-core system" -> "multi-core systems"

      We have proofread the entire article, including the points above.

      Significance

      General assessment: the strongest aspects of the study are the modularity, open-source implementation, and the integration of structural information through graph neural networks. MAVISp appears to be one of the few publicly available frameworks that can easily incorporate AlphaFold2-based features in a flexible way, lowering the barrier for developing custom predictors. Its reproducibility and transparency make it a valuable resource. However, while the technical foundation is solid and the effort substantial, the scientific narrative and presentation could be significantly improved. The manuscript is dense and hard to follow in places, with a heavy use of acronyms and insufficient explanation of key design choices. Improving the descriptive clarity, especially in the early sections, would greatly enhance the impact of this work.

      Advance

      to the best of my knowledge, this is one of the first modular platforms for protein variant effect prediction that integrates structural data from AlphaFold2 with bioinformatic annotations and even clinical data in an extensible fashion. While similar efforts exist (e.g., ESMfold, AlphaMissense), MAVISp distinguishes itself through openness and design for reusability. The novelty is primarily technical and practical rather than conceptual.

      Audience

      this study will be of strong interest to researchers in computational biology, structural bioinformatics, and genomics, particularly those developing variant effect predictors or analyzing the impact of mutations in clinical or functional genomics contexts. The audience is primarily specialized, but the open-source nature of the tool may diffuse its use among more applied or translational users, including those working in precision medicine or protein engineering.

      Reviewer expertise: my expertise is in computational structural biology, molecular modeling, and (rather weak) machine learning applications in bioinformatics. I am familiar with graph-based representations of proteins, AlphaFold2, and variant effects based on Molecular Dynamics simulations. I do not have any direct expertise in clinical variant annotation pipelines.

      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: The authors present a pipeline and platform, MAVISp, for aggregating, displaying and analysis of variant effects with a focus on reclassification of variants of uncertain clinical significance and uncovering the molecular mechanisms underlying the mutations.

      Major comments: - On testing the platform, I was unable to look-up a specific variant in ADCK1 (rs200211943, R115Q). I found that despite stating that the mapped refseq ID was NP_001136017 in the HGVSp column, it was actually mapped to the canonical UniProt sequence (Q86TW2-1). NP_001136017 actually maps to Q86TW2-3, which is missing residues 74-148 compared to the -1 isoform. The Uniprot canonical sequence has no exact RefSeq mapping, so the HGVSp column is incorrect in this instance. This mapping issue may also affect other proteins and result in incorrect HGVSp identifiers for variants.

      We would like to thank the reviewer for pointing out these inconsistencies. We have revised all the entries and corrected them. If needed, the history of the corrected cases can be found in the closed issues of the GitHub repository that we use for communication between biocurators and data managers (https://github.com/ELELAB/mavisp_data_collection). We have also revised our protocol in this regard, together with the MAVISp toolkit, to better support isoform matching in our pipelines for future entries, as well as the revision and monitoring of existing ones. In particular, we introduced a tool, uniprot2refseq, which aids the biocurator in identifying the correct match between RefSeq and UniProt in terms of sequence length and sequence identity. More details are included in the Methods section of the paper. The two relevant scripts for this step are available at: https://github.com/ELELAB/mavisp_accessory_tools/
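      The idea of matching isoforms by sequence length and sequence identity can be illustrated with a short sketch. This is not the uniprot2refseq tool and does not reproduce its interface; the function names, the identifiers and the toy sequences below are all hypothetical.

```python
from typing import Dict, Optional


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two sequences of equal length.

    Returns 0.0 when the lengths differ, since no alignment is attempted.
    """
    if len(a) != len(b) or not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def exact_refseq_match(uniprot_seq: str, refseq_seqs: Dict[str, str]) -> Optional[str]:
    """Return the first RefSeq ID with the same length and 100% identity, else None."""
    for refseq_id, seq in refseq_seqs.items():
        if len(seq) == len(uniprot_seq) and percent_identity(seq, uniprot_seq) == 100.0:
            return refseq_id
    return None


# Toy sequences with placeholder identifiers (not real database entries).
canonical = "MKTAYIAKQRQISFVKSHFSRQ"
candidates = {
    "REFSEQ_TOY_SHORT": "MKTAYIAKQRSHFSRQ",        # lacks an internal segment
    "REFSEQ_TOY_FULL": "MKTAYIAKQRQISFVKSHFSRQ",  # identical to the canonical sequence
}

print(exact_refseq_match(canonical, candidates))
print(exact_refseq_match("MKTAYIAKQRQISFVKSHFSRA", candidates))
```

      Returning `None` when no candidate matches exactly makes a case like the one reported by the reviewer, where the canonical UniProt sequence has no exact RefSeq counterpart, visible to the curator instead of silently assigning a mismatched identifier.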

      - The paper lacks a section on how to properly interpret the results of the MAVISp platform (the case-studies are helpful, but don't lay down any global rules for interpreting the results). For example: How should a variant with conflicts between the variant impact predictors be interpreted? Are specific indicators considered more 'reliable' than others?

      We have added a section in Results to clarify how to interpret results from MAVISp in the most common use cases.

      • In the Methods section, GEMME is stated as being rank-normalised with 0.5 as a threshold for damaging variants. On checking the data downloaded from the site, GEMME was not rank-normalised but rather min-max normalised. Furthermore, Supplementary text S4 conflicts with the methods section over how GEMME scores are classified, S4 states that a raw-value threshold of -3 is used.

      We thank the reviewer for spotting this inconsistency. This part of the main text was left over from a previous, preliminary version of the preprint; we have revised the main text. Supplementary Text S4 includes the correct reference for the value in light of the benchmarking therein.
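      The distinction the reviewer draws between rank normalisation and min-max normalisation matters because a fixed cut-off such as 0.5 selects different variants under the two schemes. A minimal sketch with made-up scores (not GEMME output) shows the effect; the function names are illustrative only.

```python
from typing import List


def min_max_normalise(scores: List[float]) -> List[float]:
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalise(scores: List[float]) -> List[float]:
    """Map each score to its rank fraction in [0, 1]; ties share the average rank."""
    n = len(scores)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied scores starting at position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


# Made-up raw scores, for illustration only.
raw = [-6.0, -3.0, -1.0, -0.5, 0.0]

below_minmax = [s for s, v in zip(raw, min_max_normalise(raw)) if v < 0.5]
below_rank = [s for s, v in zip(raw, rank_normalise(raw)) if v < 0.5]
print(below_minmax)
print(below_rank)
```

      Min-max normalisation preserves the spacing of the raw scores, so a skewed distribution pushes most values towards one end, whereas rank normalisation spreads them evenly. The same numeric threshold therefore corresponds to different raw-score cut-offs.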

      • Note. This is a major comment as one of the claims is that the associated web-tool is user-friendly. While functional, the web app is very awkward to use for analysis on any more than a few variants at once. The fixed window size of the protein table necessitates excessive scrolling to reach your protein-of-interest. This will also get worse as more proteins are added. Suggestion: add a search/filter bar. The same applies to the dataset window.

      We have changed the structure of the webserver in such a way that now the whole website opens as its own separate window, instead of being confined within the size permitted by the website at DTU. This solves the fixed window size issue. Hopefully, this will improve the user experience.

      We have refactored the web app by adding filtering functionality, both for the main protein table (that can now be filtered by UniProt AC, gene name or RefSeq ID) and the mutations table. Doing this required a general overhaul of the table infrastructure (we changed the underlying engine that renders the tables).

      • You are unable to copy anything out of the tables.
      • Hyperlinks in the tables only seem to work if you open them in a new tab or window.

      The table overhaul fixed both of these issues.

      • All entries in the reference column point to the MAVISp preprint even when data from other sources is displayed (e.g. MAVE studies).

      We clarified the meaning of the reference column in the Documentation on the MAVISp website, as we realized it had confused the reviewer. The reference column is meant to cite the papers where the computationally generated MAVISp data are used, not external sources. Since we also have the experimental data module in the most recent release, we have also refactored the MAVISp website by adding a “Datasets and metadata” page, which details metadata for key modules. These include references to data from external sources that we include in MAVISp on a case-by-case basis (for example, the results of a MAVE experiment). Additionally, we have verified that the papers using MAVISp data are up to date in https://elelab.gitbook.io/mavisp/overview/publications-that-used-mavisp-data and in the csv files of the proteins concerned.

      Below are the references currently included as publications using MAVISp data:

      | Protein(s) | Title | Journal | PMID | DOI |
      | --- | --- | --- | --- | --- |
      | SMPD1 | ASM variants in the spotlight: A structure-based atlas for unraveling pathogenic mechanisms in lysosomal acid sphingomyelinase | Biochim Biophys Acta Mol Basis Dis | 38782304 | https://doi.org/10.1016/j.bbadis.2024.167260 |
      | TRAP1 | Point mutations of the mitochondrial chaperone TRAP1 affect its functions and pro-neoplastic activity | Cell Death & Disease | 40074754 | https://doi.org/10.1038/s41419-025-07467-6 |
      | BRCA2 | Saturation genome editing-based clinical classification of BRCA2 variants | Nature | 39779848 | 10.1038/s41586-024-08349-1 |
      | TP53, GRIN2A, CBFB, CALR, EGFR | TRAP1 S-nitrosylation as a model of population-shift mechanism to study the effects of nitric oxide on redox-sensitive oncoproteins | Cell Death & Disease | 37085483 | 10.1038/s41419-023-05780-6 |
      | KIF5A, CFAP410, PILRA, CYP2R1 | Computational analysis of five neurodegenerative diseases reveals shared and specific genetic loci | Computational and Structural Biotechnology Journal | 38022694 | https://doi.org/10.1016/j.csbj.2023.10.031 |
      | KRAS | Combining evolution and protein language models for an interpretable cancer driver mutation prediction with D2Deep | Brief Bioinform | 39708841 | https://doi.org/10.1093/bib/bbae664 |
      | OPTN | Decoding phospho-regulation and flanking regions in autophagy-associated short linear motifs | Communications Biology | 40835742 | 10.1038/s42003-025-08399-9 |
      | DLG4, GRB2, SMPD1 | Deciphering long-range effects of mutations: an integrated approach using elastic network models and protein structure networks | JMB | 40738203 | 10.1016/j.jmb.2025.169359 |

      Entering multiple mutants in the "mutations to be displayed" window is time-consuming for more than a handful of mutants. Suggestion: Add a box where multiple mutants can be pasted in at once from an external document.

      During the table overhaul, we have revised the user interface to add a text box that allows free copy-pasting of mutation lists. While we understand having a single input box would have been ideal, the former selection interface (which is also still available) doesn’t allow copy-paste. This is a known limitation in Streamlit.

      Minor comments

      • Grammar. I appreciate that this manuscript may have been compiled by a non-native English speaker, but I would be remiss not to point out that there are numerous grammar errors throughout, usually sentence order issues or non-pluralisation. The meaning of the authors is mostly clear, but I recommend very thoroughly proof-reading the final version.

      We have proofread the final version of the manuscript.

      • There are numerous proteins that I know have high-quality MAVE datasets that are absent in the database e.g. BRCA1, HRAS and PPARG.

      Yes, we are aware of this. It is far from trivial to properly import the datasets from multiplex assays, which often need to be treated on a case-by-case basis. We are in the process of carefully compiling all the MAVE data locally before releasing them in the public version of the database, which is why they are missing. We are giving priority to the datasets that can be correlated with our predictions of changes in structural stability; we will then cover the remaining datasets, handling them in batches. Having said this, we have checked the datasets for BRCA1, HRAS, and PPARG. We have imported the ones for PPARG and BRCA1 from ProteinGym, referring to the studies published in 10.1038/ng.3700 and 10.1038/s41586-018-0461-z, respectively. For HRAS, after checking both the available data and the literature in detail, we did identify a suitable dataset (10.7554/eLife.27810), but we struggled to understand what a sensible cut-off for discriminating between pathogenic and non-pathogenic variants would be, and so did not include it in the MAVISp dataset for now. We will contact the authors to clarify which thresholds to apply before importing the data.

      • Checking one of the existing MAVE datasets (KRAS), I found that the variants were annotated as damaging, neutral or given a positive score (these appear to stand-in for gain-of-function variants). For better correspondence with the other columns, those with positive scores could be labelled as 'ambiguous' or 'uncertain'.

      In the KRAS case study presented in MAVISp, we utilized the protein abundance dataset reported in (http://dx.doi.org/10.1038/s41586-023-06954-0) and made available in the ProteinGym repository (specifically referenced at https://github.com/OATML-Markslab/ProteinGym/blob/main/reference_files/DMS_substitutions.csv#L153). We adopted the precalculated thresholds as provided by the ProteinGym authors. In this regard, we are not sure whether the reviewer is referring to this dataset or to another one on KRAS.

      • Numerous thresholds are defined for stabilizing / destabilizing / neutral variants in both the STABILITY and the LOCAL_INTERACTION modules. How were these thresholds determined? I note that (PMC9795540) uses a ΔΔG threshold of 1/-1 for defining stabilizing and destabilizing variants, which is relatively standard (though they also say that 2-3 would likely be better for pinpointing pathogenic variants).

      We improved the description of our classification strategies for both modules in the Documentation page of our website. Also, we explained more clearly the possible sources of ‘uncertain’ annotations for the two modules in both the web app (Documentation page) and main text. Briefly, in the STABILITY module, we consider FoldX and either Rosetta or RaSP to achieve a final classification. We first classify one and the other independently, according to the following strategy:

      If DDG ≥ 3, the mutation is Destabilizing. If DDG ≤ −3, the mutation is Stabilizing. If −2 < DDG < 2, the mutation is Neutral. If 2 ≤ DDG < 3 or −3 < DDG ≤ −2, the mutation is Uncertain.

      We then compare the classifications obtained by the two methods: if they agree, that is the final classification; if they disagree, the final classification is Uncertain. The thresholds were selected based on previous studies, in which variants with changes in stability below 3 kcal/mol did not show a markedly different abundance at the cellular level [10.1371/journal.pgen.1006739, 10.7554/eLife.49138].
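
      As an illustration only, the per-method classification and the consensus step for the STABILITY module can be sketched as below. This is not the MAVISp code: the function names are hypothetical, and both the Neutral band (−2 < DDG < 2) and the treatment of the remaining intervals (2 ≤ |DDG| < 3) as Uncertain are assumptions based on the reviewer comment about unassigned intervals.

```python
def classify_stability(ddg: float) -> str:
    """Classify one method's predicted DDG (kcal/mol) for the STABILITY module."""
    if ddg >= 3:
        return "Destabilizing"
    if ddg <= -3:
        return "Stabilizing"
    if -2 < ddg < 2:
        # Assumed Neutral band.
        return "Neutral"
    # Assumption: the intervals 2 <= DDG < 3 and -3 < DDG <= -2 are Uncertain.
    return "Uncertain"


def stability_consensus(foldx_ddg: float, other_ddg: float) -> str:
    """Combine FoldX with a second method (Rosetta or RaSP).

    The final class is the shared class if both methods agree,
    and Uncertain otherwise.
    """
    first = classify_stability(foldx_ddg)
    second = classify_stability(other_ddg)
    return first if first == second else "Uncertain"


if __name__ == "__main__":
    # Hypothetical DDG values, for illustration only.
    print(stability_consensus(3.4, 4.1))
    print(stability_consensus(3.4, 0.7))
```

      Classifying each method first and comparing labels afterwards, rather than averaging the two DDG values, means that a disagreement between the methods is reported as Uncertain instead of being hidden in a mean.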

      The LOCAL_INTERACTION module works similarly to the STABILITY module, in that Rosetta and FoldX are considered independently, and an implicit classification is performed for each, according to the following rules (values in kcal/mol):

      If DDG > 1, the mutation is Destabilizing. If DDG < −1, the mutation is Stabilizing. If −1 ≤ DDG ≤ 1, the mutation is Neutral.

      Each mutation is therefore classified for both methods. If the methods agree (i.e., if they classify the mutation in the same way), their consensus is the final classification for the mutation; if they do not agree, the final classification will be Uncertain.

      If a mutation does not have an associated free energy value, the relative solvent accessible area is used to classify it: if SAS > 20%, the mutation is classified as Uncertain, otherwise it is not classified.
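
      The LOCAL_INTERACTION rules, including the solvent-accessibility fallback, can be sketched in the same way. Again, this is a hedged illustration rather than the MAVISp implementation: the function names are hypothetical, the Stabilizing cut-off (DDG < −1) and the Neutral band are assumed by symmetry with the Destabilizing cut-off, and applying the fallback whenever either free energy value is missing is also an assumption.

```python
from typing import Optional


def classify_binding(ddg: float) -> str:
    """Classify one method's binding DDG (kcal/mol)."""
    if ddg > 1:
        return "Destabilizing"
    if ddg < -1:
        # Assumed symmetric cut-off.
        return "Stabilizing"
    return "Neutral"


def local_interaction_class(
    foldx_ddg: Optional[float],
    rosetta_ddg: Optional[float],
    relative_sas: Optional[float],
) -> Optional[str]:
    """Return the final class, or None when the mutation is not classified.

    relative_sas is the relative solvent accessible area in percent.
    """
    if foldx_ddg is None or rosetta_ddg is None:
        # Fallback when no free energy value is available (assumed trigger).
        if relative_sas is not None and relative_sas > 20:
            return "Uncertain"
        return None
    first = classify_binding(foldx_ddg)
    second = classify_binding(rosetta_ddg)
    return first if first == second else "Uncertain"
```

      Returning `None` for the unclassified case keeps it distinct from Uncertain, which the text reserves for method disagreement and for solvent-exposed sites without a free energy value.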

      These thresholds were selected according to best practices followed by the tool authors and, more generally, in the literature, as the reviewer also noted.

      • "Overall, with the examples in this section, we illustrate different applications of the MAVISp results, spanning from benchmarking purposes, using the experimental data to link predicted functional effects with structural mechanisms or using experimental data to validate the predictions from the MAVISp modules."

      The last of these points is not an application of MAVISp, but rather a way in which external data can help validate MAVISp results. Furthermore, none of the examples given demonstrate an application in benchmarking (what is being benchmarked?).

      We have revised the statements to avoid this confusion in the reader.

      • Transcription factors section. This section describes an intended future expansion to MAVISp, not a current feature, and presents no results. As such, it should be moved to the conclusions/future directions section.

      We have removed this section and included a mention in the conclusions as part of the future directions.

      • Figures. The dot-plots generated by the web app, and in Figures 4, 5 and 6 have 2 legends. After looking at a few, it is clear that the lower legend refers to the colour of the variant on the X-axis - most likely referencing the ClinVar effect category. This is not, however, made clear either on the figures or in the app.

      The reviewer’s interpretation of the second legend is correct: it refers to the ClinVar classification. Nonetheless, we understand that the positioning of the legend makes it unclear what it refers to. We have revised the captions of the figures in the main text. In the web app, we have moved the figure legend for the ClinVar effect category and added a label to make clear what the classification refers to.

      • "We identified ten variants reported in ClinVar as VUS (E102K, H86D, T29I, V91I, P2R, L44P, L44F, D56G, R11L, and E25Q, Fig.5a)" E25Q is benign in ClinVar and has had that status since first submitted.

      We have corrected this in the text and the statements related to it.

      Significance

      Platforms that aggregate predictors of variant effect are not a new concept, for example dbNSFP is a database of SNV predictions from variant effect predictors and conservation predictors over the whole human proteome. Predictors such as CADD and PolyPhen-2 will often provide a summary of other predictions (their features) when using their platforms. MAVISp's unique angle on the problem is in the inclusion of diverse predictors from each of its different modules, giving a much wider perspective on variants and potentially allowing the user to identify the mechanistic cause of pathogenicity. The visualisation aspect of the web app is also a useful addition, although the user interface is somewhat awkward. Potentially the most valuable aspect of this study is the associated gitbook resource containing reports from biocurators for proteins that link relevant literature and analyse ClinVar variants. Unfortunately, these are only currently available for a small minority of the total proteins in the database with such reports. For improvement, I think that the paper should focus more on the precise utility of the web app / gitbook reports and how to interpret the results rather than going into detail about the underlying pipeline.

      We appreciate the interest in the gitbook resource, which we also see as very valuable and one of the strengths of our work. We have now implemented a new strategy based on a Python script, introduced in the mavisp toolkit, that generates a template Markdown file of the report; the template can be further customized and imported into GitBook directly (https://github.com/ELELAB/mavisp_accessory_tools/). This should allow us to streamline the production of more reports. We are currently assigning proteins in batches to biocurators for reporting through the mavisp_data_collection GitHub to expand their coverage. We also revised the text and added a section on the interpretation of results from MAVISp, with a focus on the utility of the web app and reports.

      In terms of audience, the fast look-up and visualisation aspects of the web-platform are likely to be of interest to clinicians in the interpretation of variants of unknown clinical significance. The ability to download the fully processed dataset on a per-protein basis would be of more interest to researchers focusing on specific proteins or those taking a broader view over multiple proteins (although a facility to download the whole database would be more useful for this final group).

      While our website only displays the dataset per protein, the whole dataset, including all the MAVISp entries, is available at our OSF repository (https://osf.io/ufpzm/), which is cited in the paper and linked on the MAVISp website. We have further modified the MAVISp database to add a link to the repository in the modes page, so that it is more visible.

      My expertise. - I am a protein bioinformatician with a background in variant effect prediction and large-scale data analysis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Evidence, reproducibility and clarity:

      Summary:

      The authors present MAVISp, a tool for viewing protein variants heavily based on protein structure information. The authors have done a very impressive amount of curation on various protein targets, and should be commended for their efforts. The tool includes a diverse array of experimental, clinical, and computational data sources that provides value to potential users interested in a given target.

      Major comments:

      Unfortunately I was not able to get the website to work correctly. When selecting a protein target in simple mode, I was greeted with a completely blank page in the app window. In ensemble mode, there was no transition away from the list of targets at all. I'm using Firefox 140.0.2 (64-bit) on Ubuntu 22.04. I would like to explore the data myself and provide feedback on the user experience and utility.

      We tried to reproduce the issue mentioned by the reviewer, using the exact same Ubuntu and Firefox versions, but unfortunately failed to reproduce it. The website worked fine for us in that environment. The issue experienced by the reviewer may have been due either to a temporary issue with the web server or to a problem with the specific browser environment they were working in, which we are unable to reproduce. It would be useful to know the date on which this happened, to verify whether downtime on the DTU IT services side made the web server inaccessible.

      I have some serious concerns about the sustainability of the project and think that additional clarifications in the text could help. Currently is there a way to easily update a dataset to add, remove, or update a component (for example, if a new predictor is published, an error is found in a predictor dataset, or a predictor is updated)? If it requires a new round of manual curation for each protein to do this, I am worried that this will not scale and will leave the project with many out of date entries. The diversity of software tools (e.g., three different pipeline frameworks) also seems quite challenging to maintain.

      We appreciate the reviewer’s concerns about long-term sustainability. It is a fair point, and one that we consider within our steering group, which oversees and plans the activities and meets monthly. Adding entries to MAVISp is moving increasingly towards automation as we grow, and we aim to minimize manual work where applicable. Still, expert intervention is needed in some of the steps, and we do not want to give it up. We intend to keep working on MAVISp to make the process of adding and updating entries as automated as possible, and to streamline the process when manual intervention is necessary. The biocurators have three core workflows to use for the default modules, which also automatically cover the source of annotations. We are currently working to streamline the procedures behind LOCAL_INTERACTION, which is the most challenging module. On the data managers' and maintainers' side, we have workflows and protocols that help us with automation, quality control, etc., and we keep working to improve them. Among these, we have workflows for updating old entries. As an example, the update of erroneously attributed RefSeq data (pointed out by reviewer 2) took us only one week overall (from assigning revisions to importing into the database) because we have a reduced version of Snakemake for automation that can act on only the affected modules. We have also streamlined the generation of the templates for the gitbook reports (see also the answer to reviewer 2).

      Updates of old entries are planned and carried out regularly. We also deposit the old datasets on OSF for transparency, in case someone needs to navigate and explore the changes. Every year between May and August, we have activities planned to update the old entries in relation to changes of protocols in the modules and updates in the core databases that we interact with (COSMIC, ClinVar, etc.). In case of major changes, the update activities continue in the Fall. Other revisions can happen outside these time windows if an entry is needed for a specific research project and requires updating.

      Furthermore, the community of people contributing to MAVISp as biocurators or developers is growing and we have scientists contributing from other groups in relation to their research interest. We envision that for this resource to scale up, our team cannot be the only one producing data and depositing it to the database. To facilitate this we launched a pilot for a training event online (see Event page on the website) and we will repeat it once per year. We also organize regular meetings with all the active curators and developers to plan the activities in a sustainable manner and address the challenges we encounter.

      As stated in the manuscript, with the current team, automation, and resources that we have gathered around this initiative, we can provide updates to the public database every third month, and we have regularly delivered them. Additionally, we are capable of processing 20 to 40 proteins every month, depending also on the need for revision or expansion of analyses on existing proteins. We also depend on these data for our own research projects and are fully committed to the resource.

      Additionally, we are planning future activities in these directions to improve scale up and sustainability:

      • Streamlining manual steps so that they are as convenient and fast as possible for our curators, e.g. by providing custom pages on the MAVISp website
      • Streamlining and automating the generation of useful output, for instance the reports, by using a combination of simple automation and large language models
      • Implementing ways to share our software and scripts with third parties, for instance by providing ready-made (or close to ready-made) containers or virtual machines
      • For a future version 2, if the database grows in a direction that is not compatible with Streamlit, the web data science framework we are currently using, we will rewrite the website using a framework that allows better flexibility and performance, for instance Django with a proper database backend.

      On the same theme, according to the GitHub repository, the program relies on Python 3.9, which reaches end of life in October 2025. It has been tested against Ubuntu 18.04, which left standard support in May 2023. The authors should update the software to more modern versions of Python to promote the long-term health and maintainability of the project.

      We thank the reviewer for this comment - we are aware of the upcoming EOL of Python 3.9. We tested MAVISp, both software package and web server, using Python 3.10 (which is the minimum supported version going forward) and Python 3.13 (which is the latest stable release at the time of writing) and updated the instructions in the README file on the MAVISp GitHub repository accordingly.

      We plan on keeping track of Python and library versions during our testing and updating them when necessary. In the future, we also plan to deploy Continuous Integration with automated testing for our repository, making this process easier and more standardized.
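
      A guard of the following kind is one simple way to enforce the minimum Python version mentioned above. It is a sketch rather than code from the MAVISp repository; the constant and function names are hypothetical.

```python
import sys

# Minimum supported version stated in the response (Python 3.10).
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_is_supported():
        print("Python version is supported")
    else:
        found = ".".join(str(part) for part in sys.version_info[:2])
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
```

      Comparing `(major, minor)` tuples avoids the pitfalls of string comparison, where "3.9" would sort after "3.10".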

      I appreciate that the authors have made their code and data available. These artifacts should also be versioned and archived in a service like Zenodo, so that researchers who rely on or want to refer to specific versions can do so in their own future publications.

      Since 2024, we have been reporting all previous versions of the dataset on OSF, the repository linked to the MAVISp website, at https://osf.io/ufpzm/files/osfstorage (folder: previous_releases). We prefer to keep everything under OSF, as we also use it to deposit, for example, the MD trajectory data.

      Additionally, on the GitHub page that we use as a space for interaction between biocurators, developers, and data managers within the MAVISp community, we also report all the changes in the NEWS space: https://github.com/ELELAB/mavisp_data_collection

      Finally, the individual tools are all available in our GitHub repository, where version control is in place (see Table S1, where we now mapped all the resources used in the framework)

      In the introduction of the paper, the authors conflate the clinical challenges of variant classification with evidence generation and it's quite muddled together. They should strongly consider splitting the first paragraph into two paragraphs - one about challenges in variant classification/clinical genetics/precision oncology and another about variant effect prediction and experimental methods. The authors should also note that there are many predictors other than AlphaMissense, and may want to cite the ClinGen recommendations (PMID: 36413997) in the intro instead.

      We revised the introduction in light of these suggestions. We have split the paragraph as recommended and added a longer second paragraph about VEPs and using structural data in the context of VEPs. We have also added the citation that the reviewer kindly recommended.

      Also in the introduction on lines 21-22 the authors assert that "a mechanistic understanding of variant effects is essential knowledge" for a variety of clinical outcomes. While this is nice, it is clearly not the case as we can classify variants according to the ACMG/AMP guidelines without any notion of specific mechanism (for example, by combining population frequency data, in silico predictor data, and functional assay data). The authors should revise the statement so that it's clear that mechanistic understanding is a worthy aspiration rather than a prerequisite.

      We revised the statement in light of this comment from the reviewer

      In the structural analysis section (page 5, lines 154-155 and elsewhere), the authors define cutoffs with convenient round numbers. Is there a citation for these values or were these arbitrarily chosen by the authors? I would have liked to see some justification that these assignments are reasonable. Also there seems to be an error in the text where values between -2 and -3 kcal/mol are not assigned to a bin (I assume they should also be uncertain). There are other similar seemingly-arbitrary cutoffs later in the section that should also be explained.

      We have revised the text making the two intervals explicit, for better clarity.

      On page 9, lines 294-298 the authors talk about using the PTEN data from ProteinGym, rather than the actual cutoffs from the paper. They get to the latter later on, but I'm not sure why this isn't first? The ProteinGym cutoffs are somewhat arbitrarily based on the median rather than expert evaluation of the dataset, and I'm not sure why it's even worth mentioning them when proper classifications are available. Regarding PTEN, it would be quite interesting to see a comparison of the VAMP-seq PTEN data and the Mighell phosphatase assay, which is cited on page 9 line 288 but is not actually a VAMP-seq dataset. I think this section could be interesting but it requires some additional attention.

      We have included the data from Mighell’s phosphatase assay, as provided by MaveDB, in the MAVISp database within the experimental_data module for PTEN. We have also revised the case study to include them and to better explain the decision to support both the ProteinGym and MaveDB classifications in MAVISp (when available). See the revised Figure 3, Table 1, and corresponding text.

      The authors mention "pathogenicity predictors" and otherwise use pathogenicity incorrectly throughout the manuscript. Pathogenicity is a classification for a variant after it has been curated according to a framework like the ACMG/AMP guidelines (Richards 2015 and amendments). A single tool cannot predict or assign pathogenicity - the AlphaMissense paper was wrong to use this nomenclature and these authors should not compound this mistake. These predictors should be referred to as "variant effect predictors" or similar, and they are able to produce evidence towards pathogenicity or benignity but not make pathogenicity calls themselves. For example, in Figure 4e, the terms "pathogenic" and "benign" should only be used here if these are the classifications the authors have derived from ClinVar or a similar source of clinically classified variants.

      The reviewer is correct. We have revised the terminology used in the manuscript and now refer to VEPs (Variant Effect Predictors).

      Minor comments:

      The target selection table on the website needs some kind of text filtering option. It's very tedious to have to find a protein by scrolling through the table rather than typing in the symbol. This will only get worse as more datasets are added.

      We have revised the website, adding a filtering option. In detail, we have refactored the web app by adding filtering functionality, both for the main protein table (that can now be filtered by UniProt AC, gene name, or RefSeq ID) and the mutations table. Doing this required a general overhaul of the table infrastructure (we changed the underlying engine that renders the tables).

      The data sources listed on the data usage section of the website are not concordant with what is in the paper. For example, MaveDB is not listed.

      We have revised and updated the data sources on the website, adding a metadata section with relevant information, including MaveDB references where applicable.

      Figure 2 is somewhat confusing, as it partially interleaves results from two different proteins. This would be nicer as two separate figures, one on each protein, or just of a single protein.

      As suggested by the reviewer, we have now revised the figure and corresponding legends and text, focusing only on one of the two proteins.

      Figure 3 panel b is distractingly large and I wonder if the authors could do a little bit more with this visualization.

      We have revised Figure 3 to address these issues and to integrate new data from the comparison with the phosphatase assay.

      Capitalization is inconsistent throughout the manuscript. For example, page 9 line 288 refers to VampSEQ instead of VAMP-seq (although this is correct elsewhere). MaveDB is referred to as MAVEdb or MAVEDB in various places. AlphaMissense is referred to as Alphamissense in the Figure 5 legend. The authors should make a careful pass through the manuscript to address this kind of issue.

      We have carefully proofread the paper for these inconsistencies

      MaveDB has a more recent paper (PMID: 39838450) that should be cited instead of/in addition to Esposito et al.

      We have added the reference that the reviewer recommended

      On page 11, lines 338-339 the authors mention some interesting proteins including BCL2, which has base editor data available (PMID: 35288574). Are there plans to incorporate this type of functional assay data into MAVISp?

      The assay mentioned in the paper refers to an experimental setup designed to investigate mutations that may confer resistance to the drug venetoclax. We have taken the first steps to implement a MAVISp module aimed at evaluating the impact of mutations on drug binding using alchemical free energy perturbations (ensemble mode), but it is far from complete. We expect to import these data once the module is finalized, since they can be used to benchmark it and BCL2 is one of the proteins that we are using to develop and test the new module.

      Reviewer #3 (Significance (Required)):

      Significance:

      General assessment:

      This is a nice resource and the authors have clearly put a lot of effort in. They should be celebrated for their achievements in curating the diverse datasets, and the GitBooks are a nice approach. However, I wasn't able to get the website to work and I have raised several issues with the paper itself that I think should be addressed.

      Advance:

      New ways to explore and integrate complex data like protein structures and variant effects are always interesting and welcome. I appreciate the effort towards manual curation of datasets. This work is very similar in theme to existing tools like Genomics 2 Proteins portal (PMID: 38260256) and ProtVar (PMID: 38769064). Unfortunately as I wasn't able to use the site I can't comment further on MAVISp's position in the landscape.

      We have expanded the conclusions section to add a comparison and cite previously published work, and linked to a review we published last year that frames MAVISp in the context of computational frameworks for the prediction of variant effects. In brief, the Genomics 2 Proteins portal (G2P) includes data from several sources, including some overlapping with MAVISp such as Phosphosite or MaveDB, as well as features calculated on the protein structure. ProtVar also aggregates mutations from different sources and includes both variant effect predictors and predictions of changes in stability upon mutation, as well as predictions of complex structures. These approaches only partially overlap with MAVISp. G2P is primarily focused on structural and other annotations of the effect of a mutation; it does not include features about changes of stability, binding, or long-range effects, and does not attempt to classify the impact of a mutation according to its measurements. It also does not include information on protein dynamics. Similarly, ProtVar does not include information on binding free energies, long-range effects, or dynamical information.

      Audience:

      MAVISp could appeal to a diverse group of researchers who are interested in the biology or biochemistry of proteins that are included, or are interested in protein variants in general either from a computational/machine learning perspective or from a genetics/genomics perspective.

      My expertise:

      I am an expert in high-throughput functional genomics experiments and am an experienced computational biologist with software engineering experience.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors present MAVISp, a tool for viewing protein variants heavily based on protein structure information. The authors have done a very impressive amount of curation on various protein targets, and should be commended for their efforts. The tool includes a diverse array of experimental, clinical, and computational data sources that provides value to potential users interested in a given target.

      Major comments:

      Unfortunately I was not able to get the website to work properly. When selecting a protein target in simple mode, I was greeted with a completely blank page in the app window, and in ensemble mode, there was no transition away from the list of targets at all. I'm using Firefox 140.0.2 (64-bit) on Ubuntu 22.04. I would have liked to be able to explore the data myself and provide feedback on the user experience and utility.

      I have some serious concerns about the sustainability of the project and think that additional clarifications in the text could help. Currently is there a way to easily update a dataset to add, remove, or update a component (for example, if a new predictor is published, an error is found in a predictor dataset, or a predictor is updated)? If it requires a new round of manual curation for each protein to do this, I am worried that this will not scale and will leave the project with many out of date entries. The diversity of software tools (e.g., three different pipeline frameworks) also seems quite challenging to maintain.

      On the same theme, according to the GitHub repository, the program relies on Python 3.9, which reaches end of life in October 2025. It has been tested against Ubuntu 18.04, which left standard support in May 2023. The authors should update the software to more modern versions of Python to promote the long-term health and maintainability of the project.

      I appreciate that the authors have made their code and data available. These artifacts should also be versioned and archived in a service like Zenodo, so that researchers who rely on or want to refer to specific versions can do so in their own future publications.

      In the introduction of the paper, the authors conflate the clinical challenges of variant classification with evidence generation and it's quite muddled together. They should strongly consider splitting the first paragraph into two paragraphs - one about challenges in variant classification/clinical genetics/precision oncology and another about variant effect prediction and experimental methods. The authors should also note that there are many predictors other than AlphaMissense, and may want to cite the ClinGen recommendations (PMID: 36413997) in the intro instead.

      Also in the introduction on lines 21-22 the authors assert that "a mechanistic understanding of variant effects is essential knowledge" for a variety of clinical outcomes. While this is nice, it is clearly not the case as we are able to classify variants according to the ACMG/AMP guidelines without any notion of specific mechanism (for example, by combining population frequency data, in silico predictor data, and functional assay data). The authors should revise the statement so that it's clear that mechanistic understanding is a worthy aspiration rather than a prerequisite.

      In the structural analysis section (page 5, lines 154-155 and elsewhere), the authors define cutoffs with convenient round numbers. Is there a citation for these values or were these arbitrarily chosen by the authors? I would have liked to see some justification that these assignments are reasonable. Also there seems to be an error in the text where values between -2 and -3 kcal/mol are not assigned to a bin (I assume they should also be uncertain). There are other similar seemingly-arbitrary cutoffs later in the section that should also be explained.

      On page 9, lines 294-298 the authors talk about using the PTEN data from ProteinGym, rather than the actual cutoffs from the paper. They get to the latter later on, but I'm not sure why this isn't first? The ProteinGym cutoffs are somewhat arbitrarily based on the median rather than expert evaluation of the dataset and I'm not sure why it's even worth mentioning them when proper classifications are available. Regarding PTEN, it would be quite interesting to see a comparison of the VAMP-seq PTEN data and the Mighell phosphatase assay, which is cited on page 9 line 288 but is not actually a VAMP-seq dataset. I think this section could be interesting but it requires some additional attention.

      The authors mention "pathogenicity predictors" and otherwise use pathogenicity incorrectly throughout the manuscript. Pathogenicity is a classification for a variant after it has been curated according to a framework like the ACMG/AMP guidelines (Richards 2015 and amendments). A single tool cannot predict or assign pathogenicity - the AlphaMissense paper was wrong to use this nomenclature and these authors should not compound this mistake. These predictors should be referred to as "variant effect predictors" or similar, and they are able to produce evidence towards pathogenicity or benignity but not make pathogenicity calls themselves. For example, in Figure 4e, the terms "pathogenic" and "benign" should only be used here if these are the classifications the authors have derived from ClinVar or a similar source of clinically classified variants.

      Minor comments:

      The target selection table on the website needs some kind of text filtering option. It's very tedious to have to find a protein by scrolling through the table rather than typing in the symbol. This will only get worse as more datasets are added.

      The data sources listed on the data usage section of the website are not concordant with what is in the paper. For example, MaveDB is not listed.

      I found Figure 2 to be a bit confusing in that it partially interleaves results from two different proteins. I think this would be nicer as two separate figures, one on each protein, or just of a single protein.

      Figure 3 panel b is distractingly large and I wonder if the authors could do a little bit more with this visualization.

      Capitalization is inconsistent throughout the manuscript. For example, page 9 line 288 refers to VampSEQ instead of VAMP-seq (although this is correct elsewhere). MaveDB is referred to as MAVEdb or MAVEDB in various places. AlphaMissense is referred to as Alphamissense in the Figure 5 legend. The authors should make a careful pass through the manuscript to address these kinds of issues.

      MaveDB has a more recent paper (PMID: 39838450) that should be cited instead of/in addition to Esposito et al.

      On page 11, lines 338-339 the authors mention some interesting proteins including BLC2, which has base editor data available (PMID: 35288574). Are there plans to incorporate this type of functional assay data into MAVISp?

      Significance

      General assessment:

      This is a nice resource and the authors have clearly put a lot of effort in. They should be celebrated for their achievements in curating the diverse datasets, and the GitBooks are a nice approach. However, I wasn't able to get the website to work and I have raised several issues with the paper itself that I think should be addressed.

      Advance:

      New ways to explore and integrate complex data like protein structures and variant effects are always interesting and welcome. I appreciate the effort towards manual curation of datasets. This work is very similar in theme to existing tools like the Genomics 2 Proteins portal (PMID: 38260256) and ProtVar (PMID: 38769064). Unfortunately, as I wasn't able to use the site, I can't comment further on MAVISp's position in the landscape.

      Audience:

      MAVISp could appeal to a diverse group of researchers who are interested in the biology or biochemistry of proteins that are included, or are interested in protein variants in general either from a computational/machine learning perspective or from a genetics/genomics perspective.

      My expertise:

      I am an expert in high-throughput functional genomics experiments and am an experienced computational biologist with software engineering experience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors present a pipeline and platform, MAVISp, for aggregating, displaying and analysing variant effects, with a focus on reclassifying variants of uncertain clinical significance and uncovering the molecular mechanisms underlying the mutations.

      Major comments:

      • On testing the platform, I was unable to look up a specific variant in ADCK1 (rs200211943, R115Q). I found that despite stating that the mapped RefSeq ID was NP_001136017 in the HGVSp column, it was actually mapped to the canonical UniProt sequence (Q86TW2-1). NP_001136017 actually maps to Q86TW2-3, which is missing residues 74-148 compared to the -1 isoform. The UniProt canonical sequence has no exact RefSeq mapping, so the HGVSp column is incorrect in this instance. This mapping issue may also affect other proteins and result in incorrect HGVSp identifiers for variants.
      • The paper lacks a section on how to properly interpret the results of the MAVISp platform (the case-studies are useful, but don't lay down any global rules for interpreting the results). For example: How should a variant with conflicts between the variant impact predictors be interpreted? Are certain indicators considered more 'reliable' than others?
      • In the Methods section, GEMME is stated as being rank-normalised with 0.5 as a threshold for damaging variants. On checking the data downloaded from the site, GEMME was not rank-normalised but rather min-max normalised. Furthermore, Supplementary text S4 conflicts with the Methods section over how GEMME scores are classified: S4 states that a raw-value threshold of -3 is used.
      • Note. This is a major comment as one of the claims is that the associated web-tool is user-friendly. While functional, the web app is very awkward to use for analysis on any more than a few variants at once.
        • The fixed window size of the protein table necessitates excessive scrolling to reach your protein-of-interest. This will also get worse as more proteins are added. Suggestion: add a search/filter bar.
        • The same applies to the dataset window.
        • You are unable to copy anything out of the tables.
        • Hyperlinks in the tables only seem to work if you open them in a new tab or window.
        • All entries in the reference column point to the MAVISp preprint even when data from other sources is displayed (e.g. MAVE studies).
        • Entering multiple mutants in the "mutations to be displayed" window is time-consuming for more than a handful of mutants. Suggestion: Add a box where multiple mutants can be pasted in at once from an external document.
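
      The GEMME normalisation point above can be made concrete with a small sketch. It contrasts rank normalisation with min-max normalisation on a handful of made-up raw scores; the values and function names are illustrative assumptions and are not taken from MAVISp or GEMME. Because the two transforms generally give different numbers, a fixed 0.5 cutoff does not select the same variants under both.

```python
def min_max_normalise(scores):
    """Linearly rescale scores so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalise(scores):
    """Replace each score by its rank scaled to [0, 1]; ties share the mean rank."""
    n = len(scores)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of tied values starting at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


# Hypothetical raw scores for five variants (illustration only).
raw = [-3.0, -6.0, 0.0, -0.5, -1.0]

by_rank = rank_normalise(raw)        # depends only on the ordering
by_minmax = min_max_normalise(raw)   # depends on the actual spacing

for r, a, b in zip(raw, by_rank, by_minmax):
    print(f"raw={r:5.1f}  rank={a:.3f}  min-max={b:.3f}")
```

      Comparing a downloaded score column against both transforms of the raw values is one quick way to tell which normalisation was actually applied.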

      Minor comments

      • Grammar. I appreciate that this manuscript may have been compiled by a non-native English speaker, but I would be remiss not to point out that there are numerous grammar errors throughout, usually sentence order issues or non-pluralisation. The meaning of the authors is mostly clear, but I recommend very thoroughly proof-reading the final version.
      • There are numerous proteins that I know have high-quality MAVE datasets that are absent in the database e.g. BRCA1, HRAS and PPARG.
      • Checking one of the existing MAVE datasets (KRAS), I found that the variants were annotated as damaging, neutral or given a positive score (these appear to stand-in for gain-of-function variants). For better correspondence with the other columns, those with positive scores could be labelled as 'ambiguous' or 'uncertain'.
      • Numerous thresholds are defined for stabilizing / destabilizing / neutral variants in both the STABILITY and the LOCAL_INTERACTION modules. How were these thresholds determined? I note that (PMC9795540) uses a ΔΔG threshold of 1/-1 for defining stabilizing and destabilizing variants, which is relatively standard (though they also say that 2-3 would likely be better for pinpointing pathogenic variants).
      • "Overall, with the examples in this section, we illustrate different applications of the MAVISp results, spanning from benchmarking purposes, using the experimental data to link predicted functional effects with structural mechanisms or using experimental data to validate the predictions from the MAVISp modules."

      The last of these points is not an application of MAVISp, but rather a way in which external data can help validate MAVISp results. Furthermore, none of the examples given demonstrate an application in benchmarking (what is being benchmarked?).
      • Transcription factors section. This section describes an intended future expansion to MAVISp, not a current feature, and presents no results. As such, it should probably be moved to the conclusions/future directions section.
      • Figures. The dot-plots generated by the web app, and in Figures 4, 5 and 6, have 2 legends. After looking at a few, it is clear that the lower legend refers to the colour of the variant on the X-axis, most likely referencing the ClinVar effect category. This is not, however, made clear either on the figures or in the app.
      • "We identified ten variants reported in ClinVar as VUS (E102K, H86D, T29I, V91I, P2R, L44P, L44F, D56G, R11L, and E25Q, Fig.5a)"

      E25Q is benign in ClinVar and has had that status since first submitted.
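
      As an illustration of the threshold question raised above for the STABILITY and LOCAL_INTERACTION modules, the following minimal sketch applies a symmetric ΔΔG cutoff of 1 and -1, the convention mentioned in the comment. The function name, the kcal/mol unit, and the sign convention (positive ΔΔG meaning destabilizing) are assumptions made for the example; this is not the MAVISp classification scheme.

```python
def classify_ddg(ddg, threshold=1.0):
    """Label a predicted ΔΔG with a symmetric cutoff.

    Assumptions (not from the manuscript): ΔΔG is in kcal/mol and
    positive values indicate destabilization.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if ddg >= threshold:
        return "destabilizing"
    if ddg <= -threshold:
        return "stabilizing"
    return "neutral"


# Hypothetical predictions for three variants (illustration only).
for value in (2.3, -1.5, 0.4):
    print(value, classify_ddg(value))

# A stricter cutoff, as suggested for pinpointing pathogenic variants,
# moves borderline variants into the neutral class.
print(classify_ddg(2.3, threshold=3.0))
```

      Making the cutoff an explicit parameter shows how sensitive the resulting labels are to the chosen threshold, which is why the basis for each threshold should be stated.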

      Significance

      Platforms that aggregate predictors of variant effect are not a new concept, for example dbNSFP is a database of SNV predictions from variant effect predictors and conservation predictors over the whole human proteome. Predictors such as CADD and PolyPhen-2 will often provide a summary of other predictions (their features) when using their platforms. MAVISp's unique angle on the problem is in the inclusion of diverse predictors from each of its different modules, giving a much wider perspective on variants and potentially allowing the user to identify the mechanistic cause of pathogenicity. The visualisation aspect of the web app is also a useful addition, although the user interface is somewhat awkward. Potentially the most valuable aspect of this study is the associated gitbook resource containing reports from biocurators for proteins that link relevant literature and analyse ClinVar variants. Unfortunately, these are only currently available for a small minority of the total proteins in the database with such reports.

      For improvement, I think that the paper should focus more on the precise utility of the web app / gitbook reports and how to interpret the results rather than going into detail about the underlying pipeline.

      In terms of audience, the fast look-up and visualisation aspects of the web-platform are likely to be of interest to clinicians in the interpretation of variants of unknown clinical significance. The ability to download the fully processed dataset on a per-protein basis would be of more interest to researchers focusing on specific proteins or those taking a broader view over multiple proteins (although a facility to download the whole database would be more useful for this final group).

      My expertise.

      • I am a protein bioinformatician with a background in variant effect prediction and large-scale data analysis.
    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript, "MAVISp: A Modular Structure-Based Framework for Protein Variant Effects," presents a significant new resource for the scientific community, particularly in the interpretation and characterization of genomic variants. The authors have developed a comprehensive and modular computational framework that integrates various structural and biophysical analyses, alongside existing pathogenicity predictors, to provide crucial mechanistic insights into how variants affect protein structure and function. Importantly, MAVISp is open-source and designed to be extensible, facilitating reuse and adaptation by the broader community.

      Major comments:

      • While the manuscript is formally well-structured (with clear Introduction, Results, Conclusions, and Methods sections), I found it challenging to follow in some parts. In particular, the Introduction is relatively short and lacks a deeper discussion of the state-of-the-art in protein variant effect prediction. Several methods are cited but not sufficiently described, as if prior knowledge were assumed. OPTIONAL: Extend the Introduction to better contextualize existing approaches (e.g., AlphaMissense, EVE, ESM-based predictors) and clarify what MAVISp adds compared to each.
      • The workflow is summarized in Figure 1(b), which is visually informative. However, the narrative description of the pipeline is somewhat fragmented. It would be helpful to describe in more detail the available modules in MAVISp, and which of them are used in the examples provided. Since different use cases highlight different aspects of the pipeline, it would be useful to emphasize what is done step-by-step in each. OPTIONAL: Consider adding a table or a supplementary figure mapping each use case to the corresponding pipeline steps and modules used.
      • The text contains numerous acronyms, some of which are not defined upon first use or are only mentioned in passing. This affects readability. OPTIONAL: Define acronyms upon first appearance, and consider moving less critical technical details (e.g., database names or data formats) to the Methods or Supplementary Information. This would greatly enhance readability.
      • The code and trained models are publicly available, which is excellent. The modular design and use of widely adopted frameworks (PyTorch and PyTorch Geometric) are also strong points. However, the Methods section could benefit from additional detail regarding feature extraction and preprocessing steps, especially the structural features derived from AlphaFold2 models. OPTIONAL: Include a schematic or a table summarizing all feature types, their dimensionality, and how they are computed.
      • The section on transcription factors is relatively underdeveloped compared to other use cases and lacks sufficient depth or demonstration of its practical utility. OPTIONAL: Consider either expanding this section with additional validation or removing/postponing it to a future manuscript, as it currently seems preliminary.

      Minor comments:

      • Most relevant recent works are cited, including EVE, ESM-1v, and AlphaFold-based predictors. However, recent methods like AlphaMissense (Cheng et al., 2023) could be discussed more thoroughly in the comparison.
      • Figures are generally clear, though some (e.g., performance barplots) are quite dense. Consider enlarging font sizes and annotating key results directly on the plots.
      • Minor typographic errors are present. A careful proofreading is highly recommended. Below are some of the issues I identified:

      Page 3, line 46: "MAVISp perform" -> "MAVISp performs"

      Page 3, line 56: "automatically as embedded" -> "automatically embedded"

      Page 3, line 57: "along with to enhance" -> unclear; please revise

      Page 4, line 96: "web app interfaces with the database and present" -> "presents"

      Page 6, line 210: "to investigate wheatear" -> "whether"

      Page 6, lines 215-216: "We have in queue for processing with MAVISp proteins from datasets relevant to the benchmark of the PTM module." -> unclear sentence; please clarify

      Page 15, line 446: "Both the approaches" -> "Both approaches"

      Page 20, line 704: "advantage of multi-core system" -> "multi-core systems"

      Significance

      General assessment: the strongest aspects of the study are the modularity, open-source implementation, and the integration of structural information through graph neural networks. MAVISp appears to be one of the few publicly available frameworks that can easily incorporate AlphaFold2-based features in a flexible way, lowering the barrier for developing custom predictors. Its reproducibility and transparency make it a valuable resource. However, while the technical foundation is solid and the effort substantial, the scientific narrative and presentation could be significantly improved. The manuscript is dense and hard to follow in places, with a heavy use of acronyms and insufficient explanation of key design choices. Improving the descriptive clarity, especially in the early sections, would greatly enhance the impact of this work.

      Advance: to the best of my knowledge, this is one of the first modular platforms for protein variant effect prediction that integrates structural data from AlphaFold2 with bioinformatic annotations and even clinical data in an extensible fashion. While similar efforts exist (e.g., ESMFold, AlphaMissense), MAVISp distinguishes itself through openness and design for reusability. The novelty is primarily technical and practical rather than conceptual.

      Audience: this study will be of strong interest to researchers in computational biology, structural bioinformatics, and genomics, particularly those developing variant effect predictors or analyzing the impact of mutations in clinical or functional genomics contexts. The audience is primarily specialized, but the open-source nature of the tool may diffuse its use among more applied or translational users, including those working in precision medicine or protein engineering.

      Reviewer expertise: my expertise is in computational structural biology, molecular modeling, and (rather weak) machine learning applications in bioinformatics. I am familiar with graph-based representations of proteins, AlphaFold2, and variant effects based on Molecular Dynamics simulations. I do not have any direct expertise in clinical variant annotation pipelines.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Summary:

      Miyamoto et al. report that importin α1 is highly enriched in a subfraction of micronuclei (about 40%), which exhibit defective nuclear envelopes and compromised accessibility of factors essential for the damage response associated with homologous recombination DNA repair. The authors suggest that the unequal localization and abnormal distribution of importin α1 within these micronuclei contribute to the genomic instability observed in cancer.


      Major comments:

      1.) It is crucial to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells compared to transformed cell lines (MCF7, HeLa, and MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.

      We appreciate the reviewer's thoughtful suggestion to compare non-transformed and transformed cell lines to evaluate importin α1 localization in MN. Given that HeLa cells are derived from cervical cancer rather than the mammary epithelium, we considered it inappropriate to directly compare them with non-transformed mammary epithelial MCF10A cells. Therefore, HeLa cells were analyzed separately to assess the effects of reversine treatment on importin α1 localization. The results indicated no significant difference between the treated and untreated HeLa cells (Supplemental Fig. S2F in the revised manuscript). Regarding the comparison between MCF10A and the two cancer cell lines, MCF7 and MDA-MB-231, the proportion of importin α1-positive MN did not significantly differ across the cell lines, regardless of reversine treatment (Supplemental Fig. S3B, Untreated: p = 0.9850 and 0.5533; Reversine: p = 0.2218 and 0.9392). These results suggest that there is no clear difference in the localization of importin α1 in MN between the transformed and non-transformed cell lines tested. However, we acknowledge that this does not exclude the possibility that importin α1 localization to MN is linked to genomic instability under specific conditions.

      2.) While the authors provide some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Furthermore, according to the figure legends, the data presented in both figures stem from a single experiment. Current literature suggests that compromised nuclear envelope integrity is one of the major contributors to genomic instability, mediated through mechanisms such as chromothripsis and cGAS-STING-mediated inflammation arising from MN. Therefore, a more comprehensive quantification of nuclear envelope integrity, ideally comparing non-transformed MCF10A cells with transformed cell lines (MCF7, HeLa, and MDA-MB-231), is necessary to substantiate the connection between aberrant importin α1 behavior in MN and chromothripsis processes, as well as regulation of the cGAS-STING pathway linked to genomic instability in cancer cells.

      We thank the reviewer for the constructive suggestion to quantify nuclear envelope integrity more comprehensively. In response, we compared laminB1 localization at the MN membrane between importin α1-positive and -negative MN in MCF10A, MCF7, MDA-MB-231, and HeLa cells, and included these results in the revised manuscript (Fig. 4C). For each cell, the laminB1 intensity in the MN was normalized to that of the primary nucleus (PN). This analysis showed that laminB1 intensity was significantly lower in importin α1-positive MN across all cell lines, including non-transformed MCF10A cells. These findings support a close association between aberrant importin α1 accumulation and compromised nuclear envelope integrity, a key factor potentially linking MN to chromothripsis and cGAS-STING-mediated genomic instability.
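
      The per-cell normalization described above (MN laminB1 intensity divided by the PN intensity of the same cell) can be sketched as follows. All numbers are invented for illustration and the function names are hypothetical; this is not the authors' ImageJ workflow and no measured values are reproduced here.

```python
from statistics import mean


def normalised_mn_intensity(mn_intensity, pn_intensity):
    """Return the MN signal as a fraction of the PN signal from the same cell."""
    if pn_intensity <= 0:
        raise ValueError("PN intensity must be positive")
    return mn_intensity / pn_intensity


def group_mean_ratio(cells):
    """Average the MN/PN ratio over a list of (MN, PN) intensity pairs."""
    return mean(normalised_mn_intensity(mn, pn) for mn, pn in cells)


# Hypothetical (MN, PN) mean intensities in arbitrary units.
importin_positive = [(40.0, 200.0), (30.0, 150.0)]
importin_negative = [(160.0, 200.0), (90.0, 100.0)]

print("importin α1-positive MN:", group_mean_ratio(importin_positive))
print("importin α1-negative MN:", group_mean_ratio(importin_negative))
```

      Dividing by the PN value of the same cell removes cell-to-cell differences in staining or expression level, so the ratios from different cells and cell lines can be compared directly.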

      3.) The schematic illustration presented in Figure 8 does not adequately summarize all findings from this study nor does it clarify how the localization of importin α1 within MN might hypothetically influence genome stability. Although it is reasonable to propose that "importin α can serve as a molecular marker for characterizing the dynamics of MN" (Line 344), the authors assert (Line 325) that their findings, along with others, have "potential implications for the induction of chromothripsis processes and regulation of the cGAS-STING pathway in cancer cells." However, they fail to provide a clear or even hypothetical explanation regarding how their findings contribute to these molecular events. To address this gap, it would be essential for them to contextualize their results within existing literature that explores and links structural integrity deficits or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation (e.g., PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889).

      We agree that the previous schematic illustration (former Fig. 8) did not adequately summarize our findings and may have overstated our conclusions. Accordingly, we have removed this figure from the revised manuscript.

      To address the reviewer's concern, we performed additional analyses and included the results in the new Figure 8. These data show that, in addition to RAD51, both RPA2 and cGAS display mutually exclusive localization with importin α1 in MN. RPA2, a single-stranded DNA-binding protein, stabilizes damaged DNA and enables RAD51 filament assembly during homologous recombination repair. Previous studies have demonstrated that RPA2 accumulates in ruptured MN in a CHMP4B-dependent manner (PMID: 32601372). Likewise, cGAS is a cytosolic DNA sensor that localizes to ruptured MN and activates innate immune signaling through the cGAS-STING pathway, as widely reported (PMID: 28738408; 28759889; see also PMID: 32494070; 27918550).

      Our findings suggest an alternative scenario: even when nuclear envelope rupture occurs, importin α1-positive MN may remain inaccessible to DNA repair and sensing factors such as RPA2 and cGAS. This supports the view that importin α1 defines a distinct MN subset, separate from those characterized by the canonical DNA damage response or innate immune signaling factors. Furthermore, our overexpression experiments with EGFP-importin α1 (Fig. 7G, 7H) raise the possibility that importin α1 enrichment may impede the recruitment of DNA-binding proteins.

      Taken together, these results support the conclusion that importin α1 marks a unique MN state and provides a molecular framework for distinguishing between different MN environments. At the reviewer's suggestion, we have cited all the recommended references (PMID: 32601372, 32494070, 27918550, 28738408, and 28759889) in the revised manuscript to better contextualize our findings. We are grateful for the reviewer's thoughtful suggestions and literature recommendations, which helped us clarify the implications of our findings within the broader context of chromothripsis and cGAS-STING-mediated genomic instability.

      4.) Fig. 4D does not support the idea that importin α1 is euchromatin enriched: H3K9me3, H3K4me3 and H3K37me3 seem to be all deeply blue.

      We sincerely thank the reviewer for pointing out the important limitations of the original version of Fig. 4D, as also raised in minor comment #5. As the reviewer correctly noted, this figure was intended to demonstrate that importin-α1 preferentially localizes to euchromatin regions (H3K4me3 and H3K36me3) rather than heterochromatin (H3K9me3 and H3K27me3). However, we acknowledge that in the original figure, the predominantly blue tone of the heatmap made this interpretation unclear and that the Spearman's correlation coefficient for H3K36me3 was missing. In response, we have substantially revised the figure (now shown as Fig. 5E in the revised manuscript). Specifically, we improved the color scale for better visual distinction, added the missing Spearman's coefficients for H3K36me3, and strengthened the analysis by incorporating ChIP-seq data obtained with two independent antibodies against importin α1 (Ab1 and Ab2). We believe that these revisions provide a clear and more accurate representation of euchromatin enrichment of importin-α1, as originally intended.

      Indeed, the data presented by the authors do not adequately support a direct link between the presence of importin α1 in MN and genomic instability in human cancer cells. While the experimental correlations provided may not substantiate this connection definitively, they do lay a foundation for a grounded hypothesis and suggest the need for further research to explore this topic in greater depth. Additionally, it is worth noting that the evidence contributes to the growing list of nuclear proteins exhibiting abnormal behavior in micronuclei (MN). This highlights the significance of studying such proteins to understand their roles in genomic stability and cancer progression.

      Following the reviewer's suggestion, we carefully revised the manuscript to ensure that our statements are consistent with the scope of the data and do not overstate our conclusions. As part of this effort, we removed the schematic illustration (former Fig. 8), which might have overstated our findings, and refined the relevant text to prevent overinterpretation.

      To our knowledge, this study is the first to report the specific accumulation of importin α in MN. Our results suggest a previously unrecognized function of importin α beyond its canonical transport role and add to the growing list of nuclear proteins that exhibit abnormal behavior in MN. We hope that these findings will provide a conceptual and experimental basis for future studies aimed at clarifying the biological significance of MN heterogeneity and quality control in cancer biology.


      Additional experiments are necessary to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells and transformed cell lines (MCF7, HeLa, MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.

      As part of our response to Major Comment 1, we conducted additional experiments to quantitatively compare importin α1 localization in MN between non-transformed MCF10A cells, breast cancer cell lines (MCF7 and MDA-MB-231), and HeLa cells. These results have been included in the revised manuscript (Supplemental Fig. S2F and Fig. S3B). The analyses showed no significant differences in the proportion of importin α1-positive MN among these cell lines, consistent with the reviewer's request for a more comprehensive evaluation.

      The authors claim that importin α1 preferentially localizes to euchromatic areas rather than heterochromatic regions within MN. While this assertion is supported by the immunofluorescence (IF) images presented in Figures 4A/B and S5A/B, it remains less clear for Figure S5C/B. To strengthen this claim, providing averages of IF distributions from multiple cells across independent experiments would be beneficial to draw more robust conclusions.

      We have quantified the co-localization of importin α1 with the euchromatin marker H3K4me3 and the heterochromatin marker H3K9me3 in micronuclei (MN) across four human cell lines (MCF10A, MCF7, MDA-MB-231, and HeLa). The results of this statistical analysis are included in the revised manuscript in Fig. 5C. These data provide quantitative evidence from independent experiments showing that importin α1 preferentially localizes to euchromatic regions within the MN, thereby supporting our initial observation.

      Furthermore, ChIP-seq data are presented to support the idea that importin α1 preferentially distributes over euchromatin areas in MN. However, as described, the epigenetic chromatin status indicated by these ChIP-seq experiments reflects that of the principal nucleus (PN), not specifically the status within MN in MCF7 cells. Given that MN represent only a small fraction of the cell population under normal culture conditions (likely less than 5% for HeLa cells, as shown in Figure S2D), the relevance of this data is limited. Additionally, according to data presented in Figure 1B, importin α1 does not localize or distribute within the PN as it does in MN in MCF7 cells. Therefore, further experiments should be conducted to substantiate that importin α1 preferentially targets euchromatin areas within MN and to compare this distribution with that observed in the principal nucleus. Such studies could reveal potential abnormalities regarding the correlation between epigenetic chromatin status and importin α distribution in MN.

      As noted, these experiments were performed on whole-cell populations of MCF7 cells and therefore reflect the overall chromatin landscape, not specifically that of the MN. We fully acknowledge that MN constitute only a small fraction of the cell population under standard culture conditions (Supplemental Fig. S2D), and thus, the relevance of ChIP-seq data to MN must be interpreted with caution.

      Nevertheless, our intention in presenting these data was to illustrate that importin α1 preferentially associates with euchromatin regions marked by H3K4me3. To examine this more directly, we analyzed importin α1 localization in MN using immunofluorescence with histone modification markers across multiple cell lines. These analyses, together with the quantitative results now included in the revised manuscript (Fig. 5C), confirm that importin α1 preferentially localizes to euchromatic regions within MN.

      Taken together, although the ChIP-seq data were derived from whole-cell populations, the combined results from IF imaging and quantitative analysis support our interpretation that importin α1 retains its euchromatin-associating property within MN. We hope that these additional data will address the reviewer's concerns.

      To support the hypothesis that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should be supplemented with thorough quantification and statistical analysis based on at least three independent experiments. This additional data would enhance confidence in their findings regarding RAD51 accessibility inhibition by importin α1.

      Following the reviewer's suggestion, we have added a new graph (Fig. 7F) in the revised manuscript. This figure presents the quantified frequency of RAD51-positive MN among importin α1-negative and importin α1-positive MN, analyzed across six microscopy fields (n = 6) from three independent experiments.

      To improve clarity and consistency, we reorganized the panels: representative RAD51 images are now shown in Fig. 7B, and the Cell #1 (low RAD51) vs. Cell #2 (high RAD51) classification with etoposide responsiveness is summarized in Fig. 7C. As illustrated in Figs. 7D and 7E, importin α1 and RAD51 exhibit mutually exclusive localization in MN. Fig. 7F provides a unified statistical summary at the population level.

      The results showed that the proportion of RAD51-positive MN was significantly lower among importin α1-positive MN than among importin α1-negative MN, providing robust quantitative support for the proposed mutual exclusivity between importin α1 localization and RAD51 accessibility in MN.

      We are grateful to the reviewer for this constructive suggestion, which helped us clarify and better support the central message of our study.


      The additional experiments proposed are controls and direct comparisons using the same techniques and experimental designs used by the authors, so it is reasonable that the authors can carry them out within a realistic timeframe.

      We appreciate the reviewer's thoughtful consideration of the feasibility of the additional experiments.

      Given the importance of reproducibility and the need to evaluate results based on imaging and quantitation, I strongly recommend that the authors include a detailed description of the optical microscopy procedures utilized in their study. This should encompass imaging conditions, acquisition settings, and the specific equipment used. Providing this information will enhance transparency and facilitate reproducibility. For reference, some valuable guidance on essential parameters for reproducibility can be found in Heddleston et al. (2021) (doi:10.1242/jcs.254144). Incorporating these details will not only strengthen the manuscript but also support other researchers in reproducing the findings accurately.

      Following the reviewer's suggestion, we have substantially revised the Materials and Methods sections in the main and supplemental manuscripts to provide detailed descriptions of the optical microscopy procedures, including the specifications of the imaging equipment, acquisition settings, and image processing parameters. These revisions follow the best practices recommended by Heddleston et al. (2021, J. Cell Sci., doi:10.1242/jcs.254144).

      We have also expanded the description of our quantitative image analysis using ImageJ, providing details on the parameters for MN identification and the measurement of colocalization rates between importin α and histone modifications. These additions ensured reproducibility and clarity.

      We believe that these modifications will enhance the reproducibility of our results and increase the value of our study for the research community. We sincerely appreciate the reviewer's helpful suggestions.


      Many of the plots and values in the manuscript lack appropriate statistical analysis, including p-values, which are not detailed in the figures or their legends. Furthermore, the Statistical Analysis section does not provide adequate information regarding the specific statistical tests employed or the criteria used to determine which analyses were applied in each case. To enhance the rigor and clarity of the study, it is essential that these issues be addressed prior to publication. A comprehensive presentation of statistical analysis will improve the reliability of the findings and allow readers to better understand the significance of the results. I recommend that the authors revise this section to include detailed explanations of all statistical methods used, along with corresponding p-values for all relevant comparisons.

      We sincerely appreciate the reviewer's constructive comments highlighting the importance of transparent and rigorous statistical analyses. In response, we have carefully revised all figure panels, figure legends, and the Materials and Methods (Statistical Analysis) section in both the main and the supplementary manuscripts.

      In the revised figure legends, we now provide the number of independent experiments and sample sizes (n), statistical tests applied (e.g., unpaired or paired two-tailed t-test, one-way ANOVA with Tukey's post-hoc test, two-way ANOVA with Sidak's multiple comparisons), data presentation format (mean ± SD), and corresponding p-values or significance indicators (*, **, ***). The Statistical Analysis section was also expanded to explain the rationale for selecting each statistical test, the criteria for significance, and the reporting of the replicates. These revisions ensure clarity, reproducibility, and transparency throughout the manuscript, directly addressing the reviewers' concerns. We are grateful for this valuable suggestion, which has significantly improved the rigor of our study.

      Minor comments:

      The authors claim that importin α1 exhibits remarkably low mobility in the micronuclei (MN) compared to its mobility in the principal nucleus (PN), as illustrated in Figure 1. However, based on the experimental design, this conclusion may not be appropriate. In the current setup, the FRAP experiment conducted in the PN measures the mobility of importin α1 molecules within the cell nucleus, where the influence of nuclear transport is likely negligible. Conversely, in the MN experiments shown, all molecules of importin α1 are bleached within a given MN. Consequently, what is being measured here primarily reflects the effects of nuclear transport rather than intrinsic molecular mobility. To accurately compare kinetics of nuclear transport, it would be essential to completely bleach the entire PN. If measuring molecular mobility between MN and PN is desired, only a small fraction of either MN or PN area/volume should be bleached during FRAP analysis. Additionally, it would be beneficial to include measurements of mobility for other canonical nuclear transport factors (e.g., RAN, CAS, RCC1) for comparative purposes. This broader context would allow for a more comprehensive understanding of importin α1 behavior relative to other factors involved in nuclear transport. Finally, utilizing cells that exhibit importin α1 signals in both PN and MN could further strengthen comparisons and provide more robust conclusions regarding its mobility dynamics.

      We thank the reviewer for their constructive suggestions regarding our FRAP analysis. To address the concern that the original comparison between PN and the micronuclei (MN) might have been biased by differences in bleaching areas, we performed new experiments in which both PN and MN were fully bleached within the same cells (Fig. 3A and 3C). This approach allowed for a more direct comparison of importin α1 dynamics under equivalent conditions.

      These experiments revealed a markedly slower fluorescence recovery in MN than in PN, indicating reduced nuclear import and/or recycling efficiency of importin α1 in MN. In addition, we retained our original analysis to further characterize the heterogeneous mobility patterns of importin α1 in MN, identifying three distinct mobility classes: high, intermediate, and low (Fig. 3B and 3D). Together, these results support our observation that importin α1 mobility is restricted in MN, likely due to altered nuclear transport dynamics.

      As suggested by the reviewer, we attempted partial bleaching of MN to assess intranuclear mobility. However, owing to the small size of MN, partial bleaching is technically challenging and inconsistent, with some MN recovering even during the bleaching process. Therefore, reliable quantification was not possible. For transparency, these data are provided as a Reviewer-only Figure but were not included in the revised manuscript.

      Finally, while we agree that examining other nuclear transport factors (e.g., RAN, CAS, RCC1) would be informative, our study focused on importin α1 dynamics. We consider these additional factors to be important directions for future investigations.


      Prior studies are referenced appropriately in general, but the authors missed some references (PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889) that I consider key to put the present findings in frame with previous works which link the lack of structural integrity and/or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation.

      We thank the reviewer for carefully pointing out the key references that are highly relevant to framing our findings in the context of previous studies on micronuclear instability, chromothripsis and inflammation. We fully agree with this suggestion.

      In the revised manuscript, we have cited these studies in both the Introduction and Discussion sections. Specifically, we incorporated these studies when discussing the structural fragility of MN, aberrant DNA replication, and the exposure of micronuclear DNA to cytoplasmic sensors, which mechanistically link MN rupture to chromothripsis and cGAS-STING-mediated immune activation. For example, we now refer to the study demonstrating RPA2 recruitment to ruptured MN in a CHMP4B-dependent manner (PMID: 32601372), reports showing defective replication and DNA damage responses in MN (PMID: 32494070; 27918550), and seminal studies establishing cGAS localization to ruptured MN and activation of innate immune signaling (PMID: 28738408; 28759889).

      By incorporating these references, we more clearly position our findings that importin α1 defines a distinct subset of MN lacking access to DNA repair and sensing factors such as RAD51, RPA2, and cGAS. This contextualization emphasizes that our data add to and extend the established view that compromised MN integrity underlies chromothripsis and inflammation by identifying importin α1 as a novel marker of an alternative MN microenvironment. We are grateful for this constructive recommendation, which has allowed us to strengthen the framing of our study in the existing literature.


      The figures presented in the manuscript are clear; however, where plots are included, they require appropriate statistical analysis. It is essential to display p-values on the plots or within their legends to provide readers with information regarding the significance of the results. Including this statistical information will enhance the interpretability of the data and strengthen the overall findings of the study. I recommend that the authors revise these sections accordingly before publication.

      In response, we have revised the relevant figure panels and their legends to clearly display the statistical significance, including p-values, where appropriate. Specifically, we added statistical annotations (p-values or significance markers such as asterisks) directly on the plots or in the corresponding legends, and clarified the number of replicates, statistical tests used, and definitions of error bars (mean ± SD). We believe that these revisions improve the interpretability and transparency of our results and strengthen the overall presentation of the data.

      1.) In lines 134-135, it is stated that "up to 40% of the MN showed importin α1 accumulation under both standard culture conditions and the reversine treatment (Fig. S2F)." However, Figure S2F only displays percentages for reversine-treated cells, and there is no mention in the text or figures regarding the percentage of importin α1-positive MN determined by immunofluorescence (IF) under standard culture conditions. This discrepancy should be addressed.

      Following the reviewer's comments, we revised Supplemental Fig. S2F to show a direct comparison of the proportion of importin α1-positive MN between untreated and reversine-treated HeLa cells based on indirect IF analysis. The Results section was updated accordingly (page 8, Lines 148-150): "We then examined whether reversine treatment affected the proportion of importin α1-positive MN. The results revealed that the MN formation rate for either untreated or treated cells was 36.2% ± 7.8 or 38.3% ± 8.8, respectively, with no significant difference (Fig. S2F)."

      We believe that this revision addresses the reviewer's concern by providing relevant quantitative data for the untreated condition.

      2.) In line 170, the authors state that "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." It is unclear why this exclusion was made. The authors should clarify whether they are referring to all constructs or only to the wild-type (WT) construct when mentioning EGFP-importin α1 localization solely in PN. This clarification is important as it may affect the results highlighted in line 173.

      In this section, we aimed to clarify that the quantitative analysis focused exclusively on cells harboring MN, as the purpose of the analysis was to compare the localization of EGFP-importin α1 between MN and PN. We excluded cells that contained no MN and showed EGFP-importin α1 localization only in the PN. This criterion was consistently applied to both wild-type and mutant constructs. To avoid confusion, we have removed the sentence "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." from the revised manuscript.

      3.) The statement in line 191 ("However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)") is somewhat misleading. While it hints at a technical issue, it does not provide additional relevant information for understanding its implications for the rationale of the research. Moreover, Figure S4 is referenced but appears to refer specifically to panels S4D and E, which are not mentioned in the text. I recommend clarifying this point or removing it altogether.

      We agree with the reviewer that the statement "However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)" was not essential for understanding the rationale of our study and could be misleading. In response, we have removed this sentence from the revised manuscript, along with the corresponding Supplementary Fig. S4.

      4.) Lines 197-199 contain a sentence that could be misleading and would benefit from clearer explanation. Although Figure 3D provides some clarity on this matter, no statistical analysis is included; only a bar plot is presented. A proper statistical analysis should be provided here to enhance understanding.

      In the revised manuscript, we performed one-way ANOVA followed by Holm-Sidak's multiple comparisons test to evaluate the MN localization ratio of EGFP-NES between Imp-α1-negative and Imp-α1-positive MN. This analysis revealed a statistically significant difference (**p

      5.) In lines 218-221, it states that importin α1 associates with euchromatin regions characterized by H3K4me3 and H3K36me3; however, Figure 4D lacks the Spearman's correlation coefficient value for H3K36me3 within the matrix. This omission needs correction.

      We thank the reviewer for this insightful comment. As addressed in response to Major comment #4, we have substantially revised Fig. 5 and added the missing Spearman's correlation coefficient value for H3K36me3 (now shown in Fig. 5E). These revisions, together with the overall improvements to the figure, more clearly illustrate the euchromatin enrichment of importin-α1.
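      For readers unfamiliar with the coefficient mentioned above, the following Python sketch shows how a Spearman's rank correlation can be computed: both variables are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is returned. The function names are illustrative assumptions, and this is not the code used to generate the correlation matrix in Fig. 5E.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant, because the
    coefficient is undefined in that case.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

      Because the coefficient depends only on ranks, it captures monotonic association between two signal tracks without assuming a linear relationship between them.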

      6.) For consistency in the experimental design aimed at identifying potential importin α1-interacting proteins, it would be more appropriate for Figures 5C/D to show IF data from MCF7 cells rather than HeLa cells.

      We sincerely apologize for the misstatements in the legends of the original Fig. 5C. The correct description is that this experiment was performed using MCF7 cells, and we have revised the legend accordingly in the revised manuscript (now Fig. 6C). In addition, because the original data in Fig. 5D were obtained from HeLa cells, we repeated this experiment using MCF7 cells and replaced the panel with new data (now Fig. 6D).

      7.) To substantiate claims that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should include thorough quantitation and statistical analysis based on at least three independent experiments.

      As described above, we addressed this point by adding a new quantification and statistical analysis in Fig. 7F, based on six microscopy fields across three independent experiments. This analysis directly supports our claim that importin α1 inhibits RAD51 accessibility in the MN.

      We would also like to clarify that although the reviewer referred to Figs. 7D and 7E, these two panels were designed to illustrate the same phenomenon (the mutually exclusive localization of importin α1 and RAD51 to distinct MN) in different contexts. Specifically, Fig. 7D presents examples from separate cells, each with MN containing either importin α1 or RAD51, while Fig. 7E shows a single cell containing two distinct MN, one enriched with importin α1 and the other with RAD51. Because both panels serve as illustrative examples of the same phenomenon, it would not be meaningful to quantify them independently as parallel datasets. Instead, we integrated the statistical analysis into a unified graph (Fig. 7F), which summarizes the frequency of RAD51-positive MN in relation to importin α1 status across the cell population, thereby supporting our interpretation that importin α1-positive MN represent a distinct subset that is less accessible to RAD51.

      8.) The meaning of lines 336-338 ("Therefore, the enrichment of importin α1 in MN, along with its interaction with chromatin, may regulate the accessibility of RAD51 to DNA/chromatin fibers in MN and protect its activity") is unclear. I suggest rephrasing this sentence for improved clarity and comprehension.

      We appreciate the reviewer's comment regarding the clarity of our statement in the Discussion (former lines 336-338). We agree that the original phrasing is ambiguous. To improve clarity and align with our results, we revised this section to emphasize that importin α1-positive MN represent a restricted environment from which DNA repair and sensing factors are excluded. Specifically, RAD51, RPA2, and cGAS showed mutually exclusive localization with importin α1, indicating that these MN are largely inaccessible to DNA-binding proteins (pages 20-21). This rephrasing removes the unclear phrase "protect its activity" and directly reflects our experimental findings, presenting a clearer interpretation that is consistent with the Results.

      9.) Fig. 1D: Numbers on the y-axis are missing, x-axis labeling is too small

      We appreciate the reviewer's careful examination of the figure. In the revised manuscript, we added numerical tick labels to both the x- and y-axes and increased the label font size to ensure clear readability, as shown in Fig. 1D. We also applied the same improvements to other fluorescence intensity plots, including Figs. 4A, 4B, 5A, 5B, 7H, and Supplemental Fig. S4C and S5A-S5F to ensure consistency in readability across the manuscript. We thank the reviewer for helping us improve the clarity and accuracy of our figure presentations.

      10.) Fig. 1F: As the PN/MN values of the three experiments are seemingly identical (third column) the distribution of the three individual data of the PN (first column) should mirror the distribution of the three individual data of the MN (second column). The authors might want to check why this is not the case.

      Upon re-examination of the source data, we identified and corrected a minor calculation error in one subset and regenerated the panel. After correction, the three independent PN/MN ratios were 3.1%, 2.9%, and 2.6%, rather than being identical. These corrected values were proportional to the corresponding PN and MN measurements and preserved the expected relationship between their distributions. Although the numerical differences were small, they demonstrated high reproducibility across independent experiments. These corrections do not alter the interpretation of Fig. 1F, and the distribution of PN/MN values is now consistent with the paired PN and MN data presented in the revised manuscript.

      Significance

      Micronuclei (MN) primarily arise from defects in mitotic progression and chromatin segregation, often associated with chromatin bridges and/or lagging chromosomes. MN frequently exhibit DNA replication defects and possess a rupture-prone nuclear envelope, which has been linked to genomic instability. The nuclear envelope of MN is notably deficient in crucial factors such as lamin B and nuclear pore complexes (NPCs). This deficiency may be attributed to the influence of microtubules and the gradient of Aurora B activity at the mitotic midzone, which inhibits the recruitment of proper nuclear envelope components. Additionally, several other factors may contribute to this process: for instance, PLK1 controls the assembly of NPC components onto lagging chromosomes; chromosome size and gene density positively correlate with the membrane stability of MN; and abnormal accumulation of the ESCRT complex on MN exacerbates DNA damage within these structures, triggering pro-inflammatory pathways.

      The work presented by Dr. Miyamoto and colleagues reveals the abnormal behavior of importin α1 in MN during interphase. According to their findings, it is reasonable to consider importin α1 as a molecular marker for characterizing MN dynamics. Furthermore, it could serve as a potential clinical marker if the authors provide additional experiments demonstrating significantly different localization patterns of importin α1 in transformed cells (e.g., MCF7, HeLa, MDA-MB-231) compared to non-transformed cells (e.g., MCF10A).

      While the authors present some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Moreover, according to the figure legends, data for both figures originate from a single experiment. As such, convincing evidence linking the aberrant behavior of importin α1 in MN with chromothripsis processes or regulation of the cGAS-STING pathway, and its implications for genomic instability in cancer cells, remains lacking.

      Overall, it is not entirely clear what significance this advance holds for the field; while there are conceptual contributions made by this work, they do not appear sufficiently robust at this time. Further research is needed to clarify these connections and strengthen their conclusions regarding importin α1's role in MN dynamics and genomic instability.

      We sincerely appreciate the reviewer's thoughtful and constructive evaluation of the significance of our study. We agree that in the original submission, the conceptual contribution was not fully supported by sufficient evidence. In the revised manuscript, we have substantially strengthened our findings by incorporating new data on RPA2 and cGAS, in addition to RAD51. These results consistently show that importin α1-positive MN are largely inaccessible to multiple DNA-recognizing proteins, including DNA repair factors (RAD51 and RPA2) and the innate immune sensor cGAS, whereas importin α1-negative MN readily recruit these proteins. This broader dataset reinforces the concept that importin α defines a distinct and restricted MN subset, extending beyond our initial observation of RAD51 exclusion.

      By framing importin α as a molecular marker that discriminates between functionally distinct MN environments, our study conceptually advances the understanding of MN heterogeneity. This adds to the prior literature showing that defective nuclear envelope integrity underlies chromothripsis and cGAS-STING activation and positions importin α as a new marker for identifying MN that are refractory to these DNA repair and sensing pathways. While we agree that further work is necessary to directly link importin α enrichment to downstream genomic instability or inflammation in cancer, we believe that our revised data now provide a robust foundation for future investigations.

      Taken together, the revised manuscript presents a clearer and more comprehensive conceptual advance: importin α-positive MN represent a previously unrecognized molecular environment distinct from MN characterized by canonical DNA repair or sensing factors. We are grateful to the reviewer, whose constructive comments greatly improved the clarity, robustness, and overall impact of our study. We believe that these findings will be of particular interest to researchers studying the mechanisms of genomic instability, chromothripsis, and cancer biology.


      Reviewer #2

      Summary:

      The authors have shown that Importin α1, a nuclear transport factor, is enriched in subsets of micronuclei (MN) of cancer cells (MCF7 and HeLa) and, using FRAP, has altered dynamics in MN. Moreover, the authors have shown that these levels of Importin α1 in the MN are likely not due to its traditional role for signal-dependent protein transport, as suggested by immunofluorescence of other factors important for this function. Additionally, cargo dynamics carrying NLS or NES signals were disrupted in Importin α1-positive micronuclei. Importin α1-positive micronuclei also appear to have a disrupted nuclear envelope, potentially explaining some of these cargo disruptions. The authors also demonstrated that Importin α colocalizes with proteins important for DNA replication, and p53 signaling using RIME, followed by immunofluorescence. Lastly, the authors show that Importin α and RAD51 have mutual exclusivity in the micronuclei.

      Major comments:

      1) A key issue is that there are very few statistical tests used in this study. It is crucial to the interpretation of the data. We strongly urge the authors to re-analyze the data using appropriate statistical analyses. Along those lines, in many figures 1 or 2 images are shown without stating how many biological or technical replicates this is representative of or showing quantification of the analyses. In general, the authors' statements would be strengthened by showing more examples and/or stating "N" in the figure legends or supplement.

      We sincerely thank the reviewer for emphasizing the importance of including sufficient statistical analyses and replication information. As noted in our response to Reviewer #1, we have carefully revised the manuscript to enhance statistical rigor and transparency throughout. Specifically, we expanded the Statistical Analysis section in the Materials and Methods section to provide a clear description of the statistical approaches used. In addition, all figure legends have been revised to explicitly state the number of biological replicates, sample sizes, statistical tests applied, and corresponding p-values or significance indicators. Representative images are consistently accompanied by quantitative analyses derived from multiple independent experiments.

      We believe that these comprehensive revisions directly address the reviewer's concerns and substantially improve the rigor, clarity, and interpretability of our manuscript.

      2) Using RIME and immunofluorescence, the authors identify factors that co-localize with Importin α1 in subsets of micronuclei (Figure 5), which is interesting, but there is no functional data associated with this result. Are the authors stating that these differences account for altered DNA damage or replication? It is unclear what the conclusion is beyond "some MN are different than others." Could the authors knockdown/knockout these factors to determine if they recruit Importin α1 into MN or the reciprocal? For many of these factors, they appear to be broadly present throughout the entire primary nucleus as well, indicating there is nothing unique about their MN localization.

      We agree that our original RIME and indirect IF analyses were primarily descriptive and lacked functional validation. To strengthen this aspect, we added new IF and quantification data (now presented in Fig. 8) showing that importin α1-positive MN are largely mutually exclusive with DNA repair and sensing factors such as RAD51, RPA2, and cGAS, whereas importin α1 frequently co-localizes with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings indicate that importin α1-positive MN define a distinct molecular environment enriched in replication- and chromatin-associated regulators but inaccessible to canonical DNA repair and sensing proteins.

      This combination of mutual exclusivity with DNA repair/sensing factors and frequent co-localization with chromatin regulators underscores the biological significance of importin α1 localization in MN, as it may contribute to localized chromatin stabilization through association with chromatin regulators while simultaneously restricting access to DNA repair and sensing factors. Thus, importin α1-positive MN represent a restricted subset with potential implications for genome stability and immune signaling, going beyond the descriptive notion that "some MN are different than others."

      Moreover, many chromatin regulators identified by RIME contain classical nuclear localization signals (NLSs), raising the possibility that importin α1 interacts with these proteins via their NLS sequences. We fully agree with the reviewer that knockdown or knockout experiments would be highly valuable to clarify whether such interactions actively recruit importin α1 into MN or occur reciprocally, and we regard this as an important direction for future investigations.

      3) In line 274, the authors state that MN highly enriched for Importin α1 inhibits RAD51 accessibility but this is an overstatement of the data. Instead, the authors show that RAD51 and importin α1 do not colocalize in micronuclei, albeit without quantification which weakens their argument. Also, the consequence of this "mutual exclusivity" is unclear. Can the authors inhibit or knockdown Importin α1 and show that RAD51 goes to all micronuclei? And how is this different than the data shown for factors in Figure 5? Some of those show colocalization with Importin α1-positive micronuclei and others do not. Could you perform live imaging of labeled Importin α1 and RAD51 and show that as Importin α1 accumulates in MN that RAD51 or other DNA repair factors are exported? An alternative experiment would be to show that the C-mutant, which is defective in nuclear export, now colocalizes with RAD51 in MN. Please reconcile this or show experiments to prove the statement above.

      We agree that our original wording "inhibits RAD51 accessibility" was not sufficiently supported by direct evidence, as it was based solely on the immunofluorescence data. Therefore, we have removed this statement from the Results section of the revised manuscript. To strengthen this point, we added a quantitative analysis (Fig. 7F) showing that RAD51 signals were significantly reduced in importin α1-enriched MN.

      Regarding the suggestion to perform knockdown experiments, we note that the depletion of KPNA2 (gene name of importin α1) has been reported to cause severe cell-cycle arrest (Martinez-Olivera et al, 2018; Wang et al, 2012). Consistent with these reports, we also found that siRNA-mediated knockdown of KPNA2 in our system strongly reduced MN induction upon reversine treatment, making it technically unfeasible to analyze RAD51 localization under these conditions. We also sincerely thank the reviewer for suggesting the live imaging experiments. We fully agree that such experiments would provide valuable mechanistic insights, and we regard this as an important direction for future research.

      In addition, to address the reviewer's concern about other DNA repair factors, we added new data (Fig. 8) showing that importin α1-positive MN are mutually exclusive with RPA2 and cGAS. RPA2 is a canonical single-strand DNA (ssDNA)-binding protein that stabilizes exposed ssDNA and facilitates RAD51 recruitment. It has been reported to accumulate in ruptured MN in a CHMP4B-dependent manner (Vietri et al, 2020). cGAS is a cytosolic DNA sensor that detects ruptured MN and activates innate immune signaling via the cGAS-STING pathway. Together with our RAD51 results, these data show that importin α1-positive MN are consistently segregated from multiple DNA-recognizing factors, including RAD51. Simultaneously, importin α1 co-localizes with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings support the view that importin α1-positive MN define a distinct molecular environment enriched in chromatin regulators but largely inaccessible to DNA repair and sensing factors. While the precise mechanism remains unclear, one possibility is that importin α1-associated chromatin interactions limit the access of DNA repair and sensing proteins. However, this interpretation is speculative and requires further investigation.

      4) In the Discussion, lines 343-344 state that "importin α1 is uniquely distributed and alters the nuclear/chromatin status when enriched in MN," however this is not currently supported by the present data. The data presented show a correlation (albeit weak) between euchromatic modifications and Importin α1, but they do not definitively show that importin α1 is sufficient to alter the nuclear-chromatin status when enriched in the MN. More substantial experiments would be required to show whether Importin α1 plays an active role in these modifications.

      Following the reviewer's suggestion, in the revised manuscript, we removed this overstatement and rephrased the relevant sections of the Discussion. Rather than implying a causal role, we now describe the mutually exclusive localization of importin α1 with DNA repair and sensing factors (RAD51, RPA2, and cGAS), emphasize its preferential association with euchromatin regions marked by H3K4me3, and note its frequent co-localization with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings suggest that importin α1-positive MN define a distinct subset characterized by limited accessibility to DNA repair and sensing proteins, whereas cGAS-positive ruptured MN exemplify a state in which these proteins can accumulate.

      We also added a concluding statement that frames importin α1 as defining a previously unrecognized MN subset that is distinct from conventional ruptured MN. This revision provides a more accurate and appropriately cautious interpretation of our data while underscoring the conceptual advance of our study by clarifying how importin α1 localization reveals MN heterogeneity.

      Minor Comments

      1) Summary statement (page 3 Line 40): The use of "their" is confusing. Whose microenvironment are you referring to?

      We have rephrased the sentence as follows: The accumulation of importin α in micronuclei, followed by modulation of the microenvironment of the micronuclei, suggests the non-canonical function of importin α in genomic instability and cancer development. Thank you for this useful suggestion.

      2) In Abstract and introduction (page 4, Line 44 and page 5, line 59) it states that MN are membrane enclosed structures, but this is not always the case (see https://doi.org/10.1038/nature23449 as one example).

      While MN are typically surrounded by a nuclear envelope at the time of their formation during mitosis, we agree that this envelope can later rupture or fail to assemble completely, thereby exposing micronuclear DNA to the cytoplasm. To clarify this point, we revised the Introduction to explicitly acknowledge that MN may lose nuclear envelope integrity, which can have important consequences for genomic instability and immune activation and inflammation. Specifically, we have added the following sentence to the Introduction (page 4, lines 77-80): "The nuclear envelope of MN can be partially or completely disrupted, allowing cytoplasmic DNA sensors, such as cyclic GMP-AMP synthase (cGAS), to access micronuclear DNA and trigger innate immune responses via the cGAS-STING pathway (Harding et al, 2017; Li & Chen, 2018; Mackenzie et al, 2017)."

      We hope this addition appropriately addresses the concerns raised by Reviewer #2 while incorporating the valuable suggestions from Reviewer #1 without altering the overall structure and flow of the manuscript.

      3) Given the fact that the RIME result identified proteins involved in DNA replication to be enriched with Importin α1, are these MN enriched in factors described in Fig. 5 simply localizing to MN that are in S phase, as described previously (doi: 10.1038/nature10802)?

      We sincerely thank the reviewer for raising this constructive perspective regarding the potential relationship between importin α1 enrichment in micronuclei (MN) and the S phase. Our RIME analysis identified chromatin-associated proteins, such as PARP1 and SUPT16H/FACT, which are often activated during replication stress and frequently function in the S phase. However, importin α1-positive MN were not exclusively associated with S-phase-specific molecules, and our data do not indicate that these MN are restricted to the S phase.

      Previous studies (e.g., Crasta et al, 2012) have established that MN are prone to replication defects and represent hotspots of genomic instability. The recovery of replication stress-responsive molecules, such as PARP1 and FACT, by RIME is therefore consistent with the biology of MN. Based on this valuable suggestion, we have revised the Discussion (page 19) to explicitly mention the potential involvement of replication-related proteins in importin α1-positive MN, as well as the possibility that importin α1 accumulation may contribute to replication defects in these structures.

      We are grateful to the reviewer for this important comment, which has allowed us to place our findings in a broader mechanistic context and outline directions for future research, including testing the relationship between importin α1-positive MN and established S-phase markers such as PCNA.

      4) The FRAP data is not very compelling. While it is clear there are differences between the PN and MN dynamics, what is driving these differences? Are these differences meaningful to the biology of the MN or PN? It is unclear what this data is contributing to the conclusions of the paper. Also, if the mobility of the MN is plotted on the same graph as the PN, the differences in MN mobility might not look as compelling.

      We respectfully emphasize that FRAP analysis is a key component of our study, as it provides important insights into the distinct dynamics of importin α1 in MN compared to PN.

      In the revised manuscript, we included new experiments (now shown in Fig. 3A and 3C) that directly compare the recovery kinetics of importin α1 in PN and MN in the same cells. By plotting the PN and MN recovery curves side by side, we aimed to improve clarity and provide a direct visualization of the pronounced differences in importin α1 dynamics between these compartments.

      Our FRAP results showed that importin α1 accumulated in both PN and MN but exhibited markedly reduced mobility in MN. These findings suggest that, unlike in the PN, canonical nucleocytoplasmic recycling of importin α1 is impaired in MN. Furthermore, the reduced mobility indicates that importin α1 is stably associated with chromatin or chromatin-associated factors in MN, consistent with our additional biochemical and imaging data showing preferential association with euchromatin (e.g., H3K4me3) and chromatin regulators.

      Taken together, the FRAP data provide functional evidence that complements our structural and molecular analyses, supporting our central conclusion that importin α1 accumulation in MN defines a restricted chromatin environment that influences the accessibility of DNA repair and sensing factors.

      5) In Results (line 117), you state that "the cytoplasm of those cell lines emitted quite strong signals" for Importin α1, but that phrasing is a little confusing. Yes, Importin α1 is present in the cytoplasm in most cells, but it appears you are referring to the enrichment in MN. I would recommend re-phrasing this statement to make your intent clearer.

      As the reviewer rightly noted, the original phrasing, "the cytoplasm of those cell lines emitted quite strong signals," was misleading, as it could suggest a broad cytoplasmic distribution of importin α1. Our observations showed that importin α1 accumulated specifically in MN located within the cytoplasm, but not in the surrounding cytoplasm itself. To clarify this, we revised the Results section (page 7, lines 125-127) to read: "Next, we performed indirect immunofluorescence (IF) analysis on human cancer cell lines, including MCF7 and HeLa cells. Notably, we found that importin α1 accumulated prominently in MN located within the cytoplasm (MCF7 cells, Fig. 1B; HeLa cells, Fig. 1C; yellow arrowhead)."

      We believe that this revised wording more accurately reflects our findings and addresses the reviewer's concerns.

      6) In Results (line 135, Figure S2E,F), the ratio of high, low or no Importin α1 intensity is confusing. Is this percentage relative to the total number of MN? It is unclear what is meant by "whole number" of MN. Is Importin α1 intensity quantified or is it subjective?

      We apologize for the confusing terminology used in the original manuscript for Supplemental Fig. S2 and thank the reviewer for pointing it out. Although the reviewer did not specifically comment on the classification of importin α1 signal intensity as "high" or "low," we recognized that this approach relied on subjective visual assessment and lacked clearly defined thresholds. To improve clarity and objectivity, we have removed this classification and now analyze importin α1 localization in MN as simply positive or negative (revised Supplemental Fig. S2E). The previous graph (original Fig. S2F) was deleted. In addition, the frequency of Importin α1-positive MN has been reported in the Results section of the main text (page 8). We believe that these revisions have improved the clarity and reproducibility of our data presentation.

      7) Figure 2C is confusing. Are you counting MN with co-localization of Importin α1 and these factors? Please clarify.

      Figure 2C shows the percentage of importin α1-positive MN that displayed localization of importin β1, CAS, or Ran based on IF analysis. In other words, it represents the co-localization rates of these transport factors specifically within the subset of MN positive for importin α1. To improve clarity, we revised the y-axis label in Fig. 2C to "Localization in Impα1-positive MN (%)" and modified the figure legend accordingly. We have clarified this point in the Results section (page 9). We believe that these revisions resolve the confusion and clarify the scope of the analysis.

      8) Figure S3D quantification is very confusing and unclear. Also, how is this normalized? Are you controlling for total signal in each cell? And can the results of this experiment give you any mechanistic insight as to what is regulating MN localization beyond the interpretation of "MN localization is distinct from PN localization"? The "C-mutant" appears quite a bit different than the others. What might that indicate about the role of CAS/CSE1L in MN enrichment?

      We apologize for the confusion caused by the quantification in the Supplemental Fig. S3D (now revised as Fig. S4D). This figure shows the relative enrichment of EGFP-importin α1 in MN compared with that in PN for wild-type and mutant constructs. To control for nuclear size, fluorescence intensity was measured using a fixed circular ROI (1.5-2.0 µm in diameter) placed in both the MN and PN of the same cell, and MN/PN intensity ratios were directly plotted for individual cells (n = 8 per condition). This procedure is described in detail in the Results section (page 10).

      Regarding the C-mutant, the reduced MN/PN ratio primarily reflects increased importin α1 accumulation in the PN rather than a reduced retention in the MN. As discussed in the revised manuscript (page 18), this suggests that CAS/CSE1L-mediated nuclear export is active in the PN but may be impaired or uncoupled in the MN, possibly due to differences in nuclear envelope integrity or chromatin context. We believe that this clarification addresses the reviewer's concerns and highlights the mechanistic implications of the C-mutant phenotype.
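The fixed-ROI MN/PN ratio described above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the authors' analysis pipeline: the image is a synthetic pixel grid, the ROI radius is given in pixels rather than micrometers, and the function names (`circular_roi_mean`, `mn_pn_ratio`, `make_demo_image`) and all intensity values are hypothetical. It also shows the arithmetic behind the C-mutant interpretation: holding the MN signal constant while raising the PN signal lowers the ratio.

```python
def circular_roi_mean(image, center_row, center_col, radius_px):
    """Mean intensity of all pixels within radius_px of the ROI center."""
    total, count = 0.0, 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_px ** 2:
                total += value
                count += 1
    if count == 0:
        raise ValueError("ROI contains no pixels")
    return total / count


def mn_pn_ratio(image, mn_center, pn_center, radius_px):
    """Ratio of mean MN intensity to mean PN intensity, same ROI size for both."""
    mn_mean = circular_roi_mean(image, mn_center[0], mn_center[1], radius_px)
    pn_mean = circular_roi_mean(image, pn_center[0], pn_center[1], radius_px)
    return mn_mean / pn_mean


def make_demo_image(pn_value, mn_value, background=10.0):
    """Synthetic 20x20 image: PN block at rows/cols 2-8, MN block at 12-18."""
    image = [[background] * 20 for _ in range(20)]
    for r in range(2, 9):
        for c in range(2, 9):
            image[r][c] = pn_value
    for r in range(12, 19):
        for c in range(12, 19):
            image[r][c] = mn_value
    return image


# Hypothetical intensities (arbitrary units), not measured values.
PN_CENTER, MN_CENTER, RADIUS = (5, 5), (15, 15), 2
baseline = mn_pn_ratio(make_demo_image(100.0, 200.0), MN_CENTER, PN_CENTER, RADIUS)
pn_raised = mn_pn_ratio(make_demo_image(400.0, 200.0), MN_CENTER, PN_CENTER, RADIUS)
print(baseline, pn_raised)
```

Because the same ROI size is used in both compartments, the ratio does not depend on how large the MN or PN is, only on the local mean intensity. A lower ratio on its own therefore cannot say which compartment changed, which is why the two mean intensities need to be inspected separately.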

      9) For Figures 3A,B and S4, are these images of single z-slices or projections? It would be helpful to clarify for your interpretations as to whether they are truly partial or diffuse or the membrane is in another z-plane. Also, how is the localization of Importin α1 different from or similar to that of other factors that localize to MN with a compromised nuclear envelope, such as cGAS? If it is based on epigenetic marks, it should be different than cGAS, which primarily binds non-chromatinized DNA.

      We thank the reviewer for this valuable suggestion. All images shown in Figs 3A, 3B, and S4 in the original manuscript (now revised as Fig. 4A and 4B, with the original Fig. S4 omitted) were derived from single optical sections rather than projections. We would like to emphasize that similar discontinuities in signals for lamin proteins (including laminB1 and laminA/C) were consistently observed across multiple cells and independent experiments, indicating that these observations are not due to an artifact of image acquisition or a missing z-plane, but rather reflect a genuine partial loss of the MN membrane.

      In contrast to cGAS, which predominantly binds non-chromatinized DNA in ruptured MN, our data indicate that importin α1 preferentially localizes to MN regions enriched in euchromatin-associated histone modifications, such as H3K4me3. The new data presented in Fig. 8 further strengthen this point by directly comparing importin α1 with DNA-recognizing proteins such as cGAS and RPA2, which preferentially localize to MN lacking importin α1. Together, these results highlight that importin α1-positive MN constitute a distinct subset characterized by chromatin-associated localization and reduced accessibility to DNA repair and sensing proteins.

      10) In Results, it is unclear how Fig. 7B was calculated. Are the authors qualitatively assessing if RAD51 is there or looking for MN enrichment relative to PN? Additionally, in Fig. 7C, RAD51 localization is diffuse. It should be enriched in foci. I would recommend the authors repeat this experiment using pre-extraction then quantify RAD51 foci number and/or intensity.

      For the quantification shown in Fig. 7B of the original manuscript, we acquired images containing approximately 15-50 cells per condition and counted all the micronuclei (MN) in those fields. The percentage of RAD51-positive MN relative to the total MN was calculated. In the revised manuscript, we further refined this analysis by classifying RAD51-positive MN into two categories based on signal intensity: weak (Cell #1 type) and strong (Cell #2 type). For each condition, nine independent fields were analyzed (302 MN in untreated cells and 213 MN in etoposide-treated cells). This quantification revealed that etoposide treatment preferentially increased the proportion of MN with strong RAD51 accumulation (Fig. 7C, right panels), indicating enhanced DNA damage in MN. Thus, our analysis was quantitative rather than qualitative, based on systematic counting across multiple fields.

      Regarding the reviewer's suggestion of pre-extraction, we believe that this approach is technically difficult because MN are structurally fragile. Importantly, in the subset of MN with strong RAD51 accumulation, RAD51 was clearly present in foci rather than diffuse signals, as shown in the high-magnification images (Fig. 7E).

      Finally, in response to Reviewer #1, we performed a new quantitative analysis (Fig. 7F) focusing on the frequency of strongly RAD51-positive MN in relation to importin α1 status. This analysis confirmed the mutually exclusive relationship between RAD51 and importin α1 in MN and further strengthened our conclusions.
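The counting-based quantification described above (class percentages pooled over fields, then a comparison of RAD51 status against importin α1 status) can be sketched as follows. This is an illustration only: the response does not state which statistical test was applied, so the use of Fisher's exact test here is an assumption, and every label and count in the example is hypothetical rather than taken from the manuscript. The helper names `class_percentages` and `fisher_exact_two_sided` are likewise made up for this sketch.

```python
from collections import Counter
from math import comb


def class_percentages(labels, classes=("negative", "weak", "strong")):
    """Percentage of MN in each class, pooled over all counted fields."""
    if not labels:
        raise ValueError("no MN were counted")
    counts = Counter(labels)
    return {name: 100.0 * counts[name] / len(labels) for name in classes}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        p for p in (table_prob(x) for x in range(low, high + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical labels for ten MN; not the counts reported in the manuscript.
labels = ["negative"] * 6 + ["weak"] * 3 + ["strong"] * 1
percentages = class_percentages(labels)

# Hypothetical 2x2 table: rows = importin alpha1 positive / negative MN,
# columns = strongly RAD51-positive / not strongly RAD51-positive.
p_value = fisher_exact_two_sided(0, 3, 3, 0)
print(percentages, p_value)
```

An exact test is used in the sketch because a mutually exclusive pattern produces cells with very small or zero counts, where approximations based on expected frequencies become unreliable.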

      11) In line 264, "notably" is misspelled.

      Thank you for pointing this out. We have corrected the spelling.

      12) In line 303, "scenarios" should be changed to the singular form.

      Thank you for this correction. We have changed this to "scenario".

      13) In Figure legend, line 571-582, H3K27me3 is shown in Figure 4D, but the written legend does not mention this mark.

      We have added the marks in the legend for Fig. 5E.


      Significance: Overall, this paper shows compelling evidence for micronuclear localization of regulators of nuclear export, notably Importin α1. Of note, this occurs in subsets of MN that lack an intact nuclear envelope. And while it has been appreciated that compromised micronuclear envelopes lead to genomic instability, this is one of the first studies to demonstrate that alteration in the nuclear envelope may disrupt import or export of nuclear proteins into micronuclei.

      A limitation of the study is that much of the work is based on immunofluorescence and lacks mechanism. While there is much correlative data showing that Importin α1 localizes to micronuclei with compromised envelopes, it is unclear whether Importin α1 drives micronuclear collapse or is downstream of this process. Additionally, Importin α1 micronuclear localization anti-correlates with RAD51 but does colocalize with other DNA replication factors, yet it is unclear whether their localization is dependent on Importin α1 or its role in nuclear export. Currently, the audience for this manuscript would be limited to those interested in micronuclei. If these concerns about an active role for Importin α1 in micronuclear export are resolved, it would greatly increase the impact of this manuscript for those interested more broadly in genomic instability, DNA repair, and cancer.

      We thank the reviewer for positively evaluating our study and highlighting the importance of defining the biological significance of our findings. In the revised manuscript, we incorporated new data (Fig. 8) demonstrating that importin α1-positive MN are mutually exclusive not only with RAD51 but also with RPA2 and cGAS. These results clearly establish importin α1-positive MN as a distinct subset, defined by the enrichment of chromatin-associated proteins, while being largely inaccessible to canonical DNA repair and DNA-sensing factors.

      Consistent with this, our FRAP experiments and analysis of the CAS/CSE1L-binding mutant (C-mut) further indicated that the recycling dynamics of importin α1 were altered in MN compared to PN. In addition, importin α1 was enriched in lamin-deficient areas of MN, where electron microscopy revealed a fragile nuclear envelope morphology. Together with prior evidence, discussed in the revised manuscript, that recombinant importin α can inhibit nuclear envelope assembly in Xenopus egg extracts (Hachet et al, 2004), these findings raise the possibility that high local concentrations of importin α1 may actively contribute to impaired nuclear envelope formation or stability in MN.

      Such a distinct MN state may have important biological consequences. By limiting the access of DNA repair and DNA-sensing proteins, importin α1 accumulation may influence chromothripsis and immune activation, which, in turn, could play a role in tumor progression and genome instability. We believe that the identification of importin α1 as a marker defining such a restricted MN environment represents a conceptual advance that extends the relevance of our study beyond the MN field to the broader areas of genome instability, DNA repair, and cancer biology. We are grateful to the reviewer for encouraging us to strengthen the framing of our work, which has helped us clarify the novelty and impact of our findings.

      Reviewer #3

      Summary:

      This study reports that importin alpha isoforms enrich strongly in a subset of micronuclei in cancer cells and uses mutagenesis and immunostaining to define how this localization relates to importin alpha's nuclear transport function. This enrichment occurs even though importin-alpha-positive micronuclei also contain Ran and the importin alpha export factor CSE1L, indicating that importin alpha enrichment is not simply a consequence of the absence of components of the nuclear transport machinery that control its localization. Mutagenesis of importin alpha indicates that Mn enrichment persists even when the importin beta binding and NLS binding capacities of importin alpha are impaired. Potential importin alpha interacting proteins are identified by proteomics, although the relationship of these potential binding partners to micronucleus localization is unclear.


      1. In Figure S3, the authors show that mutagenesis of importin alpha's CSE1L binding domain decreases the ratiometric enrichment in Mn vs. Pn. However, is this effect occurring because the CSE1L binding mutant decreases Mn enrichment, or increases Pn enrichment? It seems that the latter is possible based on the images shown. If the Pn specifically becomes brighter on average in cells expressing the C-mut, while Mn remain similar in fluorescence intensity, that might suggest that CSE1L has less of an effect on importin alpha export in Mn compared to Pn.

      We appreciate the reviewer's insightful observations. In the revised analysis (now presented in Supplemental Fig. S4D), we quantified EGFP-importin α1 intensities in both PN and MN using fixed circular regions of interest. This revealed that the reduced MN/PN ratio observed in the CSE1L-binding mutant (C-mut) was mainly due to an increase in the PN signal rather than a decrease in the MN signal. These results are consistent with the reviewer's suggestion and indicate that CSE1L-mediated nuclear export is functional in PN but has a limited impact on MN.

      Importantly, this interpretation is supported by our FRAP experiments (Fig. 3), which show that importin α1 recycles normally in the PN but exhibits markedly reduced mobility in the MN. Together with our proteomic and colocalization analyses (Fig. 6), which identified importin α1 association with chromatin regulators such as PARP1 and SUPT16H/FACT, these findings suggest that importin α1 accumulates in MN not only because the recycling machinery is uncoupled but also because it forms stable interactions with chromatin-associated proteins. As discussed in the revised manuscript, this dual mechanism provides a plausible explanation for the persistent retention of importin α1 in MN and its role in defining a distinct MN environment.

      2. It is unclear from the text or the methods whether RIME identification of importin-alpha binding partners is performed in reversine-treated cells, which would increase the proportion of importin alpha in Mn, or in untreated cells. In either case, it seems likely that the majority of interactors identified would be cargoes that rely on importin alpha for import into the Pn. The rationale for linking these potential interactions to the Mn is unclear. While some of these factors are indeed shown enriched in Mn in Figure 5, the significance of this is also unclear. These points should be clarified.

      We thank the reviewer for raising this important point. The RIME assay was performed using whole-cell extracts from untreated wild-type MCF7 cells, which primarily identified importin α1-associated nuclear cargo proteins. To assess their potential relevance to MN, we screened the RIME candidates using immunofluorescence data provided by the Human Protein Atlas database and experimentally validated those showing clear MN localization by colocalization with importin α1. This two-step approach enabled us to highlight importin α1 interactors that are functionally relevant to MN biology rather than general nuclear cargoes.

      In response to the reviewer's concerns, we revised the Results section to clarify this rationale. Specifically, we added the explanation that "As importin α1 interactors are typically nuclear proteins, it is plausible that they reside not only in the primary nucleus but also in the MN. To test this possibility, we screened the identified candidates for MN localization using immunofluorescence images provided by the Human Protein Atlas (HPA) database (Pontén et al, 2008; Thul et al, 2017)." (page 14, lines 294-297).

      This is consistent with the idea that a wide range of nuclear proteins carrying NLS motifs can recruit importin α1 into the micronuclei, where they reside. This protein-driven enrichment of importin α1 may create a restricted microenvironment in which canonical DNA repair and sensing proteins, including RAD51, RPA2, and cGAS, are excluded, thereby defining a distinct subset of micronuclei with limited genome surveillance capacity.

      3. In Figure 6, the authors perform FRAP of importin alpha in Mn and show that it recovers much more slowly in Mn than in Pn. However, it appears from the images shown that the entire Mn was photobleached in each FRAP experiment. It thus is unclear whether the slow FRAP recovery is limited by slow diffusion of importin alpha within Mn/on Mn chromatin or impaired trafficking of importin alpha into and out of Mn. These distinct outcomes have distinct implications: either importin alpha is immobilized on Mn (eu)chromatin, or alternatively importin alpha is poorly transported into / out of Mn. This ambiguity could be resolved by bleaching a portion of a Mn and testing whether importin alpha diffuses within a single Mn.

      We thank the reviewer for this insightful comment regarding the interpretation of FRAP data. As the reviewer rightly pointed out, the original FRAP design, in which the entire MN was photobleached, does not allow for a clear discrimination between the intranuclear immobilization of importin α1 and impaired trafficking into or out of the MN.

      In line with a similar suggestion from Reviewer #1, we attempted partial photobleaching of MN to evaluate whether importin α1 can diffuse within MN independently of nucleocytoplasmic transport. However, due to the small size of MN, precise targeting is technically challenging and recovery is often unreliable, with some MN even exhibiting partial recovery during the bleaching process itself. These data were not included in the revised figures; however, we provide representative examples as reviewer-only figures to illustrate these technical limitations.

      To further clarify the nuclear transport dynamics of importin α1, we redesigned our FRAP experiments to fully photobleach both the PN and MN within the same cells under identical conditions. These results, presented in revised Fig. 3A and 3C, demonstrate a markedly slower recovery of importin α1 in MN compared to PN, strongly suggesting that nucleocytoplasmic recycling of importin α1 is impaired in MN. Moreover, the reduced mobility of importin α1 in the MN is consistent with stable chromatin binding, limiting its ability to diffuse freely within the nuclear space.

      We believe that this additional analysis, prompted by the reviewer's comment, significantly strengthens the mechanistic interpretation of our FRAP data.
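The PN versus MN recovery comparison discussed above can be made concrete with a small sketch of how FRAP traces are commonly summarized. It is an illustration under assumptions, not the authors' procedure: the response does not describe the normalization or fitting that was used, the sketch omits any correction for photobleaching during acquisition, and the function names and both intensity traces are hypothetical.

```python
def normalize_frap(post_bleach, pre_bleach):
    """Rescale so the pre-bleach level is 1 and the first post-bleach frame is 0."""
    floor = post_bleach[0]
    span = pre_bleach - floor
    if span <= 0:
        raise ValueError("bleaching did not reduce the signal")
    return [(value - floor) / span for value in post_bleach]


def mobile_fraction(normalized, n_plateau=3):
    """Plateau of the normalized curve, taken as the mean of the last frames."""
    tail = normalized[-n_plateau:]
    return sum(tail) / len(tail)


def half_time(times, normalized, n_plateau=3):
    """Time to reach half of the plateau, by linear interpolation between frames."""
    target = 0.5 * mobile_fraction(normalized, n_plateau)
    for i in range(1, len(times)):
        y0, y1 = normalized[i - 1], normalized[i]
        if y0 < target <= y1:
            return times[i - 1] + (target - y0) * (times[i] - times[i - 1]) / (y1 - y0)
    return None  # the curve never reached half of its plateau


# Hypothetical traces (arbitrary intensity units, arbitrary time units).
times = [0, 1, 2, 3, 4, 5, 6]
pn_curve = normalize_frap([20, 60, 80, 90, 90, 90, 90], pre_bleach=100)
mn_curve = normalize_frap([20, 24, 28, 32, 36, 40, 40], pre_bleach=100)

pn_mobile, mn_mobile = mobile_fraction(pn_curve), mobile_fraction(mn_curve)
pn_half, mn_half = half_time(times, pn_curve), half_time(times, mn_curve)
print(pn_mobile, pn_half, mn_mobile, mn_half)
```

A lower plateau and a longer half-time in the MN trace would both read as reduced mobility. As noted in the exchange above, whole-compartment bleaching cannot by itself separate chromatin binding from impaired transport, so these two summary numbers describe the recovery without identifying its cause.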

      References

      Crasta K, Ganem NJ, Dagher R, Lantermann AB, Ivanova EV, Pan Y, Nezi L, Protopopov A, Chowdhury D, Pellman D (2012) DNA breaks and chromosome pulverization from errors in mitosis. Nature 482: 53-58

      Hachet V, Kocher T, Wilm M, Mattaj IW (2004) Importin α associates with membranes and participates in nuclear envelope assembly in vitro. EMBO J 23: 1526-1535

      Martinez-Olivera R, Datsi A, Stallkamp M, Köller M, Kohtz I, Pintea B, Gousias K (2018) Silencing of the nucleocytoplasmic shuttling protein karyopherin a2 promotes cell-cycle arrest and apoptosis in glioblastoma multiforme. Oncotarget 9: 33471-33481

      Vietri M, Schultz SW, Bellanger A, Jones CM, Petersen LI, Raiborg C, Skarpen E, Pedurupillay CRJ, Kjos I, Kip E, Timmer R, Jain A, Collas P, Knorr RL, Grellscheid SN, Kusumaatmaja H, Brech A, Micci F, Stenmark H, Campsteijn C (2020) Unrestrained ESCRT-III drives micronuclear catastrophe and chromosome fragmentation. Nat Cell Biol 22: 856-867

      Wang CI, Chien KY, Wang CL, Liu HP, Cheng CC, Chang YS, Yu JS, Yu CJ (2012) Quantitative proteomics reveals regulation of karyopherin subunit alpha-2 (KPNA2) and its potential novel cargo proteins in nonsmall cell lung cancer. Mol Cell Proteomics 11: 1105-1122

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This study reports that importin alpha isoforms enrich strongly in a subset of micronuclei in cancer cells and uses mutagenesis and immunostaining to define how this localization relates to importin alpha's nuclear transport function. This enrichment occurs even though importin-alpha-positive micronuclei also contain Ran and the importin alpha export factor CSE1L, indicating that importin alpha enrichment is not simply a consequence of the absence of components of the nuclear transport machinery that control its localization. Mutagenesis of importin alpha indicates that Mn enrichment persists even when the importin beta binding and NLS binding capacities of importin alpha are impaired. Potential importin alpha interacting proteins are identified by proteomics, although the relationship of these potential binding partners to micronucleus localization is unclear.

      Significance

      1. In Figure S3, the authors show that mutagenesis of importin alpha's CSE1L binding domain decreases the ratiometric enrichment in Mn vs. Pn. However, is this effect occurring because the CSE1L binding mutant decreases Mn enrichment, or increases Pn enrichment? It seems that the latter is possible based on the images shown. If the Pn specifically becomes brighter on average in cells expressing the C-mut, while Mn remain similar in fluorescence intensity, that might suggest that CSE1L has less of an effect on importin alpha export in Mn compared to Pn.
      2. It is unclear from the text or the methods whether RIME identification of importin-alpha binding partners is performed in reversine-treated cells, which would increase the proportion of importin alpha in Mn, or in untreated cells. In either case, it seems likely that the majority of interactors identified would be cargoes that rely on importin alpha for import into the Pn. The rationale for linking these potential interactions to the Mn is unclear. While some of these factors are indeed shown enriched in Mn in Figure 5, the significance of this is also unclear. These points should be clarified.
      3. In Figure 6, the authors perform FRAP of importin alpha in Mn and show that it recovers much more slowly in Mn than in Pn. However, it appears from the images shown that the entire Mn was photobleached in each FRAP experiment. It thus is unclear whether the slow FRAP recovery is limited by slow diffusion of importin alpha within Mn/on Mn chromatin or impaired trafficking of importin alpha into and out of Mn. These distinct outcomes have distinct implications: either importin alpha is immobilized on Mn (eu)chromatin, or alternatively importin alpha is poorly transported into / out of Mn. This ambiguity could be resolved by bleaching a portion of a Mn and testing whether importin alpha diffuses within a single Mn.
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors have shown that Importin α1, a nuclear transport factor, is enriched in subsets of micronuclei (MN) of cancer cells (MCF7 and HeLa) and, using FRAP, has altered dynamics in MN. Moreover, the authors have shown that these levels of Importin α1 in the MN are likely not due to its traditional role in signal-dependent protein transport, as suggested by immunofluorescence of other factors important for this function. Additionally, cargo dynamics carrying NLS or NES signals were disrupted in Importin α1-positive micronuclei. Importin α1-positive micronuclei also appear to have a disrupted nuclear envelope, potentially explaining some of these cargo disruptions. Using RIME followed by immunofluorescence, the authors also demonstrated that Importin α colocalizes with proteins important for DNA replication and p53 signaling. Lastly, the authors show that Importin α and RAD51 have mutual exclusivity in the micronuclei.

      Major comments:

      1. A key issue is that very few statistical tests are used in this study, although they are crucial to the interpretation of the data. We strongly urge the authors to re-analyze the data using appropriate statistical analyses. Along those lines, many figures show only one or two images without stating how many biological or technical replicates they represent and without quantification of the analyses. In general, the authors' statements would be strengthened by showing more examples and/or stating "N" in the figure legends or supplement.
      2. Using RIME and immunofluorescence, the authors identify factors that co-localize with Importin α1 in subsets of micronuclei (Figure 5), which is interesting, but there is no functional data associated with this result. Are the authors stating that these differences account for altered DNA damage or replication? It is unclear what the conclusion is beyond "some MN are different than others." Could the authors knockdown/knockout these factors to determine if they recruit Importin α1 into MN or the reciprocal? For many of these factors, they appear to be broadly present throughout the entire primary nucleus as well, indicating there is nothing unique about their MN localization.
      3. In line 274, the authors state that MN highly enriched for Importin α1 inhibit RAD51 accessibility, but this is an overstatement of the data. Instead, the authors show that RAD51 and Importin α1 do not colocalize in micronuclei, albeit without quantification, which weakens their argument. Also, the consequence of this "mutual exclusivity" is unclear. Can the authors inhibit or knock down Importin α1 and show that RAD51 goes to all micronuclei? And how is this different from the data shown for factors in Figure 5? Some of those show colocalization with Importin α1-positive micronuclei and others do not. Could you perform live imaging of labeled Importin α1 and RAD51 and show that, as Importin α1 accumulates in MN, RAD51 or other DNA repair factors are exported? An alternative experiment would be to show that the C-mutant, which is defective in nuclear export, now colocalizes with RAD51 in MN. Please reconcile this or show experiments to prove the statement above.
      4. In the Discussion, line 343-344 states that "importin α1 is uniquely distributed and alters the nuclear/chromatin status when enriched in MN," however this is not currently supported by the present data. The data presented shows correlation (albeit weak) between euchromatic modifications and Importin α1, and it does not definitively show that importin α1 is sufficient to alter the nuclear-chromatin status when enriched in the MN. More substantial experiments would be required to show whether Importin α1 plays an active role in these modifications.

      Minor Comments

      1. Summary statement (page 3 Line 40): The use of "their" is confusing. Whose microenvironment are you referring to?
      2. In Abstract and introduction (page 4, Line 44 and page 5, line 59) it states that MN are membrane enclosed structures, but this is not always the case (see https://doi.org/10.1038/nature23449 as one example).
      3. Given the fact that the RIME result identified proteins involved in DNA replication to be enriched with Importin α1, are these MN enriched in factors described in Fig. 5 simply localizing to MN that are in S phase, as described previously (doi: 10.1038/nature10802)?
      4. The FRAP data is not very compelling. While it is clear there are differences between the PN and MN dynamics, what is driving these differences? Are these differences meaningful to the biology of the MN or PN? It is unclear what this data is contributing to the conclusions of the paper. Also, if the mobility of the MN is plotted on the same graph as the PN, the differences in MN mobility might not look as compelling.
      5. In Results (line 117), you state that "the cytoplasm of those cell lines emitted quite strong signals" for Importin α1, but that phrasing is a little confusing. Yes, Importin α1 is present in the cytoplasm in most cells, but it appears you are referring to the enrichment in MN. I would recommend re-phrasing this statement to make your intent clearer.
      6. In Results (line 135, Figure S2E,F), the ratio of high, low or no Importin α1 intensity is confusing. Is this percentage relative to the total number of MN? It is unclear what is meant by "whole number" of MN. Is Importin α1 intensity quantified or is it subjective?
      7. Figure 2C is confusing. Are you counting MN with co-localization of Importin α1 and these factors? Please clarify.
      8. Figure S3D quantification is very confusing and unclear. Also, how is this normalized? Are you controlling for total signal in each cell? And can the results of this experiment give you any mechanistic insight as to what is regulating MN localization beyond the interpretation of "MN localization is distinct from PN localization"? The "C-mutant" appears quite a bit different than the others. What might that indicate about the role of CAS/CSE1L in MN enrichment?
      9. For Figures 3A,B and S4, are these images single z-slices or projections? It would be helpful to clarify this so that readers can judge whether the signal is truly partial or diffuse or whether the membrane is in another z-plane. Also, how is the localization of Importin α1 different from or similar to that of other factors that localize to MN with a compromised nuclear envelope, such as cGAS? If it is based on epigenetic marks, it should be different from cGAS, which primarily binds non-chromatinized DNA.
      10. In Results, it is unclear how Fig. 7B was calculated. Are the authors qualitatively assessing if RAD51 is there or looking for MN enrichment relative to PN? Additionally, in Fig. 7C, RAD51 localization is diffuse. It should be enriched in foci. I would recommend the authors repeat this experiment using pre-extraction then quantify RAD51 foci number and/or intensity.
      11. In line 264, "notably" is misspelled.
      12. In line 303, "scenarios" should be changed to the singular form.
      13. In Figure legend, line 571-582, H3K27me3 is shown in Figure 4D, but the written legend does not mention this mark.

      Significance

      Overall, this paper shows compelling evidence for micronuclear localization of regulators of nuclear export, notably Importin α1. Of note, this occurs in subsets of MN that lack an intact nuclear envelope. While it has been appreciated that compromised micronuclear envelopes lead to genomic instability, this is one of the first studies to demonstrate that alteration of the nuclear envelope may disrupt import or export of nuclear proteins into micronuclei.

      A limitation of the study is that much of the work is based on immunofluorescence and lacks mechanism. While there is much correlative data showing that Importin α1 localizes to micronuclei with compromised envelopes, it is unclear whether Importin α1 drives micronuclear collapse or is downstream of this process. Additionally, Importin α1 micronuclear localization anti-correlates with RAD51 but does colocalize with other DNA replication factors, yet it is unclear whether their localization is dependent on Importin α1 or its role in nuclear export. Currently, the audience for this manuscript would be limited to those interested in micronuclei. If these concerns about an active role for Importin α1 in micronuclear export are resolved, it would greatly increase the impact of this manuscript for those interested more broadly in genomic instability, DNA repair, and cancer.

      Reviewer's areas of expertise: Genomic instability, cancer epigenetics, and mitosis

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Miyamoto et al. report that importin α1 is highly enriched in a subfraction of micronuclei (about 40%), which exhibit defective nuclear envelopes and compromised accessibility of factors essential for the damage response associated with homologous recombination DNA repair. The authors suggest that the unequal localization and abnormal distribution of importin α1 within these micronuclei contribute to the genomic instability observed in cancer.

      Major comments:

      Are the key conclusions convincing?

      The conclusions drawn by the authors would benefit from additional supportive experiments and a more detailed explanation.

      1. It is crucial to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells compared to transformed cell lines (MCF7, HeLa, and MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.
      2. While the authors provide some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Furthermore, according to the figure legends, the data presented in both figures stem from a single experiment. Current literature suggests that compromised nuclear envelope integrity is one of the major contributors to genomic instability, mediated through mechanisms such as chromothripsis and cGAS-STING-mediated inflammation arising from MN. Therefore, a more comprehensive quantification of nuclear envelope integrity, ideally comparing non-transformed MCF10A cells with transformed cell lines (MCF7, HeLa, and MDA-MB-231), is necessary to substantiate the connection between aberrant importin α1 behavior in MN and chromothripsis processes, as well as regulation of the cGAS-STING pathway linked to genomic instability in cancer cells.
      3. The schematic illustration presented in Figure 8 does not adequately summarize all findings from this study, nor does it clarify how the localization of importin α1 within MN might hypothetically influence genome stability. It is reasonable to propose that "importin α can serve as a molecular marker for characterizing the dynamics of MN" (Line 344). The authors also assert (Line 325) that their findings, along with others, have "potential implications for the induction of chromothripsis processes and regulation of the cGAS-STING pathway in cancer cells." However, they do not provide a clear or even hypothetical explanation of how their findings contribute to these molecular events. To address this gap, it would be essential for them to contextualize their results within existing literature that explores and links structural integrity deficits or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation (e.g., PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889).
      4. Fig. 4D does not support the idea that importin α1 is euchromatin enriched: H3K9me3, H3K4me3 and H3K37me3 all appear deep blue.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Indeed, the data presented by the authors do not adequately support a direct link between the presence of importin α1 in MN and genomic instability in human cancer cells. While the experimental correlations provided may not substantiate this connection definitively, they do lay a foundation for a grounded hypothesis and suggest the need for further research to explore this topic in greater depth. Additionally, it is worth noting that the evidence contributes to the growing list of nuclear proteins exhibiting abnormal behavior in micronuclei (MN). This highlights the significance of studying such proteins to understand their roles in genomic stability and cancer progression.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Additional experiments are necessary to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells and transformed cell lines (MCF7, HeLa, MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells. The authors claim that importin α1 preferentially localizes to euchromatic areas rather than heterochromatic regions within MN. While this assertion is supported by the immunofluorescence (IF) images presented in Figures 4A/B and S5A/B, it remains less clear for Figure S5C/B. To strengthen this claim, it would be beneficial to provide averages of IF distributions from multiple cells across independent experiments.

      Furthermore, ChIP-seq data are presented to support the idea that importin α1 preferentially distributes over euchromatin areas in MN. However, as described, the epigenetic chromatin status indicated by these ChIP-seq experiments reflects that of the principal nucleus (PN), not specifically the status within MN in MCF7 cells. Given that MN represent only a small fraction of the cell population under normal culture conditions (likely less than 5% for HeLa cells, as shown in Figure S2D), the relevance of these data is limited. Additionally, according to data presented in Figure 1B, importin α1 does not localize or distribute within the PN as it does in MN in MCF7 cells. Therefore, further experiments should be conducted to substantiate that importin α1 preferentially targets euchromatin areas within MN and to compare this distribution with that observed in the principal nucleus. Such studies could reveal potential abnormalities regarding the correlation between epigenetic chromatin status and importin α distribution in MN. To support the hypothesis that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should be supplemented with thorough quantification and statistical analysis based on at least three independent experiments. These additional data would increase confidence in the findings regarding inhibition of RAD51 accessibility by importin α1.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The additional experiments proposed are controls and direct comparisons using the same techniques and experimental designs used by the authors, so it is reasonable that the authors can carry them out within a realistic timeframe.

      Are the data and the methods presented in such a way that they can be reproduced?

      Given the importance of reproducibility and the need to evaluate results based on imaging and quantitation, I strongly recommend that the authors include a detailed description of the optical microscopy procedures utilized in their study. This should encompass imaging conditions, acquisition settings, and the specific equipment used. Providing this information will enhance transparency and facilitate reproducibility. For reference, some valuable guidance on essential parameters for reproducibility can be found in Heddleston et al. (2021) (doi:10.1242/jcs.254144). Incorporating these details will not only strengthen the manuscript but also support other researchers in reproducing the findings accurately.

      Are the experiments adequately replicated and statistical analysis adequate?

      Many of the plots and values in the manuscript lack appropriate statistical analysis, including p-values, which are not detailed in the figures or their legends. Furthermore, the Statistical Analysis section does not provide adequate information regarding the specific statistical tests employed or the criteria used to determine which analyses were applied in each case. To enhance the rigor and clarity of the study, it is essential that these issues be addressed prior to publication. A comprehensive presentation of statistical analysis will improve the reliability of the findings and allow readers to better understand the significance of the results. I recommend that the authors revise this section to include detailed explanations of all statistical methods used, along with corresponding p-values for all relevant comparisons.

      Minor comments:

      Specific experimental issues that are easily addressable.

      The authors claim that importin α1 exhibits remarkably low mobility in the micronuclei (MN) compared to its mobility in the principal nucleus (PN), as illustrated in Figure 1. However, based on the experimental design, this conclusion may not be appropriate. In the current setup, the FRAP experiment conducted in the PN measures the mobility of importin α1 molecules within the cell nucleus, where the influence of nuclear transport is likely negligible. Conversely, in the MN experiments shown, all molecules of importin α1 are bleached within a given MN. Consequently, what is being measured here primarily reflects the effects of nuclear transport rather than intrinsic molecular mobility. To accurately compare kinetics of nuclear transport, it would be essential to completely bleach the entire PN. If measuring molecular mobility between MN and PN is desired, only a small fraction of either MN or PN area/volume should be bleached during FRAP analysis. Additionally, it would be beneficial to include measurements of mobility for other canonical nuclear transport factors (e.g., RAN, CAS, RCC1) for comparative purposes. This broader context would allow for a more comprehensive understanding of importin α1 behavior relative to other factors involved in nuclear transport. Finally, utilizing cells that exhibit importin α1 signals in both PN and MN could further strengthen comparisons and provide more robust conclusions regarding its mobility dynamics.
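      To make the quantities at stake concrete, the following minimal sketch shows one common way to summarize a FRAP recovery curve: the mobile fraction and the half-time of recovery. It is an illustration only, not the analysis used in the manuscript. The function name `frap_summary`, its parameters, and any input values are hypothetical, and the sketch assumes background-subtracted intensities with no correction for photobleaching during acquisition. As noted above, whether the resulting half-time reflects transport across the envelope or mobility within the compartment depends on whether the whole compartment or only part of it was bleached.

```python
def frap_summary(times, intensities, prebleach, plateau_points=3):
    """Summarize a FRAP recovery curve (hypothetical helper, illustration only).

    times        -- seconds after the bleach, ascending; first entry is the
                    first post-bleach frame
    intensities  -- background-subtracted intensity at each time point
    prebleach    -- mean intensity of the same region before bleaching
    Returns (mobile_fraction, half_time); half_time is None if the curve
    never crosses the half-recovery level.
    """
    if len(times) != len(intensities) or len(times) <= plateau_points:
        raise ValueError("need matching series longer than the plateau window")
    f0 = intensities[0]
    if prebleach <= f0:
        raise ValueError("pre-bleach intensity must exceed the post-bleach value")

    # Plateau estimated as the mean of the last few frames.
    f_inf = sum(intensities[-plateau_points:]) / plateau_points
    mobile_fraction = (f_inf - f0) / (prebleach - f0)

    # Half-time: first crossing of the midpoint between f0 and the plateau,
    # located by linear interpolation between neighbouring frames.
    half_level = f0 + (f_inf - f0) / 2
    half_time = None
    for i in range(len(times) - 1):
        y1, y2 = intensities[i], intensities[i + 1]
        if y2 > y1 and y1 <= half_level <= y2:
            t1, t2 = times[i], times[i + 1]
            half_time = t1 + (half_level - y1) * (t2 - t1) / (y2 - y1)
            break
    return mobile_fraction, half_time
```

      Applying the same summary to a fully bleached MN, a fully bleached PN, and partially bleached regions of each would keep the transport and mobility comparisons separate.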

      Are prior studies referenced appropriately?

      Prior studies are referenced appropriately in general, but the authors missed some references (PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889) that I consider key to placing the present findings in the context of previous work linking the lack of structural integrity and/or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation.

      Are the text and figures clear and accurate?

      The figures presented in the manuscript are clear; however, where plots are included, they require appropriate statistical analysis. It is essential to display p-values on the plots or within their legends to provide readers with information regarding the significance of the results. Including this statistical information will enhance the interpretability of the data and strengthen the overall findings of the study. I recommend that the authors revise these sections accordingly before publication.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      1. In lines 134-135, it is stated that "up to 40% of the MN showed importin α1 accumulation under both standard culture conditions and the reversine treatment (Fig. S2F)." However, Figure S2F only displays percentages for reversine-treated cells, and there is no mention in the text or figures regarding the percentage of importin α1-positive MN determined by immunofluorescence (IF) under standard culture conditions. This discrepancy should be addressed.
      2. In line 170, the authors state that "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." It is unclear why this exclusion was made. The authors should clarify whether they are referring to all constructs or only to the wild-type (WT) construct when mentioning EGFP-importin α1 localization solely in PN. This clarification is important as it may affect the results highlighted in line 173.
      3. The statement in line 191 ("However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)") is somewhat misleading. While it hints at a technical issue, it does not provide additional relevant information for understanding its implications for the rationale of the research. Moreover, Figure S4 is referenced but appears to refer specifically to panels S4D and E, which are not mentioned in the text. I recommend clarifying this point or removing it altogether.
      4. Lines 197-199 contain a sentence that could be misleading and would benefit from clearer explanation. Although Figure 3D provides some clarity on this matter, no statistical analysis is included-only a bar plot is presented. A proper statistical analysis should be provided here to enhance understanding.
      5. In lines 218-221, it states that importin α1 associates with euchromatin regions characterized by H3K4me3 and H3K36me3; however, Figure 4D lacks the Spearman's correlation coefficient value for H3K36me3 within the matrix. This omission needs correction.
      6. For consistency in the experimental design aimed at identifying potential importin α1-interacting proteins, it would be more appropriate for Figures 5C/D to show IF data from MCF7 cells rather than HeLa cells.
      7. To substantiate claims that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should include thorough quantitation and statistical analysis based on at least three independent experiments.
      8. The meaning of lines 336-338 ("Therefore, the enrichment of importin α1 in MN, along with its interaction with chromatin, may regulate the accessibility of RAD51 to DNA/chromatin fibers in MN and protect its activity") is unclear. I suggest rephrasing this sentence for improved clarity and comprehension.
      9. Fig. 1D: Numbers on the y-axis are missing, and the x-axis labeling is too small.
      10. Fig. 1F: As the PN/MN values of the three experiments are seemingly identical (third column) the distribution of the three individual data of the PN (first column) should mirror the distribution of the three individual data of the MN (second column). The authors might want to check why this is not the case.
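      The arithmetic behind point 10 can be checked with a short sketch. If the PN/MN ratio is the same in every experiment, the PN values are a constant multiple of the MN values, so both columns must have the same relative spread (coefficient of variation). The helper `ratio_check` is hypothetical and is meant to be run on the per-experiment values underlying Fig. 1F, which are not reproduced here.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Relative spread: population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


def ratio_check(pn_values, mn_values):
    """Compare per-experiment PN/MN ratios with the spread of each column.

    Hypothetical helper: if the ratios are (nearly) identical across
    experiments, cv_pn and cv_mn should be (nearly) equal as well.
    """
    if len(pn_values) != len(mn_values) or not pn_values:
        raise ValueError("need one PN and one MN value per experiment")
    ratios = [p / m for p, m in zip(pn_values, mn_values)]
    return {
        "ratios": ratios,
        "cv_ratio": coefficient_of_variation(ratios),
        "cv_pn": coefficient_of_variation(pn_values),
        "cv_mn": coefficient_of_variation(mn_values),
    }
```

      A near-zero `cv_ratio` combined with clearly different `cv_pn` and `cv_mn` would indicate the kind of inconsistency the point describes.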

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.
      • Place the work in the context of the existing literature (provide references, where appropriate).

      Micronuclei (MN) primarily arise from defects in mitotic progression and chromatin segregation, often associated with chromatin bridges and/or lagging chromosomes. MN frequently exhibit DNA replication defects and possess a rupture-prone nuclear envelope, which has been linked to genomic instability. The nuclear envelope of MN is notably deficient in crucial factors such as lamin B and nuclear pore complexes (NPCs). This deficiency may be attributed to the influence of microtubules and the gradient of Aurora B activity at the mitotic midzone, which inhibits the recruitment of proper nuclear envelope components. Additionally, several other factors may contribute to this process: for instance, PLK1 controls the assembly of NPC components onto lagging chromosomes; chromosome size and gene density positively correlate with the membrane stability of MN; and abnormal accumulation of the ESCRT complex on MN exacerbates DNA damage within these structures, triggering pro-inflammatory pathways. The work presented by Dr. Miyamoto and colleagues reveals the abnormal behavior of importin α1 in MN during interphase. According to their findings, it is reasonable to consider importin α1 as a molecular marker for characterizing MN dynamics. Furthermore, it could serve as a potential clinical marker if the authors provide additional experiments demonstrating significantly different localization patterns of importin α1 in transformed cells (e.g., MCF7, HeLa, MDA-MB-231) compared to non-transformed cells (e.g., MCF10A). While the authors present some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Moreover, according to the figure legends, data for both figures originate from a single experiment. As such, convincing evidence linking the aberrant behavior of importin α1 in MN with chromothripsis processes or regulation of the cGAS-STING pathway, and its implications for genomic instability in cancer cells, remains lacking. Overall, it is not entirely clear what significance this advance holds for the field; while this work makes conceptual contributions, they do not appear sufficiently robust at this time. Further research is needed to clarify these connections and strengthen the conclusions regarding importin α1's role in MN dynamics and genomic instability.

      • State what audience might be interested in and influenced by the reported findings.

      Scientists and health care professionals who research mechanisms of genomic instability and cancer.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Mitosis, mitotic chromatin decondensation, nuclear reformation, hematopoietic cancers, light microscopy, image analysis.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study explores chromatin organization around trans-splicing acceptor sites (TASs) in the trypanosomatid parasites Trypanosoma cruzi, T. brucei and Leishmania major. By systematically re-analyzing MNase-seq and MNase-ChIP-seq datasets, the authors conclude that TASs are protected by an MNase-sensitive complex that is, at least in part, histone-based, and that single-copy and multi-copy genes display differential chromatin accessibility. Altogether, the data suggest a common chromatin landscape at TASs and imply that chromatin may modulate transcript maturation, adding a new regulatory layer to an unusual gene-expression system.

      I value integrative studies of this kind and appreciate the careful, consistent data analysis the authors implemented to extract novel insights. That said, several aspects require clarification or revision before the conclusions can be robustly supported. My main concerns are listed below, organized by topic/result section.

      TAS prediction

      * Why were TAS predictions derived only from insect-stage RNA-seq data? Restricting TAS calls to one life stage risks biasing predictions toward transcripts that are highly expressed in that stage and may reduce annotation accuracy for lowly expressed or stage-specific genes. Please justify this choice and, if possible, evaluate TAS robustness using additional transcriptomes or explicitly state the limitation.

      TAS predictions were derived only from insect-stage RNA-seq data because a previous study showed that there are no significant differences in 5'UTR processing among T. cruzi life stages (https://doi.org/10.3389/fgene.2020.00166). We are not testing an additional transcriptome here because the robustness of the software was already demonstrated in the original article in which UTRme was described (Radio S, 2018; doi:10.3389/fgene.2018.00671).

      Results - "There is a distinctive average nucleosome arrangement at the TASs in TriTryps":

      * You state that "In the case of L. major the samples are less digested." However, Supplementary Fig. S1 suggests that replicate 1 of L. major is less digested than the T. brucei samples, while replicate 2 of L. major looks similarly digested. Please clarify which replicates you reference and correct the statement if needed.

      The reviewer has a good point. We based our statement on the position of the maximum peak in the length distribution of the sequenced DNA molecules, which in general is a good indicator of the extent of digestion achieved in a sample (Cole H, NAR, 2011).

      As the reviewer correctly points out, we should also have considered the length of the DNA molecules in each percentile. However, in this case both the T. brucei and the L. major samples were gel purified before sequencing, and it is hard to know exactly which fragments were left behind in each case. Therefore, it is better not to draw strong conclusions in that regard.

      We have now commented on this in the main manuscript, and we have clarified in the figure legends which data set we used in each case.

      * It appears you plot one replicate in Fig. 1b and the other in Suppl. Fig. S2. Please indicate explicitly which replicate is in each plot. For T. brucei, the NDR upstream of the TAS is clearer in Suppl. Fig. S2 while the TAS protection is less prominent; based on your digestion argument, this should correspond to the more-digested replicate. Please confirm.

      The replicates used for the construction of each figure are explicitly indicated in Table S1. Although the table details the original publication, the project and the accession number for each data set, the reviewer is correct that it was still not completely clear which length distribution heatmap each sample was associated with. To avoid this confusion, we have now added the accession number for each data set to the figure legends and also clarified Table S1. Regarding the reviewer's comment on the correspondence between the observed TAS protection and the extent of sample digestion, the reviewer is correct that for a more digested sample we would expect a clearer NDR. In this case, the difference in the extent of digestion between these two samples is minor: the length of the main peak in the length distribution histogram of sequenced DNA molecules is the same. These two samples, GSM5363006 (represented in Fig. 1b) and GSM5363007 (represented in Suppl. Fig. S2), belong to the same original paper (Maree et al. 2017), and both were gel purified before sequencing. Therefore, any difference between them could result not only from a minor difference in the digestion level achieved in each experiment but also from a bias in the fragments included or excluded during gel purification. We would therefore not draw strong conclusions about TAS protection from this comparison. We have now included a brief comment on this in the discussion of the figure.

      * The protected region around the TAS appears centered on the TAS in T. brucei but upstream in L. major. This is an interesting difference. If it is technical (different digestion or TAS prediction offset), explain why; if likely biological, discuss possible mechanisms and implications.

      We appreciate the reviewer's suggestion. We cannot be sure whether the difference is due to technical or biological reasons, but there is evidence that the L. major genome has a different dinucleotide content, which might have an impact on nucleosome assembly. We have now added a comment about this observation in the final discussion of the manuscript.

      Results - "An MNase sensitive complex occupies the TASs in T. brucei":

      * The definition of "MNase activity" and the ordering of samples into Low/Intermediate/High digestion are unclear. Did you infer digestion levels from fragment distributions rather than from controlled experimental timepoints? In Suppl. Fig. S3a it is not obvious how "Low digestion" was defined; that sample's fragment distribution appears intermediate. Please provide objective metrics (e.g., median fragment length, fraction 120-180 bp) used to classify digestion levels.

      As the reviewer suggests, the ideal experiment would be to perform a time course of MNase reaction with all the samples in parallel, or to work with a fixed time point adding increasing amounts of MNase. However, even when making controlled experimental timepoints, you need to check the length distribution histogram of sequenced DNA molecules to be sure which level of digestion you have achieved.

      In this particular case, we used publicly available data sets for this analysis. We defined low, intermediate and high levels of digestion arbitrarily, not as absolute levels but as a comparative output among the tested samples. We based our definition on the comparison of __the main peak in the length distribution heatmaps, because this parameter is the best metric to estimate the level of digestion of a given sample. It represents the percentage of the total sequenced DNA that has the predominant length in the sample tested.__ Hence, we considered:

      low digestion: when the main peak is longer than the expected protection for a nucleosome (longer than 150 bp). We expect this sample to contain additional longer bands that correspond to less digested material.

      intermediate digestion: when the main peak has the length expected for nucleosome core protection (~146-150 bp).

      high digestion: when the main peak is shorter than that (shorter than 146 bp). This case is normally accompanied by a larger dispersion in fragment sizes.
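
      The three definitions above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the code used in the manuscript: the function name `classify_digestion`, its parameters and the tie-breaking rule are hypothetical, and the input is assumed to be a plain list of sequenced fragment lengths in bp. The thresholds (146-150 bp) are the ones given in the definitions.

```python
from collections import Counter


def classify_digestion(fragment_lengths, core_min=146, core_max=150):
    """Classify a sample by the main peak of its fragment-length distribution.

    Hypothetical helper: thresholds follow the definitions in the text
    (low: peak > 150 bp, intermediate: 146-150 bp, high: peak < 146 bp).
    """
    if not fragment_lengths:
        raise ValueError("fragment_lengths must not be empty")
    counts = Counter(fragment_lengths)
    # Main peak = most frequent length; ties go to the shorter length
    # (an arbitrary choice made only for this sketch).
    peak_length = min(counts, key=lambda length: (-counts[length], length))
    # Fraction of all sequenced fragments that have the predominant length.
    peak_fraction = counts[peak_length] / len(fragment_lengths)
    if peak_length > core_max:
        level = "low"
    elif peak_length >= core_min:
        level = "intermediate"
    else:
        level = "high"
    return level, peak_length, peak_fraction
```

      Returning the peak length and the fraction of fragments at the peak alongside the label keeps the classification transparent, so that samples can be compared with each other rather than against an absolute scale.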

      To do this analysis, we chose samples that show different degrees of MNase protection of the TAS when all the sequenced DNA molecules are plotted relative to this point, and we used this protection as a predictor of the extent of sample digestion (Figure 2). To corroborate our hypothesis that the degree of TAS protection was indeed related to the extent of MNase digestion of a given sample, we looked at the length distribution histogram of the sequenced DNA molecules in each case. This is the best measurement of the extent of digestion achieved, especially when sequencing the whole sample without any gel purification and representing all the reads in the analysis, as we did. The only caveat concerns the sample called “intermediate digestion 1”, which belongs to the original work of Maree et al. 2017, since only this data set was gel purified.

      The sample used in Figure 1 (from Maree et al. 2017) is also from the same lab and is an MNase-seq data set. Strictly speaking, there is no methodological difference between MNase-seq and the input of a native MNase-ChIP-seq, since the input does not undergo the IP.

      * Several fragment distributions show a sharp cutoff at ~100-125 bp. Was this due to gel purification or bioinformatic filtering? State this clearly in Methods. If gel purification occurred, that can explain why some datasets preserve the MNase-sensitive region.

      The sharp cutoff is due neither to gel purification nor to bioinformatic filtering; it is simply due to the length of the paired-end reads used in each case. In earlier works it was most common to sequence only 50 bp; as the technology improved, read lengths went up to 75, 100 or 125 bp. We have now clarified in Table S1 the length of the paired-end reads used in each case when possible.

      * Please reconcile cases where samples labeled as more-digested contain a larger proportion of >200 bp fragments than supposedly less-digested samples; this ordering affects the inference that digestion level determines the loss/preservation of TAS protection. Based on the distributions I see, "Intermediate digestion 1" appears most consistent with an expected MNase curve - please confirm and correct the manuscript accordingly.

      As explained above, it is a common observation in MNase digestion of chromatin that more extensive digestion can still result in a broad range of fragment sizes, including some longer fragments. This seemingly counter-intuitive result is primarily due to the non-uniform accessibility of chromatin and the sequence preference of the MNase enzyme, which preferentially cuts AT-rich sequences.

      The rationale is as follows: when you digest chromatin with MNase and the objective is to map nucleosomes genome-wide, the ideal situation would be to get the whole material contained in the mononucleosome band. Given that MNase digests protected DNA less efficiently but, if the reaction proceeds further, always ends up destroying part of it, the result is always far from perfect. The best situation we can achieve is to obtain samples where ~80% of the material is contained in the mononucleosome band. __And here comes the main point:__ even in the best scenario, you always get some additional longer bands, such as those for di- or tri-nucleosomes. If you keep digesting, you will get less than 80% in the nucleosome band, and the remaining DNA fragments that used to contain di- and tri-nucleosomes start getting digested as well, producing a larger dispersion in fragment sizes. How do we explain the persistence of long fragments? The longest fragments (di-, tri-nucleosomes) that persist in a highly digested sample are the ones that were originally most highly protected by proteins or higher-order structure, or that have an AT-poor sequence, making their linker DNA extremely resistant to initial cleavage. Once the majority of the genome is fragmented, these few resistant longer fragments become a more visible component of the remaining population, contributing to a broader size dispersion. Hence, you end up observing a larger dispersion of length distributions in the final material. The bottom line is that it is not good practice to work with under- or over-digested samples. Our main point is to emphasize that, especially when comparing samples, it is important to compare those with comparable levels of digestion. Otherwise, a different sampling of the genome will be represented in the remaining sequenced DNA.

      Results - "The MNase sensitive complexes protecting the TASs in T. brucei and T. cruzi are at least partly composed of histones": * The evidence that histones are part of the MNase-sensitive complex relies on H3 MNase-ChIP signal in subnucleosomal fragment bins. This seems to conflict with the observation (Fig. 1) that fragments protecting TASs are often nucleosome-sized. Please reconcile these points: are H3 signals confined to subnucleosomal fragments flanking the TAS while the TAS itself is depleted of H3? Provide plots that compare MNase-seq and H3 ChIP signals stratified by consistent fragment-size bins to clarify this.

      What we learned from other eukaryotic organisms that have been studied in depth, such as yeast, is that NDRs are normally generated at regulatory points in the genome. In this sense, yeast tRNA genes have a complex with a bootprint smaller than a nucleosome, formed by TFIIIC-TFIIIB (Nagarajavel, doi: 10.1093/nar/gkt611). On the other hand, many promoter regions have an MNase-sensitive complex with a nucleosome-size footprint, but it does not contain histones (Chereji, et al 2017, doi:10.1016/j.molcel.2016.12.009). The reviewer is right that from Figures 1 and S2 we can observe that the footprint of whatever occupies the TAS region, especially in T. brucei, is nucleosome-size. However, this only shows the size; it does not prove the nature of its components. Those are only MNase-seq data sets: since they do not include a precipitation with specific antibodies, we cannot confirm that the protecting complex is made up of histones. In parallel, a complementary study by Wedel 2017, from Siegel’s lab, shows that when using a properly digested sample and further immunoprecipitating with an anti-H3 antibody, the TAS is not protected by nucleosomes, at least not when analyzing nucleosome-size DNA molecules. Besides, Briggs et al. 2018 (doi: 10.1093/nar/gky928) showed that, at least at intergenic regions, H3 occupancy goes down while R-loop accumulation increases. We have now added a supplemental figure associated with Figure 3 (new Supplemental Figure 5) replotting R-loops and MNase-ChIP-seq for H3 relative to our predicted TASs, showing this anti-correlation and how it partly correlates with MNase protection as well. As a control, we show that the Rpb9 trend resembles that of H3, as Siegel’s lab showed in Wedel 2018.

      * Please indicate which datasets are used for each panel in Suppl. Fig. S4 (e.g., Wedel et al., Maree et al.), and avoid calling data from different labs "replicates" unless they are true replicates.

      In most of our analyses we used true replicate experiments. Such is the case for the MNase-seq data used in Figure 1, with the corresponding replicate experiments used in Figure S2, and for the T. cruzi MNase-ChIP-seq data used in Figures 3b and 4a, with the respective replicates used in Figures S4 and S5 (now S6 in the revised manuscript). The only case in which we used experiments coming from two different laboratories is the MNase-ChIP-seq for H3 from T. brucei. Unfortunately, there are only two public data sets, each coming from a different laboratory: the samples used in Fig. 3 come from Siegel’s lab, whereas the H3 IP represented in S4 and S5 (S6 in the updated version) comes from another lab (Patterton’s). To be more rigorous, we now call them data 1 and 2 when comparing this particular case.

      The reviewer is right that in this particular case one is native chromatin (Patterton’s) while the other one is crosslinked (Siegel’s). We have now clarified in the main text that, unfortunately, we do not have a true replicate, but the result remains the same under both conditions. This is compatible with my own experience, where crosslinking does not affect global nucleosome patterns (compare nucleosome organization from crosslinked chromatin MNase-seq inputs in Chereji, Mol Cell, 2017, doi: 10.1016/j.molcel.2016.12.009, with native MNase-seq from Ocampo, NAR, 2016, doi: 10.1093/nar/gkw068).

      * Several datasets show a sharp lower bound on fragment size in the subnucleosomal range (e.g., ~80-100 bp). Is this a filtering artifact or a gel-size selection? Clarify in Methods and, if this is an artifact, consider replotting after removing the cutoff.

      We have only filtered adapter dimers or overrepresented sequences when needed. In Figures 2 and S3 we represented all the sequenced reads. In other figures, when we sort fragment sizes in silico (such as nucleosome, dinucleosome or subnucleosome size), we make a note in the figure legends. What the reviewer points out is related to the length of the sequenced DNA fragments in each experiment. As we explained above, the older data sets were generated with 50 bp paired-end reads, whereas the newer ones use 75, 100 or 125 bp. This information is now clarified in Table S1.

      __Results - "The TASs of single and multi-copy genes are differentially protected by nucleosomes":__

      * Please include T. brucei RNA-seq data in Suppl. Fig. S5b as you did for T. cruzi.

      We have shown the chromatin organization for T. brucei in S5b to show that there is a similar trend. Unfortunately, we did not obtain a robust list of multi-copy genes for T. brucei as we did for T. cruzi; therefore, we do not want to overinterpret by showing the RNA-seq for these subsets of genes. The limitation is related to the fact that UTRme restricts the search and is extremely strict when calling sites in repetitive regions.

      * Discuss how low or absent expression of multigene families affects TAS annotation (which relies on RNA-seq) and whether annotation inaccuracies could bias the observed chromatin differences.

      The mapping and annotation of sites that belong to repetitive regions is highly complex. UTRme is specifically designed to avoid overcalling those sites. In other words, there is a chance that we are underestimating the number of predicted TASs at multi-copy genes. Regarding the impact on the chromatin analysis, we cannot rule it out, but the observation favors our conclusion: even though some TASs at multi-copy genes may remain elusive, we observe higher nucleosome density at those sites.

      * The statement that multi-copy genes show an "oscillation" between AT and GC dinucleotides is not clearly supported: the multi-copy average appears noisier and is based on fewer loci. Please tone down this claim or provide statistical support that the pattern is periodic rather than noisy.

      We have fixed this now in the preliminary revised version

      * How were multi-copy genes defined in T. brucei? Include the classification method in Methods.

      This classification was done the same way it was explained for T. cruzi

      Genomes and annotations: * If transcriptomic data for the Y strain was used for T. cruzi, please explain why a Y strain genome was not used (e.g., Wang et al. 2021 GCA_015033655.1), or justify the choice. For T. brucei, consider the more recent Lister 427 assembly (Tb427_2018) from TriTrypDB. Use strain-matched genomes and transcriptomes when possible, or discuss limitations.

      The most appropriate way to analyze high-throughput data is to align it to the genome of the same strain with which the experiments were conducted. This was clearly illustrated in a previous publication from our group, where we explained how data from the hybrid CL Brener strain should be analyzed. A common practice in the past was to use only the Esmeraldo-like genome for simplicity, but this resulted in output artifacts. Therefore, we aligned the data to the CL Brener genome and then focused the main analysis on the Esmeraldo haplotype (Beati, PLoS ONE, 2023). Ideally, we would have had transcriptomic data for the same strain (CL Brener or Esmeraldo). Since this was not the case at that moment, we used data from the Y strain, which belongs to the same DTU as Esmeraldo.

      In the case of T. brucei, when we started our analysis and the software code for UTRme was written, the previous version of the genome was the one available. When the 2018 version was released, we checked the chromatin parameters and observed that the main observations did not change. Therefore, we continued working with our previous setup.

      Reproducibility and broader integration: * Please share the full analysis pipeline (ideally on GitHub/Zenodo) so the results are reproducible from raw reads to plots.

      We are preparing a full pipeline in GitHub. We will make it available before manuscript full revision

      * As an optional but helpful expansion, consider including additional datasets (other life stages, BSF MNase-seq, ATAC-seq, DRIP-seq) where available to strengthen comparative claims.

      We are now including a new supplemental figure with DRIP-seq and Rpb9 ChIP-seq data (revised S5). Additionally, we added a new panel c to Figure 4, representing FAIRE-seq data for T. cruzi for single and multi-copy genes.

      We are working on ATAC-seq analysis and BSF MNase-seq

      Optional analyses that would strengthen the study: * Stratify single-copy genes by expression (high / medium / low) and examine average nucleosome occupancy at TASs for each group; a correlation between expression and NDR depth would strengthen the functional link to maturation.

      We have now included a panel in supplemental figure 5 (now revised S6), showing the concordance of chromatin organization relative to the TAS for genes stratified by RNA-seq levels.
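
      As an illustration of the stratification requested by the reviewer, a minimal Python sketch follows. It is not the analysis code behind the new panel: the function names, the use of tertiles, and the input structures (a dictionary of expression values per gene and a dictionary of TAS-centred occupancy vectors per gene) are assumptions made only for this example.

```python
def stratify_by_expression(expression):
    """Split gene IDs into low / medium / high groups of (nearly) equal size.

    expression: dict mapping gene ID -> expression value (e.g. from RNA-seq).
    Tertiles are an assumption of this sketch, not a choice stated in the text.
    """
    # Sort by expression; the gene ID breaks ties so the order is deterministic.
    ranked = sorted(expression, key=lambda gene: (expression[gene], gene))
    n = len(ranked)
    cut1, cut2 = n // 3, 2 * n // 3
    return {
        "low": ranked[:cut1],
        "medium": ranked[cut1:cut2],
        "high": ranked[cut2:],
    }


def mean_profile(groups, occupancy):
    """Average the TAS-centred occupancy vectors of the genes in each group.

    occupancy: dict mapping gene ID -> list of occupancy values, all of the
    same length and aligned on the TAS (hypothetical input format).
    """
    profiles = {}
    for name, genes in groups.items():
        vectors = [occupancy[g] for g in genes if g in occupancy]
        if vectors:
            # zip(*vectors) walks the aligned positions column by column.
            profiles[name] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return profiles
```

      Comparing the depth of the trough at the TAS position across the three averaged profiles would then show whether NDR depth tracks expression level.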

      __Minor / editorial comments:__ * In the Introduction, the sentence "transcription is initiated from dispersed promoters and in general they coincide with divergent strand switch regions" should be qualified: such initiation sites also include single transcription start regions.

      We have clarified this in the preliminary revised version

      * Define the dotted line in length distribution plots (if it is not the median, please clarify) and consider placing it at 147 bp across plots to ease comparison.

      The dotted line is just to indicate where the maximum peak is located. It is now clarified in figure legends.

      * In Suppl. Fig. 4b "Replicate2" the x-axis ticks are misaligned with labels - please fix.

      We have now fixed the figure. Thanks for noticing this mistake.

      * Typo in the Introduction: "remodellingremodeling" → "remodeling"

      Thanks for noticing this mistake, it is fixed in the current version of the manuscript

      **Referee cross-commenting** Comment 1: I think Reviewer #2 and Reviewer #3 missed that the authors of this manuscript do cite and consider the results from Wedel et al. 2017. They even re-analysed their data (e.g. Figure 3a). I second Reviewer #2's comment indicating that the inclusion of a schematic figure to help readers visualize and better understand the findings would be an important addition.

      Comment 2: I agree with Reviewer #3 that the use of different MNase digestion procedures in the different datasets has to be considered. On the other hand, I don't think there is a problem with Figure 1 showing an MNase-protected TAS for T. brucei, as it is based on MNase-seq data and reproduces the reported results (Maree et al. 2017). What the Siegel lab did in Wedel et al. 2017 was MNase-ChIP-seq of H3 showing nucleosome depletion at the TAS, but the two results are not necessarily contradictory: there could still be something else (which does not contain H3) sitting on the TAS and protecting it from MNase digestion.

      Reviewer #1 (Significance (Required)):

      This study provides a systematic comparative analysis of chromatin landscapes at trans-splicing acceptor sites (TASs) in trypanosomatids, an area that has been relatively underexplored. By re-analyzing and harmonizing existing MNase-seq and MNase-ChIP-seq datasets, the authors highlight conserved and divergent features of nucleosome occupancy around TASs and propose that chromatin contributes to the fidelity of transcript maturation. The significance lies in three aspects: 1. Conceptual advance: It broadens our understanding of gene regulation in organisms where transcription initiation is unusual and largely constitutive, suggesting that chromatin can still modulate post-transcriptional processes such as trans-splicing. 2. Integrative perspective: Bringing together data from T. cruzi, T. brucei and L. major provides a comparative framework that may inspire further mechanistic studies across kinetoplastids. 3. Hypothesis generation: The findings open testable avenues about the role of chromatin in coordinating transcript maturation, the contribution of DNA sequence composition, and potential interactions with R-loops or RNA-binding proteins. Researchers in parasitology, chromatin biology, and RNA processing will find it a useful resource and a stimulus for targeted experimental follow-up.

      My expertise is in gene regulation in eukaryotic parasites, with a focus on bioinformatic analysis of high-throughput sequencing data

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)):__

      Siri et al. perform a comparative analysis using publicly available MNase-seq data from three trypanosomatids (T. brucei, T. cruzi, and Leishmania), showing that a similar chromatin profile is observed at TAS (trans-splicing acceptor site) regions. The original studies had already demonstrated that the nucleosome profile at TAS differs from the rest of the genome; however, this work fills an important gap in the literature by providing the most reliable cross-species comparison of nucleosome profiles among the tritryps. To achieve this, the authors applied the same computational analysis pipeline and carefully evaluated MNase digestion levels, which are known to influence nucleosome profiling outcomes.

      In my view, the main conclusion is that the profiles are indeed similar-even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms. The manuscript could be improved with some clarifications and adjustments:

      1. The authors state from the beginning that available MNase data indicate altered nucleosome occupancy around the TAS. However, they could also emphasize that the conclusions across the different trypanosomatids are inconsistent and even contradictory: NDR in T. cruzi versus protection-in different locations-in T. brucei and Leishmania.

      We start our manuscript by referring to the first MNase-seq data sets publicly available for each TriTryp, and we point out that one of the main observations in each of them is a change in nucleosome density or occupancy at intergenic regions. In T. cruzi, in a previous publication from our group, we established that this intergenic drop in nucleosome density occurs near the trans-splicing acceptor site. In this work, we extend our study to the other members of the TriTryps: T. brucei and L. major.

      In T. brucei, the papers from Patterton’s lab and Siegel’s lab came out almost simultaneously in 2017. Hence, they do not comment on each other’s work. The first one claims the presence of a well-positioned nucleosome at the TAS by using MNase-seq, while the second one shows an NDR at the TAS by using MNase-ChIP-seq. However, we do not think they are contradictory or inconsistent. We brought them together throughout the manuscript because we think these works can provide complementary information.

      On the one hand, we infer that the data from Patterton’s lab are slightly less digested than the sample from Siegel’s lab. Therefore, we argue that this moderate digestion must be the reason why they managed to detect an MNase-protecting complex sitting at the TAS (Figure 1). On the other hand, Siegel’s lab includes an additional step by performing MNase-ChIP-seq, showing that when analyzing nucleosome-size fragments, histones are not detected at the TAS. Here, we go further in this analysis in Figure 3, showing that only when looking at subnucleosome-size fragments are we able to detect histone H3. This is also true for T. cruzi.

      By integrating every analysis in this work and the previous ones, we propose that TASs are protected by an MNase-sensitive complex (probed in Figure 2). This complex is most likely only partly formed by histones, since we can detect histone H3 only when analyzing subnucleosome-size DNA molecules (Figure 3). To be absolutely sure that the complex is not entirely made up of histones, future studies should perform MNase-ChIP-seq with less digested samples. However, it was previously shown that R-loops are enriched at those intergenic NDRs (Briggs, 2018 doi: 10.1093/nar/gky928) and that R-loops have plenty of interacting proteins (Girasol, 2023 doi: 10.1093/nar/gkad836). Therefore, most likely, this MNase-sensitive complex has a hybrid nature, made up of H3 and some other regulatory molecules, possibly involved in trans-splicing. We have now added a new Figure S5 showing R-loop co-localization with the NDR.

      Regarding the comparison between different organisms, after explaining the sensitivity to MNase of the TAS protecting complex, we discuss that when comparing equally digested samples T. cruzi and T. brucei display a similar chromatin landscape with a mild NDR at the TAS (See T. cruzi represented in Figure 1 compared to T. brucei represented in Intermediate digestion 2 in Figure 2, intermediate digestion in the revised manuscript). Unfortunately, we cannot make a good comparison with L. major, since we do not count on a similar level of digestion.

      Another point that requires clarification concerns what the authors mean in the introduction and discussion when they write that trypanosomes have "...poorly organized chromatin with nucleosomes that are not strikingly positioned or phased." On the other hand, they also cite evidence of organization: "...well-positioned nucleosome at the spliced-out region.. in Leishmania (ref 34)"; "...a well-positioned nucleosome at the TASs for internal genes (ref37)"; "...a nucleosome depletion was observed upstream of every gene (ref 35)." Aren't these examples of organized chromatin with at least a few phased nucleosomes? In addition, in ref 37, figure 4 shows at least two (possibly three to four) nucleosomes that appear phased. In my opinion, the authors should first define more precisely what they mean by "poorly organized chromatin" and clarify that this interpretation does not contradict the findings highlighted in the cited literature.

      For a better understanding of nucleosome positioning and phasing I recommend the review: Clark 2010 doi:10.1080/073911010010524945, Figure 4. Briefly, in a cell population there are different alternative positions that a given nucleosome can adopt. However, some are more favorable. When talking about favorable positions, we refer to the coordinates in the genome that are most likely covered by a nucleosome and are predominant in the cell population. Additionally, nucleosomes can be phased or not. This refers not only to the position in the genome, but also to the distance relative to a given point. In yeast, or in highly transcribed genes of more complex eukaryotes, nucleosomes are regularly spaced and phased relative to the transcription start site (TSS) or to the +1 nucleosome (Ocampo, NAR, 2016, doi:10.1093/nar/gkw068). In trypanosomes, nucleosomes show some regular distribution upon browser inspection but, given that they are not properly phased with respect to any point, it is almost impossible to estimate spacing from paired-end data. This is also consistent with a chromatin that is transcribed in an almost constitutive manner.

      As the reviewer mentions, we do cite evidence of organization. We think the original observations are correct, but we do not fully agree with some of the original statements. In this manuscript our aim is to take the best of what we learned from the original works and to make a constructive contribution that adds to the original discussions. In this regard, in trypanosomes there are some conserved patterns in the chromatin landscape, but their nucleosomes are far from being well-positioned or phased. For a better understanding, compare the variations observed on the y axis when representing average nucleosome occupancy in yeast with those observed in trypanosomes: the troughs and peaks are much more prominent in yeast than in any TriTryp member.

      Following the reviewer’s suggestion we have now clarified this in the main text

      The paper would also benefit from the inclusion of a schematic figure to help readers visualize and better understand the findings. What is the biological impact of having nucleosomes, di-nucleosomes, or sub-nucleosomes at TAS? This is not obvious to readers outside the chromatin field. For example, the following statement is not intuitive: "We observed that, when analyzing nucleosome-size (120-180 bp) DNA molecules or longer fragments (180-300 bp), the TASs of either T. cruzi or T. brucei are mostly nucleosome-depleted. However, when representing fragments smaller than a nucleosome-size (50-120 bp) some histone protection is unmasked (Fig. 3 and Fig. S4). This observation suggests that the MNase sensitive complex sitting at the TASs is at least partly composed of histones." Please clarify.

      We appreciate the reviewer’s suggestion to make a schematic figure. We are working on this, and it will be added to the manuscript upon final revision.

      Regarding the biological impact of having mono-, di- or subnucleosome fragments, it is important to determine the fragment size of the protected DNA to infer the nature of the protecting complex. In the case of tRNA genes in yeast, footprints smaller than a nucleosome were found at Pol III promoters, and these turned out to be TFIIIB-TFIIIC (Nagarajavel, doi: 10.1093/nar/gkt611). Therefore, detecting something smaller than a nucleosome might suggest the binding of trans-acting factors other than histones, or histones involved in a mixed complex. Such mixed complexes have also been observed, as in the case of the centromeric nucleosome, which has a very peculiar composition (Ocampo and Clark, Cell Reports, 2015). On the other hand, if instead we detect bigger fragments, it could indicate the presence of bigger protecting molecules or that those regions are part of a higher-order chromatin organization still inaccessible to MNase linker digestion.

      Here we show on 2D plots that the complex or components protecting the TAS have nucleosome size, but we cannot be sure they are entirely made up of histones, since we are able to detect histone H3 only when looking at subnucleosome-size fragments. We have now added part of this explanation to the discussion.
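
      To make the fragment-size argument concrete, the following minimal Python sketch sorts fragments into the size ranges quoted in the reviewer's comment (50-120 bp, 120-180 bp and 180-300 bp) and counts fragment midpoints relative to a TAS. It is an illustration only, not the pipeline used for the figures: the names `SIZE_BINS`, `size_class` and `midpoint_profiles`, the half-open bin boundaries, the 0-based half-open coordinates and the single forward-strand TAS are all assumptions of this example.

```python
# Size ranges in bp, treated as half-open intervals [low, high) in this sketch.
SIZE_BINS = {
    "subnucleosome": (50, 120),
    "nucleosome": (120, 180),
    "long": (180, 300),
}


def size_class(length):
    """Return the name of the size bin for a fragment length, or None."""
    for name, (low, high) in SIZE_BINS.items():
        if low <= length < high:
            return name
    return None


def midpoint_profiles(fragments, tas_position, window=500):
    """Count fragment midpoints around one TAS, separately for each size bin.

    fragments: iterable of (start, end) pairs, 0-based and half-open, on the
    same chromosome as the TAS (hypothetical input format).
    Returns a dict of lists; index `window` corresponds to the TAS itself.
    """
    profiles = {name: [0] * (2 * window + 1) for name in SIZE_BINS}
    for start, end in fragments:
        name = size_class(end - start)
        if name is None:
            continue  # fragment falls outside all size ranges
        offset = (start + end) // 2 - tas_position
        if -window <= offset <= window:
            profiles[name][offset + window] += 1
    return profiles
```

      Applying the same bins to both MNase-seq and H3 MNase-ChIP-seq fragments would give directly comparable profiles, which is the kind of stratified comparison that separates protection by size from protection by histone content.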

      By integrating every analysis in this work and the previous ones, we propose that the TAS is protected by an MNase-sensitive complex (Figure 2). This complex is most likely only partly formed by histones, since we can detect histone H3 only when analyzing subnucleosome-size DNA molecules (Figure 3). As explained above, to be absolutely sure that the complex is not entirely made up of histones, future studies should perform MNase-ChIP-seq with less digested samples. However, it was previously shown that R-loops are enriched at those intergenic NDRs (Briggs 2018) and that R-loops have plenty of interacting proteins (Girasol, 2023). Therefore, most likely, this MNase-sensitive complex has a hybrid nature, made up of H3 and some other regulatory molecules. We have now added a new Figure S5 showing R-loop co-localization.

      Some references are missing or incorrect:

      we will make a thorough revision

      "In trypanosomes, there are no canonical promoter regions." - please check Cordon-Obras et al. (Navarro's group). Thank you for the appropiate suggestion.

      We have now added this reference

      Please, cite the study by Wedel et al. (Siegel's group), which also performed MNase-seq analysis in T. brucei.

      We understand that Reviewer #2 missed that we cited this reference and that we did use the raw data from the manuscript of Wedel et al. 2017 from Siegel’s group. We used the MNase-ChIP-seq data set of histone H3 in our analysis for Figures 3, S4b and S5b (S6c in the revised version), as also detailed in Table S1. To be even more explicit, we have now included the accession number of each data set in the figure legends.

      Figure-specific comments: Fig. S3: Why does the number of larger fragments increase with greater MNase digestion? Shouldn't the opposite be expected?

      This is a good observation. As we also explained to Reviewer #1:

      It's a common observation in MNase digestion of chromatin that more extensive digestion can still result in a broad range of fragment sizes, including some longer fragments. This seemingly counter-intuitive result is primarily due to the non-uniform accessibility of chromatin and the sequence preference of the MNase enzyme.

      The rationale is as follows: when you digest chromatin with MNase and the objective is to map nucleosomes genome-wide, the ideal situation would be to get the whole material contained in the mononucleosome band. Given that MNase digests protected DNA less efficiently but, if the reaction proceeds further, always ends up destroying part of it, the result is always far from perfect. The best situation we can achieve is to obtain samples where ~80% of the material is contained in the mononucleosome band. __And here comes the main point:__ even in the best scenario, you always have some additional longer bands, such as those for di- or tri-nucleosomes. If you keep digesting, you will get less than 80% in the nucleosome band, and the remaining DNA fragments that used to contain di- and tri-nucleosomes start getting digested as well, producing a larger dispersion in fragment sizes. How do we explain the persistence of long fragments? The longest fragments (di-, tri-nucleosomes) that persist in a highly digested sample are the ones that were originally most highly protected by proteins or higher-order structure, making their linker DNA extremely resistant to initial cleavage. Once the majority of the genome is fragmented, these few resistant longer fragments become a more visible component of the remaining population, contributing to a broader size dispersion. Hence, you end up having a larger dispersion of length distributions in the final material. The bottom line is that it is not good practice to work with under- or over-digested samples. Our main point is to emphasize that, especially when comparing samples, it is important to compare those with comparable levels of digestion. Otherwise, a different sampling of the genome will be represented in the remaining sequenced DNA.

      Fig. S5B: Why not use MNase conditions under which T. cruzi and T. brucei display comparable profiles at TAS? This would facilitate interpretation.

      The reviewer made a reasonable observation. We used MNase-ChIP-seq rather than MNase-seq alone to test occupancy at the TAS in these subsets of genes because we wanted to be more certain whether we were detecting histones or something else. By using IP for histone H3, we can see that this protein is present at multi-copy genes when looking at nucleosome-size fragments. Additionally, as shown in Figure S4b, the length distribution histograms are also similar for the compared IPs.

      Minor points:

      There are several typos throughout the manuscript.

      Thanks for the observation. We will check carefully.

      Methods: "Dinucelotide frecuency calculation."

      We will add the code to GitHub.
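      For readers unfamiliar with the calculation named in that Methods heading, a dinucleotide frequency calculation can be sketched as below. This is only an illustrative sketch and not the authors' code: the function name `dinucleotide_frequencies`, the choice to count overlapping dinucleotides, and the decision to skip any dinucleotide containing a non-ACGT symbol are all assumptions made here.

```python
from collections import Counter


def dinucleotide_frequencies(sequence):
    """Return the relative frequency of each overlapping dinucleotide.

    Hypothetical helper for illustration; not taken from the manuscript.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    # Count overlapping 2-mers, skipping any that contain ambiguous bases (e.g. N).
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if set(seq[i:i + 2]) <= valid
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dinuc: n / total for dinuc, n in counts.items()}


# Toy sequence, not real data.
freqs = dinucleotide_frequencies("ATATGCGC")
```

      Applied to windows at fixed offsets from each TAS and averaged across loci, a function like this would give a positional dinucleotide profile; how the authors actually define windows and averaging is not stated here.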

      Reviewer #2 (Significance (Required)):

      In my view, the main conclusion is that the profiles are indeed similar-even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms. Audience: basic science and specialized readers.

      Expertise: epigenetics and gene expression in trypanosomatids.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The authors analysed publicly accessible MNase-seq data in TriTryps parasites, focusing on the chromatin structure around trans-splicing acceptor sites (TASs), which are vital for processing gene transcripts. They describe a mild nucleosome depletion at the TAS of T. cruzi and L. major, whereas a histone-containing complex protects the TASs of T. brucei. In the subsequent analysis of T. brucei, they suggest that an MNase-sensitive complex is localised at the TASs. For single-copy versus multi-copy genes, the authors show different di-nucleotide patterns and chromatin structures. Accordingly, they propose this difference could be a novel mechanism to ensure the accuracy of trans-splicing in these parasites.

      Before providing an in-depth review of the manuscript, I note that some missing information would have helped in assessing the study more thoroughly; however, in the light of the available information, I provide the following comments for consideration.

      The numbering of the figures, including the figure legends, is missing in the PDF file. This is essential for assessing the provided information.

      We apologize for not including the figure numbers in the main text, although the figures are located in the right place when called in the text. The omission happened unintentionally when the figure legends were moved to the bottom of the main text. This is now fixed in the updated version of the manuscript.

      The publicly available MNase-seq data are manyfold, with multiple datasets available for T. cruzi, for example. It is unclear from the manuscript which dataset was used for which figure. This must be clarified.

      This was detailed in Table S1. We have now replaced the table by an improved version, and we have also included the accession number of each data set used in the figure legends.

      Why do the authors start in figure 1 with the description of an MNase-protected TAS for T. brucei, given that it has been clearly shown by the Siegel lab that there is a nucleosome depletion similar to other parasites?

      We did not want to ignore the paper from Patterton’s lab because it was the first to map nucleosomes genome-wide in T. brucei, and its main finding was the existence of a well-positioned nucleosome at intergenic regions, which we thought was a point worth discussing. Patterton’s work uses MNase-seq from gel-purified samples and provides replicated experiments sequenced at very good depth, whereas Siegel’s lab uses MNase-ChIP-seq of histone H3 but performs only one experiment, and its input was not sequenced. So each work has its own caveats and provides different information, and together they contribute to a more comprehensive study. We think that bringing both data sets into the discussion, as we have done in Figures 1 and 3, helps us and the community working in the field to enrich the discussion.

      If the authors re-analyse the data, they should compare their pipeline to those used in the other studies, highlighting differences and potential improvements.

      We are working on this point. We will provide a more detailed description in the final revision.

      Since many figures resemble those in already published studies, there seems little reason to repeat and compare without a detailed comparison of the pipelines and their differences.

      Following the reviewer's advice, we are now working on highlighting the main differences that justify analyzing the data the way we did; these will be added to the final revised Methods section.

      At first glance, some of our figures might look similar to those in the original manuscripts. However, a careful and detailed reading of our manuscript shows that we have added several analyses that unveil information not disclosed before.

      First, we perform a systematic comparison, analyzing every data set in the same way from beginning to end; the main difference from previous studies is the thorough and precise prediction of TASs for the three organisms. Second, we represent the average chromatin organization relative to those predicted TASs for TriTryps and discuss their global patterns. Third, by representing the average chromatin organization as heatmaps, we show for the very first time that those average nucleosome landscapes are not just an average: a similar organization is kept across most of the genome. This was not done in any of the previous manuscripts except our own (Beati, PLOS One 2023). Additionally, we introduce a discussion of how the extent of the MNase reaction can affect the output of these experiments, and we show 2D plots and length distribution heatmaps to discuss this point (a point completely ignored in the chromatin literature for trypanosomes). Furthermore, we made a far-reaching analysis by considering the contributions of each published work, even when addressed by different techniques. Finally, we discuss our findings in the context of a topic of current interest in the field, TriTryps genome compartmentalization.

      Several previous MNase-seq analysis studies addressing chromatin accessibility emphasized the importance of using varying degrees of chromatin digestion, from low to high digestion (30496478, 38959309, 27151365).

      The reviewer is correct, and this point is exactly what we intended to illustrate in Figure 2. We appreciate the suggested references, which we are now citing in the final discussion. Just to clarify, using varying degrees of chromatin digestion is useful for drawing conclusions about a given organism, but when comparing samples, strains, histone marks, etc., it is extremely important to do so with similarly digested samples.

      No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies. This key information cannot be inferred from the length distribution of the sequenced reads.

      The reviewer is correct that “No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies”, and this is another reason why our analysis is important to publish and to be discussed by the scientific community working on trypanosomes. We disagree with the reviewer's second statement, since the level of digestion of a sequenced sample is actually tested by representing the length distribution of the total DNA sequenced. It is true that before sequencing you can, and should, check the level of digestion of the purified samples in an agarose gel and/or in a bioanalyzer. It could also be tested after library preparation, but before sequencing, where the fragment sizes are expected to increase by the length of the library adapters. But the final test of success when working with MNase-digested samples is to analyze the length of the DNA molecules by plotting length distribution histograms of the sequenced DNA molecules. Remarkably, different samples might occasionally look very similar when run in a gel but render different length distribution histograms; this is because the nucleosome core could be intact while the linker DNA associated with it has been trimmed to different extents, or the core itself may even have been chewed inside (see Cole Hope 2011, section 5.2, doi: 10.1016/B978-0-12-391938-0.00006-9, for a detailed explanation).

      As the input material are selected, in part gel-purified mono-nucleosomal DNA bands. Furthermore, the datasets are not directly comparable, as some use native MNase, while others employ MNase after crosslinking; some involve short digestion times at 37 °C, while others involve longer digestion at lower temperatures. Combining these datasets to support the idea of an MNase-sensitive complex at the TAS of T. brucei therefore may not be appropriate, and additional experiments using consistent methodologies would strengthen the study's conclusions.

      In my opinion, describing an MNase-sensitive complex based solely on these data is not feasible. It requires specifically designed experiments using a consistent method and well-defined MNase digestion kinetics.

      As the reviewer suggests, the ideal experiment would be to perform a time course of the MNase reaction with all the samples in parallel, or to work with a fixed time point while adding increasing amounts of MNase. However, the information obtained from a detailed analysis of the length distribution histogram of the sequenced DNA molecules is the best test of the real outcome. In fact, the samples with different digestion levels were probably not generated on purpose.

      The only data sets that were gel purified are those from Mareé 2017 (Patterton’s lab), used in Figures 1, S1 and S2, and those from L. major shown in Fig. 1. Gel purification was common practice in those years; we have since learned that it is not necessary, because fragment sizes can be sorted later in silico when needed.
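      As an illustration of what sorting fragment sizes in silico can look like, the sketch below assigns fragment lengths to the three size classes quoted from the manuscript in this review thread (50-120 bp, 120-180 bp, 180-300 bp). The function names, the class labels, and the handling of the shared boundaries (120 and 180 bp go to the larger class) are assumptions made for this example, not the authors' implementation.

```python
def size_class(length):
    """Return the size class of a fragment length in bp, or None if out of range."""
    # Boundary handling is an assumption: 120 and 180 bp go to the larger class.
    if 50 <= length < 120:
        return "subnucleosomal"
    if 120 <= length < 180:
        return "nucleosomal"
    if 180 <= length <= 300:
        return "longer"
    return None


def sort_by_size(lengths):
    """Group fragment lengths by size class, dropping fragments outside 50-300 bp."""
    bins = {"subnucleosomal": [], "nucleosomal": [], "longer": []}
    for length in lengths:
        label = size_class(length)
        if label is not None:
            bins[label].append(length)
    return bins


# Toy fragment lengths, not real data.
binned = sort_by_size([40, 60, 147, 200, 350])
```

      In practice the lengths would come from paired-end alignments, and each class would be used to build a separate occupancy profile.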

      As we explained to reviewer #1, to avoid this conflict we decided to remove these data from Figures 2 and S3. In summary, the three remaining samples come from the same lab and belong to the same publication (Mareé 2022). These samples are the inputs of native MNase-ChIP-seq experiments, obtained in the same way, and are fully comparable with each other.

      Reviewer #3 (Significance (Required)):

      Due to the lack of controlled MNase digestion, use of heterogeneous datasets, and absence of benchmarking against previous studies, the conclusions regarding MNase-sensitive complexes and their functional significance remain speculative. With standardized MNase digestion and clearly annotated datasets, this study could provide a valuable contribution to understanding chromatin regulation in TriTryps parasites.

      As we explained in the previous point, our conclusions are valid because in no figure do we compare samples that received different treatments. The only exception could be Figure 3, when discussing MNase-ChIP-seq. We have now added a clear and explicit comment in that section and in the discussion noting that, despite subtle differences in experimental procedures, we arrive at the same results. This is the case for the T. cruzi IP, performed on crosslinked chromatin, compared with the T. brucei IP, performed on native chromatin.

      Over the years, the chromatin field has observed that nucleosomes are so tightly bound to DNA that crosslinking is not necessary. However, it is still common practice, especially when performing IPs. In our own hands, we did not observe any difference at the global level, either in T. cruzi or in my previous work with yeast.

      ...

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors analysed publicly accessible MNase-seq data in TriTryps parasites, focusing on the chromatin structure around trans-splicing acceptor sites (TASs), which are vital for processing gene transcripts. They describe a mild nucleosome depletion at the TAS of T. cruzi and L. major, whereas a histone-containing complex protects the TASs of T. brucei. In the subsequent analysis of T. brucei, they suggest that an MNase-sensitive complex is localised at the TASs. For single-copy versus multi-copy genes, the authors show different di-nucleotide patterns and chromatin structures. Accordingly, they propose this difference could be a novel mechanism to ensure the accuracy of trans-splicing in these parasites.

      Before providing an in-depth review of the manuscript, I note that some missing information would have helped in assessing the study more thoroughly; however, in the light of the available information, I provide the following comments for consideration.

      The numbering of the figures, including the figure legends, is missing in the PDF file. This is essential for assessing the provided information.

      The publicly available MNase-seq data are manyfold, with multiple datasets available for T. cruzi, for example. It is unclear from the manuscript which dataset was used for which figure. This must be clarified.

      Why do the authors start in figure 1 with the description of an MNase-protected TAS for T. brucei, given that it has been clearly shown by the Siegel lab that there is a nucleosome depletion similar to other parasites?

      If the authors re-analyse the data, they should compare their pipeline to those used in the other studies, highlighting differences and potential improvements.

      Since many figures resemble those in already published studies, there seems little reason to repeat and compare without a detailed comparison of the pipelines and their differences.

      Several previous MNase-seq analysis studies addressing chromatin accessibility emphasised the importance of using varying degrees of chromatin digestion, from low to high digestion (30496478, 38959309, 27151365).

      No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies. This key information cannot be inferred from the length distribution of the sequenced reads. As the input material are selected, in part gel-purified mono-nucleosomal DNA bands. Furthermore, the datasets are not directly comparable, as some use native MNase, while others employ MNase after crosslinking; some involve short digestion times at 37 °C, while others involve longer digestion at lower temperatures. Combining these datasets to support the idea of an MNase-sensitive complex at the TAS of T. brucei therefore may not be appropriate, and additional experiments using consistent methodologies would strengthen the study's conclusions.

      In my opinion, describing an MNase-sensitive complex based solely on these data is not feasible. It requires specifically designed experiments using a consistent method and well-defined MNase digestion kinetics.

      Significance

      Due to the lack of controlled MNase digestion, use of heterogeneous datasets, and absence of benchmarking against previous studies, the conclusions regarding MNase-sensitive complexes and their functional significance remain speculative. With standardized MNase digestion and clearly annotated datasets, this study could provide a valuable contribution to understanding chromatin regulation in TriTryps parasites.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Siri et al. perform a comparative analysis using publicly available MNase-seq data from three trypanosomatids (T. brucei, T. cruzi, and Leishmania), showing that a similar chromatin profile is observed at TAS (trans-splicing acceptor site) regions. The original studies had already demonstrated that the nucleosome profile at TAS differs from the rest of the genome; however, this work fills an important gap in the literature by providing the most reliable cross-species comparison of nucleosome profiles among the tritryps. To achieve this, the authors applied the same computational analysis pipeline and carefully evaluated MNase digestion levels, which are known to influence nucleosome profiling outcomes.

      In my view, the main conclusion is that the profiles are indeed similar-even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms. The manuscript could be improved with some clarifications and adjustments:

      1. The authors state from the beginning that available MNase data indicate altered nucleosome occupancy around the TAS. However, they could also emphasize that the conclusions across the different trypanosomatids are inconsistent and even contradictory: NDR in T. cruzi versus protection-in different locations-in T. brucei and Leishmania.
      2. Another point that requires clarification concerns what the authors mean in the introduction and discussion when they write that trypanosomes have "...poorly organized chromatin with nucleosomes that are not strikingly positioned or phased." On the other hand, they also cite evidence of organization: "...well-positioned nucleosome at the spliced-out region.. in Leishmania (ref 34)"; "...a well-positioned nucleosome at the TASs for internal genes (ref37)"; "...a nucleosome depletion was observed upstream of every gene (ref 35)." Aren't these examples of organized chromatin with at least a few phased nucleosomes? In addition, in ref 37, figure 4 shows at least two (possibly three to four) nucleosomes that appear phased. In my opinion, the authors should first define more precisely what they mean by "poorly organized chromatin" and clarify that this interpretation does not contradict the findings highlighted in the cited literature.
      3. The paper would also benefit from the inclusion of a schematic figure to help readers visualize and better understand the findings. What is the biological impact of having nucleosomes, di-nucleosomes, or sub-nucleosomes at TAS? This is not obvious to readers outside the chromatin field. For example, the following statement is not intuitive: "We observed that, when analyzing nucleosome-size (120-180 bp) DNA molecules or longer fragments (180-300 bp), the TASs of either T. cruzi or T. brucei are mostly nucleosome-depleted. However, when representing fragments smaller than a nucleosome-size (50-120 bp) some histone protection is unmasked (Fig. 3 and Fig. S4). This observation suggests that the MNase sensitive complex sitting at the TASs is at least partly composed of histones." Please clarify.

      Some references are missing or incorrect:

      "In trypanosomes, there are no canonical promoter regions." - please check Cordon-Obras et al. (Navarro's group).

      Please, cite the study by Wedel et al. (Siegel's group), which also performed MNase-seq analysis in T. brucei.

      Figure-specific comments:

      Fig. S3: Why does the number of larger fragments increase with greater MNase digestion? Shouldn't the opposite be expected?

      Fig. S5B: Why not use MNase conditions under which T. cruzi and T. brucei display comparable profiles at TAS? This would facilitate interpretation.

      Minor points:

      There are several typos throughout the manuscript.

      Methods: "Dinucelotide frecuency calculation."

      Significance

      In my view, the main conclusion is that the profiles are indeed similar-even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms.

      Audience: basic science and specialized readers.

      Expertise: epigenetics and gene expression in trypanosomatids.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study explores chromatin organization around trans-splicing acceptor sites (TASs) in the trypanosomatid parasites Trypanosoma cruzi, T. brucei and Leishmania major. By systematically re-analyzing MNase-seq and MNase-ChIP-seq datasets, the authors conclude that TASs are protected by an MNase-sensitive complex that is, at least in part, histone-based, and that single-copy and multi-copy genes display differential chromatin accessibility. Altogether, the data suggest a common chromatin landscape at TASs and imply that chromatin may modulate transcript maturation, adding a new regulatory layer to an unusual gene-expression system.

      I value integrative studies of this kind and appreciate the careful, consistent data analysis the authors implemented to extract novel insights. That said, several aspects require clarification or revision before the conclusions can be robustly supported. My main concerns are listed below, organized by topic/result section.

      TAS prediction:

      • Why were TAS predictions derived only from insect-stage RNA-seq data? Restricting TAS calls to one life stage risks biasing predictions toward transcripts that are highly expressed in that stage and may reduce annotation accuracy for lowly expressed or stage-specific genes. Please justify this choice and, if possible, evaluate TAS robustness using additional transcriptomes or explicitly state the limitation.

      Results

      • "There is a distinctive average nucleosome arrangement at the TASs in TriTryps":
      • You state that "In the case of L. major the samples are less digested." However, Supplementary Fig. S1 suggests that replicate 1 of L. major is less digested than the T. brucei samples, while replicate 2 of L. major looks similarly digested. Please clarify which replicates you reference and correct the statement if needed.
      • It appears you plot one replicate in Fig. 1b and the other in Suppl. Fig. S2. Please indicate explicitly which replicate is in each plot. For T. brucei, the NDR upstream of the TAS is clearer in Suppl. Fig. S2 while the TAS protection is less prominent; based on your digestion argument, this should correspond to the more-digested replicate. Please confirm.
      • The protected region around the TAS appears centered on the TAS in T. brucei but upstream in L. major. This is an interesting difference. If it is technical (different digestion or TAS prediction offset), explain why; if likely biological, discuss possible mechanisms and implications.

      Results

      • "An MNase sensitive complex occupies the TASs in T. brucei":
      • The definition of "MNase activity" and the ordering of samples into Low/Intermediate/High digestion are unclear. Did you infer digestion levels from fragment distributions rather than from controlled experimental timepoints? In Suppl. Fig. S3a it is not obvious how "Low digestion" was defined; that sample's fragment distribution appears intermediate. Please provide objective metrics (e.g., median fragment length, fraction 120-180 bp) used to classify digestion levels.
      • Several fragment distributions show a sharp cutoff at ~100-125 bp. Was this due to gel purification or bioinformatic filtering? State this clearly in Methods. If gel purification occurred, that can explain why some datasets preserve the MNase-sensitive region.
      • Please reconcile cases where samples labeled as more-digested contain a larger proportion of >200 bp fragments than supposedly less-digested samples; this ordering affects the inference that digestion level determines the loss/preservation of TAS protection. Based on the distributions I see, "Intermediate digestion 1" appears most consistent with an expected MNase curve - please confirm and correct the manuscript accordingly.

      Results

      • "The MNase sensitive complexes protecting the TASs in T. brucei and T. cruzi are at least partly composed of histones":
      • The evidence that histones are part of the MNase-sensitive complex relies on H3 MNase-ChIP signal in subnucleosomal fragment bins. This seems to conflict with the observation (Fig. 1) that fragments protecting TASs are often nucleosome-sized. Please reconcile these points: are H3 signals confined to subnucleosomal fragments flanking the TAS while the TAS itself is depleted of H3? Provide plots that compare MNase-seq and H3 ChIP signals stratified by consistent fragment-size bins to clarify this.
      • Please indicate which datasets are used for each panel in Suppl. Fig. S4 (e.g., Wedel et al., Maree et al.), and avoid calling data from different labs "replicates" unless they are true replicates.
      • Several datasets show a sharp lower bound on fragment size in the subnucleosomal range (e.g., ~80-100 bp). Is this a filtering artifact or a gel-size selection? Clarify in Methods and, if this is an artifact, consider replotting after removing the cutoff.

      Results

      • "The TASs of single and multi-copy genes are differentially protected by nucleosomes":
      • Please include T. brucei RNA-seq data in Suppl. Fig. S5b as you did for T. cruzi.
      • Discuss how low or absent expression of multigene families affects TAS annotation (which relies on RNA-seq) and whether annotation inaccuracies could bias the observed chromatin differences.
      • The statement that multi-copy genes show an "oscillation" between AT and GC dinucleotides is not clearly supported: the multi-copy average appears noisier and is based on fewer loci. Please tone down this claim or provide statistical support that the pattern is periodic rather than noisy.
      • How were multi-copy genes defined in T. brucei? Include the classification method in Methods.
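
      The objective digestion metrics requested above (median fragment length and the fraction of fragments in the 120-180 bp range) could be computed from a list of sequenced fragment lengths roughly as follows. This is a minimal sketch: the function name `digestion_metrics`, the inclusive window boundaries, and the toy input are assumptions, and no threshold for calling a sample "low" or "high" digestion is implied.

```python
from statistics import median


def digestion_metrics(lengths, low=120, high=180):
    """Summarise a fragment-length distribution with two simple metrics.

    `low` and `high` delimit the nucleosome-size window in bp (inclusive here,
    which is an assumption of this sketch).
    """
    if not lengths:
        raise ValueError("at least one fragment length is required")
    in_window = sum(1 for n in lengths if low <= n <= high)
    return {
        "median_length": median(lengths),
        "fraction_nucleosomal": in_window / len(lengths),
    }


# Toy fragment lengths, not real data.
metrics = digestion_metrics([100, 147, 150, 160, 300])
```

      Reporting these two numbers per sample would let readers check the Low/Intermediate/High ordering independently of the plotted histograms.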

      Genomes and annotations:

      • If transcriptomic data for the Y strain was used for T. cruzi, please explain why a Y strain genome was not used (e.g., Wang et al. 2021 GCA_015033655.1), or justify the choice. For T. brucei, consider the more recent Lister 427 assembly (Tb427_2018) from TriTrypDB. Use strain-matched genomes and transcriptomes when possible, or discuss limitations.

      Reproducibility and broader integration:

      • Please share the full analysis pipeline (ideally on GitHub/Zenodo) so the results are reproducible from raw reads to plots.
      • As an optional but helpful expansion, consider including additional datasets (other life stages, BSF MNase-seq, ATAC-seq, DRIP-seq) where available to strengthen comparative claims.

      Optional analyses that would strengthen the study:

      • Stratify single-copy genes by expression (high / medium / low) and examine average nucleosome occupancy at TASs for each group; a correlation between expression and NDR depth would strengthen the functional link to maturation.
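
      The high / medium / low stratification suggested above could be prototyped by ranking genes on an expression value and splitting the ranking into thirds. The sketch below is an assumption-laden illustration: the function name `stratify_by_expression`, the use of equal-sized tertiles, and the gene identifiers and values are all hypothetical.

```python
def stratify_by_expression(expression):
    """Split genes into low / medium / high groups by expression tertile.

    `expression` maps a gene identifier to an expression value (e.g. TPM).
    """
    # Rank gene identifiers from lowest to highest expression.
    ranked = sorted(expression, key=expression.get)
    n = len(ranked)
    cut1, cut2 = n // 3, 2 * n // 3
    return {
        "low": ranked[:cut1],
        "medium": ranked[cut1:cut2],
        "high": ranked[cut2:],
    }


# Hypothetical gene identifiers and expression values.
groups = stratify_by_expression(
    {"g1": 1.0, "g2": 5.0, "g3": 10.0, "g4": 20.0, "g5": 50.0, "g6": 100.0}
)
```

      Average nucleosome occupancy around the TAS would then be computed separately for each group and compared.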

      Minor / editorial comments:

      • In the Introduction, the sentence "transcription is initiated from dispersed promoters and in general they coincide with divergent strand switch regions" should be qualified: such initiation sites also include single transcription start regions.
      • Define the dotted line in length distribution plots (if it is not the median, please clarify) and consider placing it at 147 bp across plots to ease comparison.
      • In Suppl. Fig. 4b "Replicate2" the x-axis ticks are misaligned with labels - please fix.
      • Typo in the Introduction: "remodellingremodeling" → "remodeling."

      Referee cross-commenting

      Comment 1: I think Reviewer #2 and Reviewer #3 missed that the authors of this manuscript do cite and consider the results from Wedel et al. 2017. They even re-analysed their data (e.g. Figure 3a). I second Reviewer #2's comment that the inclusion of a schematic figure to help readers visualize and better understand the findings would be an important addition.

      Comment 2: I agree with Reviewer #3 that the use of different MNase digestion procedures in the different datasets has to be considered. On the other hand, I don't think there is a problem with figure 1 showing an MNase-protected TAS for T. brucei, as it is based on MNase-seq data and reproduces the reported results (Maree et al. 2017). What the Siegel lab did in Wedel et al. 2017 was MNase-ChIP-seq of H3 showing nucleosome depletion at TAS, but the two results are not necessarily contradictory: there could still be something else (which does not contain H3) sitting on the TAS and protecting it from MNase digestion.

      Significance

      This study provides a systematic comparative analysis of chromatin landscapes at trans-splicing acceptor sites (TASs) in trypanosomatids, an area that has been relatively underexplored. By re-analyzing and harmonizing existing MNase-seq and MNase-ChIP-seq datasets, the authors highlight conserved and divergent features of nucleosome occupancy around TASs and propose that chromatin contributes to the fidelity of transcript maturation.

      The significance lies in three aspects:

      1. Conceptual advance: It broadens our understanding of gene regulation in organisms where transcription initiation is unusual and largely constitutive, suggesting that chromatin can still modulate post-transcriptional processes such as trans-splicing.
      2. Integrative perspective: Bringing together data from T. cruzi, T. brucei and L. major provides a comparative framework that may inspire further mechanistic studies across kinetoplastids.
      3. Hypothesis generation: The findings open testable avenues about the role of chromatin in coordinating transcript maturation, the contribution of DNA sequence composition, and potential interactions with R-loops or RNA-binding proteins. Researchers in parasitology, chromatin biology, and RNA processing will find it a useful resource and a stimulus for targeted experimental follow-up.

      My expertise is in gene regulation in eukaryotic parasites, with a focus on bioinformatic analysis of high-throughput sequencing data

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their positive assessments overall and for many helpful suggestions for clarification to make the manuscript more accessible to a broader audience. We made minor text changes and added more labels to the figures to address these comments.

      Referee #1

      Summary: In this study, the authors show a genetic interaction of the lipid receptors Lpr-1, Lpr-3 and Scav-2 in C. elegans. They show that Lpr-1 loss-of-function specifically affects aECM localization of Lpr-3 and attribute the lethality of Lpr-1 mutants to this phenotype. The authors performed a mutagenesis screen and identified a third lipid receptor, Scav-2, as a modulating factor: loss of scav-2 partially rescues the Lpr-1 phenotype. The authors created a variety of tools for this study, notably CRISPR-Cas9-mediated knock-ins for endogenous tagging of the receptors.

      Major comments:

      1. While the authors provide a nice diagram showing the potential roles and interplay of lpr-1, lpr-3 and scav-2, it remains unclear what their respective cargo is. The nature of the interaction between the proteins remains unclear from the data.

      Response

      • We agree that identifying the relevant cargo(s) will be key to understanding the detailed mechanisms involved and that the lack of such information is a limitation of our study. However, the impact of our study is to show that these lipid transporters functionally interact to affect aECM organization, a role that could be relevant to many systems, including humans.

      As an optional (since time-consuming) experiment I would suggest trying more tissue-specific lipidomics.

      Response

      • This would be an interesting future experiment but is outside our current technical capabilities.

      The lipidomics data should be presented in the figures, even if there were no significant changes. Importantly, show the lipid abundance at least of total lipids, better of individual classes, normalized to the material input (e.g. number of embryos, protein).

      Response

      • The reviewer is right to point out that lipid variations could occur at different levels, and that we should exercise caution. However, the unsupervised lipidomics analysis would have detected not only individual lipid variations, but also variations in the total or subgroup lipid content. Indeed, the eggs were weighed prior to extraction and each sample was extracted with the same precise volume of solvent before analysis. Furthermore, the LC-MS/MS injection sequence included blanks and quality control (QC) samples. The blanks were the extraction solvent, which allowed us to control for features unrelated to the biological samples. The QC sample was a mixture of all the samples included in the injection sequence, reflecting the central values of the model. If a subclass of samples, such as the lpr-1 mutant, had been characterized by a decrease in one lipid, a subgroup of lipids, or all lipids, it would have clustered separately. Instead, our PCA showed that the variation between samples of the same genotype (wild type, lpr-1 mutant, or lpr-1; scav-2) was similar to the variation between samples from two different genotypes. This means that we did not detect modifications to lipid quantity specifically or in total. A figure illustrating the lipid contents would show no difference between groups.

      Figure 1g: I do not understand what the lpr3:gfp signal is: the puncta in the overview image? And where are they in the zoom image showing annuli and alae? Also, how were the annuli and alae structures labeled? Please provide more information.

      Response

      • All of the fluorescent signal shown in this figure panel corresponds to the indicated LPR fusion - no other labelling method was used. SfGFP::LPR-3 labels the matrix structures (alae and annuli) as well as some puncta – the ratio of matrix to puncta changes over developmental stages. We edited the figure legend to make this more clear.

      One point that is not sufficiently addressed is that the authors deduce from the inability of the scav-2 gfp knock-in to suppress lpr1 lethality that scav2 function is not impaired. This is quite indirect. Can the authors provide more convincing evidence that the scav-2 knock-in has normal function?

      Response

      • Suppression of lpr-1 (or other aECM mutant) lethality is the only known phenotype caused by loss of scav-2. Therefore, this is the only phenotype for which we can do a rescue experiment to test functionality of the knock-in. The data presented do indicate that the knock-in fusion retains significant function.

      In general, the data is clearly presented and the statistical analyses look sound.

      Response

      • Thank you

      Minor comments:

      Please provide page and line numbers!

      Response:

      • done

      Avoid contractions like "don't" in both text and figure legends

      Response:

      • changed one instance of “don’t” to “do not”

      Page 12: I do not understand the meaning of the sentence "This transgene also caused more modest lethality in a wild-type background"

      Response:

      • Wording changed to “This transgene caused very little lethality in a wild-type background (Fig. 6C), indicating it is not generally toxic.”

      Figure 7: what is meant with "Dodt"?

      Response:

      • Dodt gradient contrast imaging is a method for transmitted light imaging similar to DIC and is used on some confocal microscopes. It is now explained in the Methods section. We removed the Dodt label from Figure 7 since it seems to be confusing and it is not really important whether the brightfield image is DIC or Dodt.

        Reviewer #1 (Significance (Required)):

        The study is experimentally sound and uses numerous novel tools, such as endogenously tagged lipid receptors. It is an interesting study for researchers in basic research studying lipid receptors and ECM biology. It provides insights on the genetic interaction of lipid receptors. My expertise is in lipid biochemistry, inter-organ lipid trafficking and imaging. I am not very familiar with C. elegans genetics.

      Referee #2

      1. The manuscript is very well written; the documentation is fine, but some more details are needed for better following the subject for readers not familiar with nematode anatomy.

      For instance, while alae are somehow explained, annuli are not - structures that look abnormal in lpr1 and lpr1-scav2 mutants (Fig. 5B).

      Response

      • Apologies for this oversight. We added annuli labels to Figure 1 and Figure 5 panels and added descriptions of annuli to the Figure 1 legend and the Results text.

      Moreover, the authors show in Fig. 1 the punctae etc. in the epidermis, whereas in Fig. 2 they show Lpr3 accumulation or not in the duct and the pore (lpr1). How do they localize in the cells of these structures at high magnification? It is also important to see the Lpr3 localisation in lpr1 mutants shown in Fig. 2A with the quality of the images shown in Fig. 1F. This applies also to Figs. 4 and 5.

      Responses:

      • The embryonic duct and pore cells are very small and we have not reliably seen puncta within them. In Figs 2 and 5, we supplemented the duct and pore images with those from the epidermis, which is a much larger tissue, allowing us to resolve puncta and matrix structures with better resolution.
      • The laser settings in Figs 2,4,5 (as opposed to Fig. 1) were chosen to avoid saturation of the matrix signal so that we could do accurate quantifications as shown. The images are unmodified with respect to brightness and therefore appear relatively dim – but we think they convey the observations very accurately.

      I would like to see punctae in lpr1-scav2 doubles.

      Response:

      • Puncta in this genotype are shown for the epidermis in Figure 5. It has not been possible to see puncta specifically within the embryonic duct and pore.

      Regarding the central mechanism, one possibility is - what the authors describe - that Lpr1 is needed for Lpr3 accumulation in ducts and tubes. Alternatively, Lpr1 is needed for duct and tube expansion, in lack of which Lpr3 is unable to reach its destination that is the lumina. Scav2, in this scenario, might be antagonist of tube and duct expansion, and thereby rescue the Lpr1 mutant phenotype independently. Admittedly, the non-accumulation of Lpr3 in scav2 mutants argues against a lpr1-independent function of scav2.

      Responses:

      • LPR-1 is indeed needed to maintain duct and pore tube integrity as the tubes grow, but in mutants the tubes appear to collapse at a later stage than we imaged here (Stone et al 2009). The ~normal accumulation of LET-4 and LET-653 further argues that the duct and pore tubes are still intact at the 1.5-to-2-fold stages. Therefore, we conclude that the defect in LPR-3 accumulation precedes duct and pore collapse.
      • The changes we document in the epidermis also show that the lpr-1 mutant affects LPR-3 accumulation in another (non-tube) tissue.

      In any case, to underline the aspect of Lpr1-Scav2 dosage relationship, the authors may also have a look at Lpr3 distribution in lpr1 heterozygous, and lpr1-scav2 double heterozygous worms. In this spirit, it would be interesting to see the semi-dominant effects of scav2 on Lpr3 localisation in lpr1 mutants by microscopy.

      Response:

      • Because of the hermaphroditism of C. elegans, it would be technically challenging to confidently identify heterozygous (vs. homozygous) embryos for confocal imaging. We do not think that the results would be informative enough to warrant the effort, given that we’ve already shown that scav-2 heterozygosity can partly suppress lpr-1. The expectation is that LPR-3 levels would be partially restored in the scav-2 het, but it might take a very large sample size to confidently assess that partial effect.

      One word to the overexpression studies: it is surprising that the amounts of Scav2 delivered by the expression through the grl-2 promoter in the lpr1, scav2 background are almost matching those by the opposite effect of scav2 mutations on lpr1 dysfunction.

      Response:

      • The reviewer refers to the transgenic rescue experiment with the grl-2pro::SCAV-2 transgene. Because the scav-2 mutant phenotype being tested is suppression of lpr-1 lethality, the expected result from scav-2 rescue is to restore the lpr-1 lethal phenotype to the strain. This is exactly the result we see. We have revised the text to more clearly explain the logic.

      One issue concerns the localization of scav2-gfp "rarely" in vesicles: what are these vesicles?

      Response

      • Only a handful of vesicles were seen across all the images we collected, and we have not yet identified them. They could be associated with either SCAV-2 delivery or removal from the plasma membrane, as now stated in the text. SCAV-2 trafficking would be an interesting area for further study but is beyond the scope of this paper.

      One comment to the Let653 transgenes/knock-ins: the localization of transgenic Let653-gfp may be normal in lpr1 mutants because there are wild-type copies in the background.

      Response

      • There are wild type copies of LET-653 in the background, but no wild type copies of LPR-1. Even if the untagged LET-653 would be recruiting the tagged LET-653 as the reviewer suggests, we can still conclude that lpr-1 loss does not prevent the untagged LET-653 (and thus also the tagged LET-653) from accumulating in the duct lumen matrix.

      One thought to the model: if Scav2 has a function in a lpr1 background, this means that yet another transporter X delivers the substrate for Scav2, isn't it?

      Response

      • Yes, we completely agree with this interpretation and have revised the discussion and Figure 8 legend to more explicitly make this point.

      A word to the term haploinsufficient that is used in this study: scav2 mutants would be haploinsufficient if the heterozygous worms died in an otherwise wild-type background.

      Response

      • We disagree with this comment. The term “haploinsufficient” simply means that heterozygosity for a deletion or other loss of function allele can cause a mutant phenotype – the term is not restricted to lethal phenotypes.

        Reviewer #2 (Significance (Required)):

        Alexandra C. Belfi and colleagues wrote the manuscript entitled "Opposing roles for lipocalins and a CD36 family scavenger receptor in apical extracellular matrix-dependent protection of narrow tube integrity" in which they report on their findings on the genetic and cell-biological interaction between the lipid transporters Lpr1 and scav2 in the nematode C. elegans. In principle, these two proteins are involved in shaping the apical extracellular matrix (aECM) of ducts by regulating the amounts of Lpr3 in the extracellular space. While Scav2 seems to act cell autonomously, Lpr1 has a non-cell autonomous effect on Lpr3.


      Referee #3

      Summary: Using a powerful combination of genetic and quantitative imaging approaches, Belfi et al. describe novel findings on the roles of several lipocalins (secreted lipid carrier proteins) in the production and organization of the apical extracellular matrix (aECM) required for small diameter tube formation in C. elegans. The work comprises a substantial extension of previous studies carried out by the Sundaram lab, which has pioneered studies into the roles of aECM and accessory proteins in creating the duct-pore excretion tube and which also plays a role in patterning of the epidermal cuticle. One core finding is that the lipocalin LPR-1 does not stably associate with the aECM but is instead required for the incorporation of another lipocalin, LPR-3. A second major finding is that reduction of function in SCAV-2, a SCARB family membrane lipid transporter, suppresses lpr-1 mutant lethality along with associated duct-pore defects and mislocalization of LPR-3. Likewise loss of scav-2 partially suppresses defects in two other aECM proteins and restores defects in LPR-3 localization in one of them (let-653). Additional genetic and protein localization studies lead to the model that LPR-1 and SCAV-2 may antagonistically regulate one or more lipid or lipoprotein factors necessary for LPR-3 localization and duct-pore formation. A role for LPR-1 and LPR-3 at lysosomes is clearly implicated based on co-localization studies, although a specific role for lysosomes (or related organelles) is not defined. Finally, MS data suggest that neither LPR-1 nor SCAV-2 grossly affects lipid composition in embryos, consistent with dietary interventions failing to affect mutant phenotypes. Ultimately, a plausible schematic model is presented to explain much of the data.

      Major comments:

      1. The studies are very thorough, convincing, and generally well described. Conclusions are logical and well grounded. Additional experiments are not required to support the authors major conclusions, and the data and methods are described in a sufficient detail to allow replication. As such my comments are minor and should be addressable at the author's discretion in writing.

      Response

      • Thank you for these positive comments

      Minor comments:

        2) In the abstract, "tissue-specific suppression" made me think that there was going to be a tissue-specific knockdown experiment, which was not the case. Rather scav-2 suppression is specific to the duct-pore, which corresponds to where scav-2 is expressed. Consider rewording this.

      Response

      • Wording was changed to “duct/pore-specific suppression”

        3) Page 5. Suggest wording change to, "Whereas LPR-3 incorporates stably into the precuticle, suggesting a structural role in matrix organization, LPR-1..."

      Response

      • Done

        4) LIMP-2 versus LIMP2. Both are used. Uniprot lists LIMP2, but some papers use LIMP-2. Choose one and be consistent.

      Response

      • Everything changed to LIMP2.

        5) Some of the data for S6 Fig wasn't referred to directly in the text. Namely results regarding pcyt-1 and pld-1. I'd suggest incorporating this into the results section possibly using, "As a control for our lipid supplementation experiments..."

      Response

      • These experiments are now described on page 11.

        6) Page 12 bottom. I understand the use of "oppose", but another way to put it is that SCAV-2 and LPR-1 (antagonistically or collectively) modulate aECM composition. Other terms that might confuse some readers are "upstream" and "downstream", although I am OK with their use in the context of this work.

      Response

      • The genetics indicate that lpr-1 and scav-2 have opposite effects on tube shaping and LPR-3 localization, so they do function antagonistically rather than collectively/cooperatively; we decided to keep this terminology.

        7) Page 16. I understand the logic that SCAV-2 is unlikely to directly modulate LPR-3 given its presumed molecular function. But is it possible that LPR-3 levels are already maxed out in the aECM so that loss of SCAV-2 doesn't lead to any increase? Conversely, one could argue that even if acting indirectly, SCAV-2 could have led to increased LPR-3 levels, unless they were already maxed.

      Response

      • This is a good point and the possibility is now mentioned in the Results page 9. We also changed our wording in the Abstract and Discussion to acknowledge the possibility that LPR-3 could be the SCAV-2 cargo, though we still don’t favor this model.

        8) Figure legend 1. I did not see an asterisk in figure 1B.

      Response

      • thanks for catching this error, text removed

        9) Figure 1C. Might want to define the "degree" term in the legend for people outside the field.

      Response

      • We added an explanation to the figure legend.

        10) Fig 1 G. I was just wondering if cuticle autofluorescence was an issue for taking these images.

      Response

      • Cuticle autofluorescence is generally quite dim in L4s with our settings, and it was not an issue at this mid/late L4 stage, which corresponds to when both LPR fusions are at their brightest. Note that both large panels are MAX projections and yet you can’t see any cuticle autofluorescence in the LPR-1 panel.

        11) Fig 2 and others. Please define error bars.

      Response

      • These correspond to the standard deviation; this information is now added to the Methods.

        12) Fig 5. From the images, it looks like lpr-1; scav-2 doubles might have a worse (pre)cuticle defect in LPR-3 localization than lpr-1 singles. If so that would be interesting and would suggest that their relationship with respect to the modulation of LPR-3 is context dependent. Admittedly, the lack of obvious scav-2 expression in the epidermis would not be consistent with an effect (positive or negative).

      Response

      • The lpr-1 scav-2 strain is certainly not improved over lpr-1 but we have not noted any consistent worsening of the phenotype either.

        13) Consider defining Dodt in the first figure legend where it appears.

      Response

      • Dodt gradient contrast imaging is a method of transmitted light imaging similar to DIC and is used on some confocal microscopes. It is now explained in the Methods section. We removed the term from Figure 7 since it seems to be confusing.

        14) For Mander's, is there a reason to report just one of the two findings (M1 or M2) versus both?

      Response

      • We now include the 2nd Manders value in the figure legend and note that value is much lower (0.25) because much of the red signal is lysosomes (where green would be quenched by acidity).
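
      To illustrate why the two Manders coefficients can differ so much, here is a minimal sketch, not the authors' analysis pipeline. It assumes the common definition in which each coefficient is the fraction of one channel's total intensity found in pixels where the other channel is above a threshold. The function name, the thresholds, and the pixel values are all hypothetical.

```python
def manders(green, red, green_thr=0.0, red_thr=0.0):
    """Return (M_green, M_red) for two equal-length lists of pixel intensities.

    M_green: fraction of green intensity in pixels where red > red_thr.
    M_red:   fraction of red intensity in pixels where green > green_thr.
    Thresholds are assumptions; real analyses usually estimate them.
    """
    if len(green) != len(red):
        raise ValueError("channels must have the same number of pixels")
    total_green = sum(green)
    total_red = sum(red)
    if total_green == 0 or total_red == 0:
        raise ValueError("each channel needs some signal")
    # Green intensity that sits in red-positive pixels, over all green.
    m_green = sum(g for g, r in zip(green, red) if r > red_thr) / total_green
    # Red intensity that sits in green-positive pixels, over all red.
    m_red = sum(r for g, r in zip(green, red) if g > green_thr) / total_red
    return m_green, m_red


# Invented toy image: green occupies one pixel, red is spread over five.
green = [8, 0, 0, 0, 0]
red = [4, 4, 4, 4, 4]
m_green, m_red = manders(green, red)
print(m_green, m_red)
```

      In this toy case all of the green signal overlaps red, but most of the red signal lies where there is no green, so the two values diverge. This is the same kind of asymmetry the response describes for red signal in compartments where green is quenched.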

        15) Consider referring to specific panels (A, B...) within references to the supplemental files.

      Response

      • done

        16) Fig S6E. Neither "increasing nor increasing" to "increasing nor decreasing".

      Response

      • fixed

        Referees cross-commenting

        I thought that Reviewers 1 and 2 brought up some good points. My sense is that Belfi and colleagues can address most of these in writing, but are of course welcome to add new data as they see fit. I get that it's not a "perfect" paper where everything is explained fully or comes together, but I don't see that as a flaw that needs to be fixed. I think that the manuscript represents a good deal of work (as it is) and provides a sufficient advance while also suggesting an interesting link to disease. It will be up to individual journals to decide if the findings meet their criteria.

        Reviewer #3 (Significance (Required)):

        Significance: The work carried out in this paper, and more generally by the Sundaram lab, always has a ground-breaking element because very few labs in the field have studied in detail the developmental roles and regulation of the aECM, in large part because it can be challenging to dissect. The core findings in this study are rather novel and unexpected, namely the opposing roles of the paralogous LPR-1 and LPR-3 lipocalins and their functional interactions with SCAV-2. The study does stop short of finding specific molecules (lipid or lipoprotein) that would mediate the effects they report, and it wasn't yet clear how the lysosomal co-loc plays a role, but this is not a criticism of the work presented or the forward progress. I was particularly intrigued by the idea, presented in the discussion, that disruption of vascular aECM could potentially account for some of the (complex) observations regarding the role of lipocalins and SCARB proteins in human disease. This would represent a new avenue for researchers to consider and underscores the power of using non-biased approaches in model systems.

        As for all my reviews, this is signed by David Fay.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Using a powerful combination of genetic and quantitative imaging approaches, Belfi et al. describe novel findings on the roles of several lipocalins (secreted lipid carrier proteins) in the production and organization of the apical extracellular matrix (aECM) required for small diameter tube formation in C. elegans. The work comprises a substantial extension of previous studies carried out by the Sundaram lab, which has pioneered studies into the roles of aECM and accessory proteins in creating the duct-pore excretion tube and which also plays a role in patterning of the epidermal cuticle. One core finding is that the lipocalin LPR-1 does not stably associate with the aECM but is instead required for the incorporation of another lipocalin, LPR-3. A second major finding is that reduction of function in SCAV-2, a SCARB family membrane lipid transporter, suppresses lpr-1 mutant lethality along with associated duct-pore defects and mislocalization of LPR-3. Likewise loss of scav-2 partially suppresses defects in two other aECM proteins and restores defects in LPR-3 localization in one of them (let-653). Additional genetic and protein localization studies lead to the model that LPR-1 and SCAV-2 may antagonistically regulate one or more lipid or lipoprotein factors necessary for LPR-3 localization and duct-pore formation. A role for LPR-1 and LPR-3 at lysosomes is clearly implicated based on co-localization studies, although a specific role for lysosomes (or related organelles) is not defined. Finally, MS data suggest that neither LPR-1 nor SCAV-2 grossly affects lipid composition in embryos, consistent with dietary interventions failing to affect mutant phenotypes. Ultimately, a plausible schematic model is presented to explain much of the data.

      Major comments:

      The studies are very thorough, convincing, and generally well described. Conclusions are logical and well grounded. Additional experiments are not required to support the authors major conclusions, and the data and methods are described in a sufficient detail to allow replication. As such my comments are minor and should be addressable at the author's discretion in writing.

      Minor comments:

      1) In the abstract, "tissue-specific suppression" made me think that there was going to be a tissue-specific knockdown experiment, which was not the case. Rather scav-2 suppression is specific to the duct-pore, which corresponds to where scav-2 is expressed. Consider rewording this.

      2) Page 5. Suggest wording change to, "Whereas LPR-3 incorporates stably into the precuticle, suggesting a structural role in matrix organization, LPR-1..."

      3) LIMP-2 versus LIMP2. Both are used. Uniprot lists LIMP2, but some papers use LIMP-2. Choose one and be consistent.

      4) Some of the data for S6 Fig wasn't referred to directly in the text. Namely results regarding pcyt-1 and pld-1. I'd suggest incorporating this into the results section possibly using, "As a control for our lipid supplementation experiments..."

      5) Page 12 bottom. I understand the use of "oppose", but another way to put it is that SCAV-2 and LPR-1 (antagonistically or collectively) modulate aECM composition. Other terms that might confuse some readers are "upstream" and "downstream", although I am OK with their use in the context of this work.

      6) Page 16. I understand the logic that SCAV-2 is unlikely to directly modulate LPR-3 given its presumed molecular function. But is it possible that LPR-3 levels are already maxed out in the aECM so that loss of SCAV-2 doesn't lead to any increase? Conversely, one could argue that even if acting indirectly, SCAV-2 could have led to increased LPR-3 levels, unless they were already maxed.

      7) Figure legend 1. I did not see an asterisk in figure 1B.

      8) Figure 1C. Might want to define the "degree" term in the legend for people outside the field.

      9) Fig 1 G. I was just wondering if cuticle autofluorescence was an issue for taking these images.

      10) Fig 2 and others. Please define error bars.

      11) Fig 5. From the images, it looks like lpr-1; scav-2 doubles might have a worse (pre)cuticle defect in LPR-3 localization than lpr-1 singles. If so that would be interesting and would suggest that their relationship with respect to the modulation of LPR-3 is context dependent. Admittedly, the lack of obvious scav-2 expression in the epidermis would not be consistent with an effect (positive or negative).

      12) Consider defining Dodt in the first figure legend where it appears.

      13) For Mander's, is there a reason to report just one of the two findings (M1 or M2) versus both?

      14) Consider referring to specific panels (A, B...) within references to the supplemental files.

      15) Fig S6E. Neither "increasing nor increasing" to "increasing nor decreasing".

      As for all my reviews, this is signed by David Fay.

      Referees cross-commenting

      I thought that Reviewers 1 and 2 brought up some good points. My sense is that Belfi and colleagues can address most of these in writing, but are of course welcome to add new data as they see fit. I get that it's not a "perfect" paper where everything is explained fully or comes together, but I don't see that as a flaw that needs to be fixed. I think that the manuscript represents a good deal of work (as it is) and provides a sufficient advance while also suggesting an interesting link to disease. It will be up to individual journals to decide if the findings meet their criteria.

      Significance

      Significance:

      The work carried out in this paper, and more generally by the Sundaram lab, always has a ground-breaking element because very few labs in the field have studied in detail the developmental roles and regulation of the aECM, in large part because it can be challenging to dissect. The core findings in this study are rather novel and unexpected, namely the opposing roles of the paralogous LPR-1 and LPR-3 lipocalins and their functional interactions with SCAV-2. The study does stop short of finding specific molecules (lipid or lipoprotein) that would mediate the effects they report, and it wasn't yet clear how the lysosomal co-loc plays a role, but this is not a criticism of the work presented or the forward progress. I was particularly intrigued by the idea, presented in the discussion, that disruption of vascular aECM could potentially account for some of the (complex) observations regarding the role of lipocalins and SCARB proteins in human disease. This would represent a new avenue for researchers to consider and underscores the power of using non-biased approaches in model systems.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript is very well written; the documentation is fine, but some more details are needed for better following the subject for readers not familiar with nematode anatomy. For instance, while alae are somehow explained, annuli are not - structures that look abnormal in lpr1 and lpr1-scav2 mutants (Fig. 5B). Moreover, the authors show in Fig. 1 the punctae etc. in the epidermis, whereas in Fig. 2 they show Lpr3 accumulation or not in the duct and the pore (lpr1). How do they localize in the cells of these structures at high magnification? It is also important to see the Lpr3 localisation in lpr1 mutants shown in Fig. 2A with the quality of the images shown in Fig. 1F. This applies also to Figs. 4 and 5. I would like to see punctae in lpr1-scav2 doubles. Regarding the central mechanism, one possibility is - what the authors describe - that Lpr1 is needed for Lpr3 accumulation in ducts and tubes. Alternatively, Lpr1 is needed for duct and tube expansion, in lack of which Lpr3 is unable to reach its destination that is the lumina. Scav2, in this scenario, might be antagonist of tube and duct expansion, and thereby rescue the Lpr1 mutant phenotype independently. Admittedly, the non-accumulation of Lpr3 in scav2 mutants argues against a lpr1-independent function of scav2. In any case, to underline the aspect of Lpr1-Scav2 dosage relationship, the authors may also have a look at Lpr3 distribution in lpr1 heterozygous, and lpr1-scav2 double heterozygous worms. In this spirit, it would be interesting to see the semi-dominant effects of scav2 on Lpr3 localisation in lpr1 mutants by microscopy. One word to the overexpression studies: it is surprising that the amounts of Scav2 delivered by the expression through the grl-2 promoter in the lpr1, scav2 background are almost matching those by the opposite effect of scav2 mutations on lpr1 dysfunction.

      One issue concerns the localization of scav2-gfp "rarely" in vesicles: what are these vesicles?

      One comment to the Let653 transgenes/knock-ins: the localization of transgenic Let653-gfp may be normal in lpr1 mutants because there are wild-type copies in the background.

      One thought to the model: if Scav2 has a function in a lpr1 background, this means that yet another transporter X delivers the substrate for Scav2, isn't it?

      A word to the term haploinsufficient that is used in this study: scav2 mutants would be haploinsufficient if the heterozygous worms died in an otherwise wild-type background.

      Significance

      Alexandra C. Belfi and colleagues wrote the manuscript entitled "Opposing roles for lipocalins and a CD36 family scavenger receptor in apical extracellular matrix-dependent protection of narrow tube integrity" in which they report on their findings on the genetic and cell-biological interaction between the lipid transporters Lpr1 and scav2 in the nematode C. elegans. In principle, these two proteins are involved in shaping the apical extracellular matrix (aECM) of ducts by regulating the amounts of Lpr3 in the extracellular space. While Scav2 seems to act cell autonomously, Lpr1 has a non-cell autonomous effect on Lpr3.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: In this study, the authors show a genetic interaction of the lipid receptors Lpr-1, Lpr-3 and Scav-2 in C. elegans. They show that Lpr-1 loss-of-function specifically affects aECM localization of Lpr-3 and attribute the lethality of Lpr-1 mutants to this phenotype. The authors performed a mutagenesis screen and identified a third lipid receptor, Scav-2, as a modulating factor: loss of scav-2 partially rescues the Lpr-1 phenotype. The authors created a variety of tools for this study, notably CRISPR-Cas9-mediated knock-ins for endogenous tagging of the receptors.

      Major comments: While the authors provide a nice diagram showing the potential roles and interplay of lpr-1, lpr-3 and scav-2, it remains unclear what their respective cargo is. The nature of the interaction between the proteins remains unclear from the data. As an optional (since time-consuming) experiment, I would suggest trying more tissue-specific lipidomics. The lipidomics data should be presented in the figures, even if there were no significant changes. Importantly, show the lipid abundance at least of total lipids, better of individual classes, normalized to the material input (e.g. number of embryos, protein). Figure 1g: I do not understand what the lpr3:gfp signal is: the puncta in the overview image? And where are they in the zoom image showing annuli and alae? Also, how were the annuli and alae structures labeled? Please provide more information. One point that is not sufficiently addressed is that the authors deduce from the inability of the scav-2 gfp knock-in to suppress lpr1 lethality that scav2 function is not impaired. This is quite indirect. Can the authors provide more convincing evidence that the scav-2 knock-in has normal function? In general, the data are clearly presented and the statistical analyses look sound.

      Minor comments: Please provide page and line numbers! Avoid contractions like "don't" in both text and figure legends. Page 12: I do not understand the meaning of the sentence "This transgene also caused more modest lethality in a wild-type background". Figure 7: what is meant by "Dodt"?

      Significance

      The study is experimentally sound and uses numerous novel tools, such as endogenously tagged lipid receptors. It is an interesting study for researchers in basic research studying lipid receptors and ECM biology. It provides insights on the genetic interaction of lipid receptors.

      My expertise is in lipid biochemistry, inter-organ lipid trafficking and imaging. I am not very familiar with C. elegans genetics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Below we address all the comments by the reviewers. However, the figures that were used in our response are unfortunately not displayed in this format.

      Reviewer #1

      Evidence, reproducibility and clarity

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. Although the authors' report provides a step forward in our understanding of translational buffering, this reviewer found a series of concerns in this paper. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      Major comments: 1. This paper heavily relies on the reference 18. However, this paper was not properly stated (no page or journal number); the study in Bioinformatics is nowhere to be found on the website, despite being out in 2024 apparently. Either title is wrong (yet a biorxiv can be found). This reviewer guessed that the reference 18 may be accepted. However, without a proper reference, this paper could not be judged since nearly all the parts of this work have been based on the reference 18. Also, the Ribobase data used in this manuscript comes from this reference, so it had better be well defined, especially when another Ribobase data set seems to be available online: http://www.bioinf.uni-freiburg.de/~ribobase/index.html

      We apologize for the citation issue. This citation, Liu et al., 2024 (18), was a preprint from bioRxiv. This manuscript is now published in Nature Biotechnology. The reference has been updated in the revised version of the manuscript, where it appears as Liu et al., 2025 (23).

      In the Discussion, the authors mentioned "TE is based on a compositional regression model (18) rather than the commonly applied approach of using a logarithmic ratio of ribosome occupancy to mRNA abundance." This important information should be mentioned in the early section of the manuscript. Related to this, there are other published methods for exploring change in translation efficiency (e.g., 10.1093/bioinformatics/btw585; 10.1093/nar/gkz223) that could also be suitable in this context. It is not entirely clear if their approach is better than before. Again, the improper reference to 18 made our assessment of this work difficult.

      We apologize and acknowledge the impact of the citation issue on this point. In Liu et al (2025), we have provided a comparison between our approach and the log-ratio strategy. We also agree that additional context was needed within the current study. Hence, we have now included more detailed information about the TE calculations in the initial results section (line 94).

      As noted by the reviewer, several other methods have been developed previously for measuring changes in translation efficiency. These methods are designed to be used in cases of paired designs where there is a treatment or manipulation that is assayed along with controls. While these methods are highly valuable in assessing differential TE, they are unable to accommodate the type of meta-analyses described in our study. In particular, we do not report changes/differential TE with respect to a control sample but instead focus on the coordinated patterns of TE across experiments. We now note this important distinction in the manuscript in the discussion section (line 494).

      The paper mainly relies on detecting a set of buffered genes using mRNA-TE correlation and MAD ratios (Ribo-Seq/RNA-Seq). While the concept seems sound, the authors should ensure that this method is reliable. Several controls could be used to confirm this. First, if any studies in humans or mice have described a set of genes as buffered, it would be worth checking for overlap between the authors' set of 'TB high' genes and the previously established list. Furthermore, the authors could use packages explicitly developed for translational buffering detection, such as anota2seq (https://academic.oup.com/nar/article/47/12/e70/5423604?login=true). Not all of the data used by the authors may be suitable for such packages, but the authors could at least partially use them on some of their datasets and see whether the buffered genes reported by these packages match their predictions.

      We thank the reviewer for this constructive suggestion. To the best of our knowledge, no prior study in humans or mice has systematically analyzed translational buffering across a wide range of conditions. As a result, defining a gold-standard set for benchmarking is currently not feasible.

      While packages such as anota2seq have proven highly valuable for identifying buffering effects in controlled experimental designs (e.g., comparing a treatment to a matched control), they are not readily applicable to the type of large-scale meta-analysis we present here. Our study integrates ribosome profiling and RNA-seq data across diverse datasets and conditions, which lies outside the design scope of such tools.

      The most relevant point of comparison to our work is Wang et al. 2020 Nature, which examined a related but distinct form of translational buffering across species for a given tissue. We now present the overlap of genes identified as buffered in our study vs Wang et al. 2020. The details are presented in the reviewer's comment 5-2.

      The threshold of 'TB high' or 'TB low' (top and bottom 250) is somewhat arbitrary. Why not top 100 or 500? The authors should provide a rationale for this choice. Also, they could include a numeric measure of buffering (the sum of the two rankings is probably suitable for this purpose). Several of the authors' explorations are suitable for numerical quantification (GO enrichment can be turned into GSEA, and the boxplot can be shown as correlations)

      Thanks for these suggestions. We agree that the thresholds used to define TB high and TB low are somewhat subjective. Changing this cutoff as suggested is easily achievable with the provided R script, which can be used to reproduce all of the reported analyses of translational buffering with different cutoffs.

      To further assess whether our conclusions are robust to the selection of these thresholds, we tested several different values to define the TB high and TB low groups. As an example, we show here that the effect on protein variation and the association of intrinsic features, such as UTR lengths, with the buffering potential of genes remain similar to those at the current cutoff of 250 when other thresholds are used (i.e., TB high = top 100 or TB high = top 200). However, if we increase the cutoff of TB high to 2000 and TB low to top 2000-4000, the difference between the various features is diminished (Figure A & B). Further, protein variation (human cancer cell line and tissue) also becomes more similar across the three categories, possibly indicating a reduced regulatory potential of genes as their rank increases (Figure C & D). Our analyses reveal that highly ranked genes show associations with particular features, indicating an underlying hierarchy in translational buffering potential. This point is now discussed in the manuscript (line 177).
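      The cutoff dependence discussed above can be illustrated with a minimal sketch. It assumes, based on the description elsewhere in this exchange, that genes are ordered by the sum of two rankings (mRNA-TE correlation and the MAD ratio of ribosome occupancy to RNA), with lower sums meaning stronger buffering. The function names and toy values are hypothetical and are not taken from the authors' R script.

```python
def rank_values(values):
    """Return 1-based ascending ranks; ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def buffering_order(genes, rna_te_corr, mad_ratio):
    """Order genes by the sum of two rankings (lowest sum first).

    rna_te_corr: per-gene correlation between RNA abundance and TE
                 (more negative is taken as stronger buffering).
    mad_ratio:   per-gene MAD(ribosome occupancy) / MAD(RNA abundance)
                 (smaller is taken as stronger buffering).
    """
    corr_rank = rank_values(rna_te_corr)  # most negative correlation -> rank 1
    mad_rank = rank_values(mad_ratio)     # smallest ratio -> rank 1
    summed = [c + m for c, m in zip(corr_rank, mad_rank)]
    order = sorted(range(len(genes)), key=lambda i: summed[i])
    return [genes[i] for i in order]


def tb_high(ordered_genes, cutoff=250):
    """Hypothetical helper: the top `cutoff` genes of the final ranking."""
    return ordered_genes[:cutoff]


# Toy example with four hypothetical genes.
genes = ["a", "b", "c", "d"]
ordered = buffering_order(genes, [-0.8, 0.5, -0.2, 0.9], [0.3, 1.1, 0.6, 0.9])
print(ordered)
print(tb_high(ordered, cutoff=2))
```

      Re-running `tb_high` with cutoffs such as 100, 200 or 2000 on the same ordering is all that is needed to repeat a downstream comparison at a different threshold.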

      Legend: Effect of different thresholds on A. length features, B. median RNA expression, C. protein variation in human cancer cell lines, and D. protein variation in primary human tissues.

      In response to the reviewer's suggestion of presenting data using numerical quantitation, we incorporated several additional inclusions in the manuscript.

      i. We now report the association of CDS / UTR length with translational buffering as a function of translational buffering rank, with highly ranked genes showing associations with particular features, indicating an underlying hierarchy in translational buffering potential (Sup Fig 3 A-B).

      ii. We now include scatter plots which show that highly ranked genes have lower variation at the protein level in both cancer cell lines and primary tissues (Sup Fig. 6 A-C).

      iii. We have now carried out modified GO enrichment analyses. Specifically, Gene Ontology enrichment analysis was performed for the TB high genes in human and mouse using the clusterProfiler R package. Lists of TB high genes in human or mouse were analyzed against the Gene Ontology (GO) database using the enrichGO() function, with the organism-specific annotation database (org.Hs.eg.db for human or org.Mm.eg.db for mouse) as reference. Gene identifiers were supplied as gene symbols, and all genes in the current study were used as the background universe. Enrichment was carried out for the Biological Process (BP) ontology, with significance assessed by the hypergeometric test. P-values were adjusted for multiple testing using the Benjamini–Hochberg method, and terms with an adjusted p-value

      Legend: Gene Ontology (GO) enrichment analysis of the TB high gene set, performed with the clusterProfiler R package. Enriched GO Biological Process terms are shown after redundancy reduction using clusterProfiler::simplify. Each dot represents a GO term, with dot size indicating the number of genes associated with the term and color reflecting the adjusted p-value (Benjamini–Hochberg correction). Only the top non-redundant terms are displayed.
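      The statistics named above (a hypergeometric over-representation test followed by Benjamini–Hochberg adjustment) can be sketched in a few lines. This is an illustration of the general technique only, not of clusterProfiler's internals; the function names and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n_set, n_term, n_universe):
    """P(X >= k): probability of seeing at least k term-annotated genes
    when n_set genes are drawn from a universe of n_universe genes,
    n_term of which carry the term."""
    total = comb(n_universe, n_set)
    upper = min(n_set, n_term)
    tail = sum(
        comb(n_term, i) * comb(n_universe - n_term, n_set - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 12 of 250 set genes carry a term that annotates
# 100 of 10000 background genes.
p = hypergeom_pvalue(12, 250, 100, 10000)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

      Using all genes in the study as `n_universe`, as the response describes, matters because the background size directly sets the expected overlap.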


      Additionally, we performed Gene set enrichment analysis using the list of genes ordered according to their RNA-TE correlation. Hence lower ranks have lower RNA-TE correlations. The GSEA plots show significantly enriched Gene Ontology Biological Process (GO:BP) terms at the lower ranks of the ordered gene list. Together, these analyses further emphasize the observation that genes involved in macromolecular complexes are translationally buffered.


      Legend: Curves represent the enrichment score (ES) across the ranked gene list, with vertical bars indicating the positions of pathway-associated genes. The enrichment was identified using the gseGO() function from clusterProfiler.

      Several of the statements of the authors in the Introduction or Discussion sections are not entirely true regarding the literature on the topics, or lack major papers on the topic, and therefore, they are a bit misleading. Among others, here are some:

      We thank the reviewer for the suggestions, which have now been incorporated into the revised manuscript.

      5-1 "In addition, genetic differences arising from aneuploidy, cell type differences or variability observed in the natural population can further determine the amplitude of variation (4-7). The effect of mRNA variation under these conditions is mostly reflected at the protein levels (2, 4-8).". Several recent or more ancient papers suggest that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level: DOI: 10.1038/s41586-024-07442-9 DOI: 10.1073/pnas.2319211121 DOI: 10.1016/j.cels.2017.08.013 DOI: 10.15252/msb.20177548

      We agree that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level for some genes. This point has now been revised in the introduction. We have incorporated all the suggested literature into the revised manuscript (line 38).

      5-2: The authors should also consider mentioning these studies and softening their initial statement. "Similarly, translational buffering of certain genes have been reported in mammalian cells, specifically under estrogen receptor alpha (ERα) depletion conditions (16).". Translational buffering has been deeply explored in mammalian tissues and even across several mammalian species in this study (DOI: 10.1038/s41586-020-2899-z). In this, the authors also provide a nice exploration of the gene characteristics that are associated with translational buffering. The authors should mention it and compare the study's findings to theirs ultimately.

      We thank the reviewer for this suggestion. We have now cited the recommended study in the revised manuscript (line 65). Here, we provide a comparison of its findings with ours. While this related work offers important insights into translational buffering, its focus is on buffering across species within a given tissue, whereas our study emphasizes buffering across conditions, cell types, and treatments within a species. Despite this difference in focus, the comparison is highly informative, and we now highlight both the similarities and distinctions between the two studies in the relevant section of the revised manuscript.

      Wang et al. calculate the variation at the transcriptome level versus the translatome level, represented as a delta (∆) value for each gene. A lower value represents lower variation at the ribosome occupancy level than at the mRNA level across various species. We classified the genes in the Wang et al. study as TB high, TB low, or others as identified in the current study, while indicating the calculated ∆ from Wang et al. Many of the genes with a lower delta value (are delta ∆

      Legend: Dot plots highlighting the delta value of all genes in the Wang et al. study (also present in RiboBase), grouped as TB high, TB low or others, in (A) brain and (B) liver.

      5-3: "Differences in species evaluated and statistical methods have resulted in conflicting interpretations (13, 28).". These conflicting results have been previously discussed in reviews on the topic that would be worth mentioning: DOI: 10.1016/j.cell.2016.03.014 DOI: 10.1038/s41576-020-0258-4

      We have added these reviews at the appropriate location of the manuscript.

      1. In addition to the p-values stated in the main text, the authors should annotate their plots when they find significant differences between groups to greatly facilitate the visual interpretation of the graphs.

      We have now annotated many of the relevant graphs with p-values to facilitate visual interpretation, adding them where space and figure design allow.

      Based on the data of Figure 4D, apparently, ribosome occupancy was not buffered even in high TB sets. The authors may argue that translational buffering may not cope with such a strong mRNA reduction. In that case, how big a difference in mRNA level does the buffering system adjust in protein synthesis? The authors should test gradual gene knockdown and/or overexpression and conduct Ribo-Seq/RNA-Seq to survey the buffering range.

      We appreciate the reviewer’s suggestion regarding the experiment to determine the buffering range. To understand this for multiple genes, we attempted a series of knockdowns using a CRISPR/gRNA strategy based on a multiCas12a approach. We targeted 8 buffered and 2 non-buffered genes using a 10-plex crRNA along with 10-plex gRNA serving as a negative control (Figure below). The fold change at the mRNA level of the targeted genes was within the variation range observed in replicates for other non-targeted genes. The challenge in performing a gradual knockdown is that subtle changes in RNA expression fall within the margin of error of estimation, making it difficult to understand the clear implications of the mRNA levels on buffering. Hence, the precise experimental manipulation of mRNA expression levels that would be conducive to translational buffering remains highly technically challenging. As noted in our manuscript (Figure 4D), the conventional approaches for manipulation of transcript abundance lead to larger changes than typically observed as a result of natural variation.

      Legend: Validation of translational buffering by targeted knockdown of genes. A. The scatter plot shows the coefficient of variation of mRNA and ribosome occupancy between HEK293T cells targeted with sgRNA of different efficiencies. The genes indicated in blue are buffered and those in green are non-buffered genes. B. The plot shows the fold change in mRNA abundance and ribosome occupancy as compared to cells that were infected with the non-targeting crRNA array control (ratio of cpm in test vs control). Each color represents a gene and each point of a gene represents cells targeted by one of the four CRISPR arrays.

      "differential transcript accessibility model" could not be functional if mRNA is reduced beyond the accessible pool (i.e., less than the threshold, all the mRNAs are translated without buffering). The authors should carefully reconsider this model and the effective range of mRNAs.

      We agree with the reviewer that according to the 'differential transcript accessibility model,' transcripts with abundances below a certain threshold should be completely accessible to the translational pool. Further, this could also be true for the other model, wherein the initiation rate cannot increase beyond a particular threshold for transcripts of very low abundance. However, our observations from the haploinsufficiency analysis (Figure 4 B & C) and the siRNA knockdown analysis from RiboBase (Figure 4 D) suggest that buffering might be possible within a given range of transcript abundance. Testing the buffering range by serial knockdowns might help in determining the threshold at which transcripts exhibit buffering. However, the challenges of serial knockdown discussed above make this analysis difficult with ribosome profiling and matched RNA-seq. An alternative approach could involve imaging translating and non-translating mRNA of buffered genes in different cells, which may help distinguish the two models. However, this falls outside the scope of the manuscript.

      Minor comments:

      1. Some figures are of poor quality as they seem to have points outside of the panel representations... Like Figure 3C, one point is out of the square, same for Figure 4E. Similarly, on figure 5F, some outliers seem to be clearly cut from the figure (maybe not, but then the author should put a larger space between the end of the figure and the max y points). Same for panel S2D and S6D, this does not sound so rigorous.

      We agree and apologize for this issue. The axes of the figures have been annotated appropriately to indicate the presence of outliers in the figures.

      1. There are several typos or weird sentences. Here are some (but maybe not all):

      2-1: [...]with lower sums corresponding to higher final ranks. "two rankings". Based on these final ranks[...]

      2-2: For each dataset, median absolute deviation (MAD) "i" protein abundance was calculated across samples

      2-3: [...]neighbor method implemented in the MatchIT package (38) Differences in protein[...] a point is missing here.

      2-4: Additionally a second dataset providing predictions of haploinsufficiency (pHaplo score) and triplosensitivity (pTriplo score) for all autosomal genes (25) was used to asses the distribution of these score"S" across buffered and non-buffered gene sets . There is a missing "s" at "score" and there is a space between the last word and the final point.

      The necessary corrections have been incorporated in the revised version of the manuscript.

      1. In the "Lymphoblastoid cell line data analysis:" section, this reviewer wonders why the authors used a different method to calculate buffering compared to before.

      The main reason is the limited sample size of the lymphoblastoid cell line dataset. To evaluate the potential for translational buffering of genes from RiboBase, we used two metrics. The first was the negative correlation between translation efficiency and RNA abundance across samples. The second relied on the ratio of variation in ribosome occupancy to variation in RNA levels, for which the median absolute deviation (MAD) served as a robust metric of dispersion across heterogeneous samples. For the smaller lymphoblastoid cell line dataset, we decided that the coefficient of variation (CV) would be a better indicator of dispersion than MAD, also because the data in that study were normalized using counts per million (CPM) rather than the centered log-ratio (clr) normalization used in RiboBase. This CV ratio allowed us to assess the effect of natural variation in RNA abundance on ribosome occupancy.
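      For readers less familiar with the two dispersion measures, the following minimal sketch contrasts MAD with CV and computes a per-gene CV ratio of ribosome occupancy to RNA abundance. The function names and the CPM values are hypothetical; the sample standard deviation is an assumption, since the response does not state which estimator was used.

```python
from statistics import mean, median, stdev


def mad(values):
    """Median absolute deviation around the median."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumes mean != 0)."""
    return stdev(values) / mean(values)


def cv_ratio(ribo_cpm, rna_cpm):
    """CV of ribosome occupancy relative to CV of RNA abundance.

    Values well below 1 mean ribosome occupancy varies less than RNA,
    which is the pattern expected for a buffered gene.
    """
    return coefficient_of_variation(ribo_cpm) / coefficient_of_variation(rna_cpm)


# Hypothetical CPM values for one gene across three individuals.
rna = [10.0, 20.0, 30.0]
ribo = [19.0, 20.0, 21.0]
print(cv_ratio(ribo, rna))
print(mad([1, 2, 3, 4, 100]))
```

      The last line shows why MAD is described as robust: a single extreme value barely moves it, whereas it would inflate a standard deviation.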

      1. "Samples which had R2 less than 0.2 were removed as the residuals calculated for these samples could be unreliable". These samples for which the correspondence between RNA-Seq and Ribo-Seq is low wouldn't be the ones most impacted by translational buffering? Is it sure that the authors are not missing something here?

      We agree with the reviewer that genes that show translational buffering may not conform to linear relationships between the two parameters. However, the proportion of genes exhibiting this buffering effect is not expected to significantly influence the overall regression fit. Instead, we hypothesized that low quality samples or truly different relationships between the two parameters can make this relationship nonlinear, rendering it unsuitable for linear regression analysis for calculation of TE.
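      The sample filter under discussion can be sketched as follows. Ordinary least squares is used here only as a simple stand-in: the manuscript's TE estimate is described as coming from a compositional regression model, whose details are not given in this exchange. The function names and toy values are hypothetical.

```python
def _moments(x, y):
    """Means and centred sums of squares/products for two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy


def r_squared(rna, ribo):
    """R^2 of a simple linear fit of ribosome occupancy on RNA abundance."""
    _, _, sxx, syy, sxy = _moments(rna, ribo)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def residuals(rna, ribo):
    """Per-gene residuals of the fit (a stand-in for TE in this sketch)."""
    mx, my, sxx, _, sxy = _moments(rna, ribo)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(rna, ribo)]


def keep_reliable_samples(samples, min_r2=0.2):
    """Drop samples whose fit has R^2 below min_r2 (0.2 in the quoted text)."""
    return {
        name: (rna, ribo)
        for name, (rna, ribo) in samples.items()
        if r_squared(rna, ribo) >= min_r2
    }


# Hypothetical samples: each maps to (RNA values, ribosome occupancy values).
samples = {
    "good": ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.5]),
    "poor": ([1.0, 2.0, 3.0, 4.0], [5.0, 1.0, 1.0, 5.0]),
}
print(sorted(keep_reliable_samples(samples)))
```

      Because R² is computed over all genes in a sample, a modest number of buffered genes shifts it only slightly, which is the point the response makes about the overall fit.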

      To address these possibilities, we first analysed a commonly used proxy for data quality. Given the characteristic movement of ribosomes across mRNAs, periodicity of sequencing reads is a useful metric to assess whether reads are randomly fragmented, as in RNA-seq, or specifically represent ribosome-protected footprints. For this, we compared two groups: samples that were removed (~30) and those retained for analysis. We plotted the distribution of periodicity scores for all samples in both groups. For the calculation of periodicity scores, first the percentage of reads mapped to the dominant frame position across the dynamic ribosome footprint read length range was calculated for each sample. The periodicity score was calculated by taking the weighted sum of these dominant percentages, with weights based on the total read counts at each length.
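      The periodicity score described above can be written as a short sketch. It assumes that the weight for each read length is that length's share of all reads in the sample; the response only says the weights are "based on the total read counts at each length", so this normalization is an assumption. The input layout and function name are hypothetical.

```python
def periodicity_score(frame_counts_by_length):
    """Weighted sum of dominant-frame percentages across read lengths.

    frame_counts_by_length maps a footprint read length to a tuple of
    read counts in reading frames 0, 1 and 2.
    """
    total_reads = sum(sum(counts) for counts in frame_counts_by_length.values())
    if total_reads == 0:
        raise ValueError("sample contains no reads")
    score = 0.0
    for counts in frame_counts_by_length.values():
        length_total = sum(counts)
        if length_total == 0:
            continue  # a read length with no reads contributes nothing
        dominant_pct = 100.0 * max(counts) / length_total
        weight = length_total / total_reads  # assumed weighting scheme
        score += weight * dominant_pct
    return score


# Hypothetical sample with two footprint lengths.
print(periodicity_score({28: (80, 10, 10), 29: (30, 60, 10)}))
```

      Under this weighting, reads spread evenly over the three frames give a score near 33, while strongly phased footprints approach 100.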

      The results indicate that the removed samples did not have lower periodicity scores, suggesting that their quality in terms of periodicity was comparable to the retained samples.

      To assess the second possibility, we checked whether the studies involved major perturbations, which may skew the relationship towards nonlinearity. The 30 samples that were removed came from 14 unique studies, and 18 of these samples involved a perturbation that possibly affected either of the two parameters. In addition to the genetic/pharmacological perturbations specific to the study, the overall conditions of the cells during an experiment could influence this relationship. Another point to note is that many of the filtered-out samples are HeLa and HEK293T cells, which show a normal relationship between ribosome occupancy and RNA abundance in the majority of cases.

      These considerations suggest that removing these samples is most appropriate, as their inclusion could bias the TE calculations.

      For Figure 4B and 4C, the authors should provide statistical tests and p-values to confirm the observed trends.

      The haploinsufficiency and triplosensitivity analyses are now supported by a chi-squared test. The details of the statistical test are now mentioned in the text and the p-values have been noted on the respective figures.
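      As an illustration of the kind of test mentioned above, the sketch below computes a chi-squared test of independence on a contingency table of gene category by haploinsufficiency status. The table layout and counts are hypothetical, since the response does not give them. The p-value helper uses the closed form that holds only for an even number of degrees of freedom and assumes every expected count is non-zero.

```python
from math import exp


def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_squared_pvalue_even_df(stat, df):
    """Upper-tail p-value; valid only when df is a positive even integer."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total


# Hypothetical counts: rows are TB high, TB low, other;
# columns are haploinsufficient, not haploinsufficient.
table = [[10, 20], [20, 10], [15, 15]]
stat, df = chi_squared_statistic(table)
print(stat, df, chi_squared_pvalue_even_df(stat, df))
```

      A three-category by two-outcome table has two degrees of freedom, which is why the even-df shortcut is sufficient for this particular layout.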

      In Figure 2A, the "all genes" color doesn't correspond to the point color.

      The color in the figure has been modified in the revised version of the manuscript.

      1. "To understand if codon usage patterns are[...]". This comes slightly out of the blue. The authors could maybe explain why codon usage should be explored for translational buffering. The authors should cite recent key works in the fields: DOI: 10.1016/j.celrep.2023.113413 DOI: 10.1101/2023.11.27.568910

      We would like to thank the reviewer for their suggestion. The references have been incorporated in the revised version of the manuscript. We have now explained why codon usage could be a contributor in determining the translational buffering potential (line 190).

      "The change in each metric was calculated by subtracting the mean value in the control samples from that in the knockdown samples. This yielded the differential mRNA abundance and ribosome occupancy resulting from gene knockdown.". This looks statistically weak. The authors should consider using more robust methods like DESeq.

      We thank the reviewer for the suggestion. We reanalyzed the selected studies using edgeR and the modified figure is included in the revised version of the manuscript (Figure 4D). The conclusion after this analysis remains essentially the same. In particular, translational buffering is ineffective when mRNA abundance is perturbed drastically. Additionally, the limited number of experiments with direct perturbation of buffered genes limits the generalizability of this observation. This limitation is included in the Results section (line 342).

      Legend: Scatter plot represents log2 fold change in RNA abundance and ribosome occupancy. Each point represents a gene and the fold change in its RNA and ribosome occupancy with respect to their controls. The line represents the line of equivalence. Buffered genes do not show less change in ribosome occupancy upon reduction in their RNA levels than other genes.

      1. "Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set, indicating that candidates in the buffered gene set are relatively well expressed due to the presence of a higher proportion of the codons observed in highly expressed genes". What do the authors mean by "relatively well expressed"? Abundantly expressed? This sentence and the causality under it is unclear and should be modified or better explained.

      We thank the reviewer for pointing out the lack of clarity in the sentence. We have now quantitatively measured the CAI in the three categories and modified the sentence to better explain the rationale in the revised version (line 183). “To understand if codon usage patterns are associated with translational buffering, we next analyzed codon properties across buffered and non-buffered human gene sets. The codon adaptation index quantifies how closely a gene’s codon usage aligns with that of highly expressed genes. Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set. Specifically, 28.4% of TB high genes, 14% of TB low genes and 9.3% of genes in the other category fall within the top decile (>90th percentile) of codon adaptation index.”
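      The top-decile summary quoted above can be reproduced in principle with the sketch below. It assumes the 90th percentile is taken over all genes together and then applied to each category; the response does not spell this out. The function name, the percentile interpolation method, and the toy values are hypothetical.

```python
from statistics import quantiles


def top_decile_share(cai_by_gene, category_by_gene):
    """Percentage of genes in each category whose CAI exceeds the
    90th percentile of CAI computed over all genes."""
    cutoff = quantiles(list(cai_by_gene.values()), n=10, method="inclusive")[-1]
    counts, hits = {}, {}
    for gene, cai in cai_by_gene.items():
        category = category_by_gene[gene]
        counts[category] = counts.get(category, 0) + 1
        if cai > cutoff:
            hits[category] = hits.get(category, 0) + 1
    return {cat: 100.0 * hits.get(cat, 0) / n for cat, n in counts.items()}


# Ten hypothetical genes with CAI values 0.1 to 1.0.
cai = {f"g{i}": (i + 1) / 10 for i in range(10)}
category = {f"g{i}": ("TB high" if i >= 8 else "other") for i in range(10)}
print(top_decile_share(cai, category))
```

      Reporting the share per category, rather than a single mean CAI, makes the comparison insensitive to how many genes each category contains.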

      The panel 4D is unclear. Is one point associated with one gene? Or is it the average of several genes? If it's one point for one gene, it is important to clearly state it because the number of cases is therefore quite low, especially for the TB high and low.

      Each point and line are associated with a single gene. This is now clarified in the legend of the figure (line 364). The number of genes in this analysis is limited to the available ribosome profiling data with gene knockdown experiments.

      1. In Figure 2J, GGU (Gly), AAG (Lys), and ACU (Thr) provide negative effects on prediction, although these were enriched in the high TB set (Figure 2E). This contradiction should be explained.

      While this appears to be a contradiction, it is in line with what we expected. In particular, the objective of Figure 2J is to illustrate the features that predict the mRNA–TE correlation of genes, as identified using an LGBM model. The Spearman correlation shown reflects the relationship between each feature and the mRNA–TE correlation values. A negative correlation for codons such as GGU (Gly), AAG (Lys), and ACU (Thr) suggests that enrichment of these codons is associated with lower mRNA–TE correlation. This is in agreement with our observation in Figure 2E, which suggests that high TB genes are enriched in these codons. In contrast, transcript size exhibits a positive correlation, indicating that shorter transcripts tend to have lower mRNA–TE correlation values.

      Given that the choice of colors is a potential source of confusion, we have revised the text (line 230) and the figure (& legend) to try to clarify this relationship.

      The subtitle of "Translationally buffered genes exhibit variable association kinetics with the translational machinery in response to mRNA variation" sounds unfair to this reviewer. Since the authors did not work on kinetics directly, the use of this word is misleading.

      We agree and revised the subtitle to “The association of translationally buffered genes with the translational machinery varies in response to changes in mRNA abundance”.

      1. The explanation of Figure 5A "We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the "differential transcript accessibility model", mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the "initiation rate model", the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, the proportion of mRNA entering the translational pool would be comparable across buffered and non-buffered genes (Fig 5A)." is hard to understand. The authors should rewrite for a better understanding of the readers.

      This section has been rewritten in the revised version of the manuscript. The text now reads:

      “We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the “differential transcript accessibility model”, mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the “initiation rate model”, the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, as mRNA abundance increases, translation initiation on each transcript is reduced, thereby lowering the number of ribosomes per transcript. However, this mechanism allows a proportional increase in transcripts entering the translational pool for buffered genes, similar to non-buffered genes”

      Significance

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. A group of mRNAs whose expression variance is buffered at the translation level was comprehensively surveyed in humans and mice. The authors found a series of features in the translationally buffered genes, including high GC contents in the 5′ UTR, optimal codon usage, and mRNA length. The depletion or increase of one allele of the genes in the group may be particularly detrimental to cells. The authors' report provides a step forward in our understanding of translational buffering, appealing to the broad scientific community in basic and applied biology. However, this reviewer found a series of concerns in this paper, including clarity in the methods, experimental validation, referring the earlier works, etc. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      We thank the reviewer for noting the significance of the work and for their constructive feedback.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Rao and colleagues present a comprehensive analysis of translational buffering in human and mouse by mining 1515 matched ribosome profiling and RNAseq datasets from diverse tissues and cell lines. They define translational buffering as genes whose TE is negatively correlated with mRNA abundance across conditions, and further identify candidates by comparing median absolute deviations of ribosome occupancy versus mRNA levels. The authors find a conserved set of buffered genes enriched for components of multiprotein complexes, demonstrate that buffered genes exhibit lower protein variability and greater dosage sensitivity, and propose two non-mutually exclusive mechanistic models (differential accessibility and initiation rate modulation). Finally, they perform complementary fractionation experiments in HEK293T cells to support these models.

      These findings propose a novel, conserved mechanism of translational buffering that tunes gene expression in mouse and human, showing how intrinsic sequence features and cellular context cooperate to stabilize protein output across diverse conditions. However, further evidence is required to fully support the authors' conclusions, particularly direct validation of the proposed models of buffering.

      We thank the reviewer for their positive assessment and thoughtful suggestions that we address below.

      Below are my main concerns:

      1. The choice of the top 250 genes by Spearman correlation and MAD ratio as "TB high" seems arbitrary. The authors should justify these cutoffs (via permutation analysis or FDR control) and show that conclusions are robust to different thresholds.

      We agree that the threshold used to define TB high and TB low is somewhat subjective, and we now clearly acknowledge this in the discussion section (line 485). We now provide an R script that reproduces all analyses of translational buffering, where changing this cutoff to higher or lower values is straightforward.

      To ensure the robustness of our conclusions, we evaluated several thresholds for defining TB high and TB low. We observed that the conclusions hold within a reasonable range of values (100–250). For example, the effects on protein variation and the association of intrinsic features such as UTR lengths with buffering potential remain consistent when TB high is defined as the top 100 or the top 200 genes, compared with the current cutoff of 250. In contrast, when we define TB high as the top 2000 and TB low as ranks 2000–4000, the differences between the various features are diminished (Figure A and B). Further, protein variation (human cancer cell lines and tissues) also becomes more similar across the three categories, possibly indicating a reduced regulatory potential of genes as their rank increases (Figure C and D). Our results show that highly ranked genes consistently associate with specific features, suggesting an underlying hierarchy in translational buffering potential.

      Legend: Effect of different thresholds on: A. length features; B. median RNA expression; C. protein variation in human cancer cell lines; D. protein variation in primary human tissues.
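The threshold check described above can be illustrated with a minimal Python sketch. It is not the authors' R script. It assumes that genes are ranked in ascending order of both the mRNA–TE Spearman correlation and the MAD ratio (ribosome occupancy over mRNA), that the two ranks are summed, and that lower sums mean stronger buffering. The function names `combined_tb_rank` and `tb_high_sets` and the toy gene labels are hypothetical.

```python
def combined_tb_rank(corr, mad_ratio):
    """Order genes from most to least buffered by the sum of two ranks.

    Assumption: a more negative mRNA-TE correlation and a lower MAD ratio
    both indicate stronger buffering, so both are ranked in ascending order.
    """
    genes = sorted(set(corr) & set(mad_ratio))
    rank_sum = {g: 0 for g in genes}
    for metric in (corr, mad_ratio):
        ordering = sorted(genes, key=lambda g: metric[g])
        for position, gene in enumerate(ordering, start=1):
            rank_sum[gene] += position
    # Ties in the rank sum are broken alphabetically to keep the order stable.
    return sorted(genes, key=lambda g: (rank_sum[g], g))


def tb_high_sets(ordered_genes, cutoffs):
    """Return the "TB high" gene set for each candidate cutoff."""
    return {n: set(ordered_genes[:n]) for n in cutoffs}


# Toy values for illustration only (not data from the study).
corr = {"A": -0.9, "B": -0.5, "C": 0.1, "D": 0.6}
mad_ratio = {"A": 0.3, "B": 0.2, "C": 0.9, "D": 1.1}
ordered = combined_tb_rank(corr, mad_ratio)
sets_by_cutoff = tb_high_sets(ordered, [1, 2])
```

With real data, downstream comparisons (feature distributions, protein variation) would be repeated for each entry of `sets_by_cutoff` to see whether the conclusions change with the cutoff.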

      The modified compositional regression approach for TE and imputation of missing values are central to the study, but details are relegated to supplemental methods. The manuscript would benefit from a clear comparison of this method against standard log-ratio TE estimates, including sensitivity analyses to missing-data imputation strategies

      We thank the reviewer for the feedback. We have now added further description of the modified compositional regression and the imputation strategy in the results section (line 94). Comparison to standard log-ratio TE estimates and their limitations has already been detailed in Liu et al. 2025, Nature Biotechnology. Therefore, in the current manuscript we specifically focus on the effect of the imputation strategy.

      Specifically, the modified imputation slightly improved the concordance between the sets of genes identified as translationally buffered using the negative RNA–TE relationship or using the RNA–ribosome occupancy correlation (0.91 to 0.94). Further, we assessed the correlation between TE and protein abundance as measured by mass spectrometry from seven human cell lines (A549, HEK293, HeLa, HepG2, K562, MCF7 and U2OS). The protein measurements were obtained from PaxDb. The new imputation strategy slightly increased the mean correlation between TE and proteome abundance compared with the naive strategy. It specifically showed improved correlation for the HepG2, A549 and HeLa cell lines. 3507 genes common to PaxDb, Liu et al., 2025 and the current study were used for this analysis.

      Legend: Correlation between proteomics and TE by cell type, without or with the imputation strategy. Spearman correlation between protein abundance and compositional TE, calculated either as in Liu et al., 2025 from 68 samples from 11 studies (HEK293), 86 samples from 10 studies (HeLa), 58 samples from four studies (U2OS), 29 samples from five studies (A549), five samples from two studies (MCF7), seven samples from two studies (K562) and 10 samples from two studies (HepG2), or in the current study from 57 samples from 10 studies (HEK293), 82 samples from nine studies (HeLa), 58 samples from four studies (U2OS), 29 samples from five studies (A549), five samples from two studies (MCF7), one sample from one study (K562) and nine samples from two studies (HepG2). 3507 genes common to PaxDb, Liu et al., 2025 and the current study were used for this analysis.

      Human data are derived mainly from immortalized cell lines, whereas mouse data are from primary tissues. Pooling these heterogeneous sources may conflate cell type-specific regulation with intrinsic buffering. The authors should either stratify analyses by context or demonstrate buffering signatures remain consistent within more homogeneous subsets

      We thank the reviewer for the suggestion and agree that heterogeneity could potentially mask cell type-specific buffering effects. The TB-high genes we report are those that show consistent and robust expression across diverse contexts. However, unlike RNA-seq datasets, the current number of ribosome profiling samples per cell type is still limited, and a more comprehensive assessment of context-specific buffering will require larger datasets that will accumulate over time.

      Nonetheless, we have stratified the analysis by cellular context. Specifically, we grouped samples of the same cell type and repeated the buffering analysis. We provide a new table on GitHub listing the TB ranks of genes for the five cell types with the largest sample sizes.

      https://github.com/CenikLab/Translational-buffering/blob/Translational-Buffering/combined_tables.xlsx

      As an additional control, we compared buffering patterns between related and unrelated cell lines. For example, the correlation of TB ranks between related cell lines HEK293T (n = 98) and HEK293 (n = 57) is higher (0.46) than between either and an unrelated cell line, HeLa (n = 82). Similarly, the correlation between two liver cell lines, Huh7 (n = 39) and HepG2 (n = 9), is higher (0.20) than between Huh7 and a similarly sampled but unrelated lymphoblastoid cell line (LCL, n = 9; correlation = 0.05). While these analyses suggest that cell type-specific patterns may exist, their exploration is currently limited by sample size, as detecting buffering requires substantial variability in mRNA expression. We now highlight this as a limitation in the Discussion section (line 573).

      Legend: Spearman correlation between TB ranks of different pairs of cell lines. The first set indicates comparison with HEK293T. The second set indicates comparison between liver cells (HepG2 and Huh7).
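The pairwise comparison of TB ranks between cell lines can be sketched as follows. This is an illustrative, standard-library-only implementation of the Spearman correlation (Pearson correlation of average ranks) restricted to genes shared by two cell lines. The helper names and the toy gene identifiers are hypothetical, and the authors' own implementation may differ.

```python
from statistics import mean


def average_ranks(values):
    """Rank values in ascending order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def tb_rank_correlation(ranks_a, ranks_b):
    """Correlate TB ranks of two cell lines over the genes present in both."""
    shared = sorted(set(ranks_a) & set(ranks_b))
    return spearman([ranks_a[g] for g in shared], [ranks_b[g] for g in shared])


# Toy TB ranks for illustration only (gene labels are placeholders).
line_1 = {"G1": 1, "G2": 2, "G3": 3, "G4": 4}
line_2 = {"G1": 1, "G2": 3, "G3": 2, "G4": 4, "G5": 5}
rho = tb_rank_correlation(line_1, line_2)
```

Restricting to shared genes matters here because the set of genes with a usable TB rank can differ between cell lines with different sample sizes.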

      The HEK293T fractionation experiments offer preliminary support for both the "accessibility" and "initiation" models, but only slope analyses are shown. To validate these models, the authors should perform targeted reporter assays (dual luciferase constructs with 5′UTR swaps) or manipulations of initiation factors (eIF4E knockdown) to directly test how transcript abundance alters initiation rates versus pool entry

      We thank the reviewer for suggesting experiments to validate the proposed models. In the luciferase reporter experiments, constructs bearing the endogenous UTRs from non-buffered genes would be expected to result in expression that is proportional to transcript abundance. In contrast, swapping in a 5′ UTR from a buffered gene would be expected to dampen this proportionality through translational buffering via the “initiation rate model”, depending on the 5′ UTR sequence of the transcript. However, as outlined below, this experiment has important caveats:

      1. Role of coding sequence: Such assays primarily test the contribution of the 5′UTR and do not address potential cooperative effects between the 5′UTR and the coding sequence (CDS). Thus, if a 5′UTR swap fails to recapitulate translational buffering, it would be unclear whether the buffering requires coordinated action of the 5′UTR and CDS or whether the gene in question simply does not conform to the initiation-rate model.
      2. Sensitivity of measurements: Reporter-based measurements often rely on RT-qPCR to quantify expression changes. While suitable for large fold-changes, small shifts may fall within the assay’s technical margin of error, limiting the interpretability of the results.
      3. Gene-to-gene variability: Buffered and non-buffered transcripts likely span a wide range of intrinsic initiation rates. Selecting only a few “representative” transcripts for 5′UTR swapping could yield results that are not broadly generalizable.

      Similarly, knockdown of general initiation factors will likely affect both buffered and non-buffered genes, which could limit the ability to distinguish the effect of transcript abundance on translational buffering via either of the proposed models. We envision an alternative future approach that would involve single-molecule imaging of translating and non-translating mRNAs of buffered and non-buffered genes under varying abundance conditions in a physiological context. Such experiments are likely the most suitable for disentangling the contributions of accessibility versus initiation. While we find this an exciting direction for future work, it lies beyond the scope of the present manuscript.

      The conclusion that buffering reduces protein variability relies on mass-spec comparisons, but ribosome occupancy does not always reflect functional protein output (due to elongation stalling or co-translational degradation). Incorporating orthogonal measures, such as pulse-labeling or western blots for key buffered versus non-buffered genes, would strengthen the link between buffering and proteome stability

      We agree with the reviewer’s concern and have acknowledged this as a limitation in the discussion section. To address it with orthogonal approaches, we carried out several additional experiments. Specifically, using the RNA-seq data of a study from RiboBase (GSE132703), we found significant variation in the transcript abundance of FUS (a translationally buffered gene) across conditions, namely HEK293T wild type, LARP1A single knockout (SKO), and LARP1A/B double knockout (DKO). We reached out to the authors of the study and obtained these knockout cell lines. We reanalyzed RNA abundance under the different conditions by RT-qPCR and assessed protein levels by Western blot. Despite the differences in RNA abundance, FUS protein levels did not change correspondingly.

      We also selected a non-buffered gene, DNAJC6, that also showed RNA-level differences. However, the change in RNA expression was not consistently reflected at the protein level. Caveats of Western blotting include its limited sensitivity, which may prevent detection of subtle changes, and the fact that it measures steady-state protein levels, which cannot resolve whether differences arise from altered synthesis or degradation.

      Legend: Validation of a buffered gene by Western blot. A. RNA abundance and ribosome occupancy of the buffered gene FUS and the non-buffered gene DNAJC6 across HEK293T wild type, LARP1A single knockout and LARP1A/B double knockout. B. Validation of the RNA-seq data by qPCR. C. Western blot showing FUS, DNAJC6 and Actin in wild type and the different mutants. D. Bar plot showing the quantification of the Western blot.

      In addition to this targeted analysis, we performed quantitative mass spectrometry to evaluate the effect of mRNA variation at the protein level at a global scale.

      LC-MS/MS analysis was performed on the above samples in triplicate at the Proteomics facility of the University of Texas. A total of 4,048 proteins were identified using a peptide confidence threshold of 95% and a protein confidence threshold of 99%, with a minimum of two peptides required for identification. Total precursor intensities for all peptides of a protein were summed and used for protein quantification with the DEP (Differential Enrichment of Proteomics Analysis) package in Bioconductor, R (https://rdrr.io/bioc/DEP/man/DEP.html). DEP was used for variance normalization and statistical testing of differentially expressed proteins. As expected, LARP1 protein was identified in the control cells but not in the single or double knockouts.

      We then plotted the fold change in RNA, as determined by edgeR analysis of RNA-seq from Philippe et al. (2020), against the fold change in protein abundance from our mass spectrometry data. Linear regression analysis showed that genes in the TB high group exhibit smaller changes at the protein level than TB low or other genes in both single and double LARP1 KO mutants. This result is consistent with our finding that buffered genes show lower variation in protein abundance in response to changes in mRNA expression.

      Legend: Scatter plot showing the log2 fold change in RNA and protein levels, as determined by RNA-seq from Philippe et al. (2020) or by mass spectrometry. Differential analysis of RNA was done using the edgeR package, and the DEP (Differential Enrichment of Proteomics Analysis) package was used for mass spectrometry analysis. Only genes with an FDR

      We have not included these data in the manuscript given the deviation of the approach from our original analysis, but we are happy to reconsider their inclusion to supplement our proteomic analysis.
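The per-group regression described above (protein fold change against RNA fold change) can be sketched as below. This is a minimal ordinary least squares slope in plain Python, assuming one record per gene with a group label and two log2 fold changes. The record layout, the function names and the numbers are made up for illustration and do not come from the study.

```python
def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def slopes_by_group(records):
    """Fit one slope per gene group.

    Each record is (group, rna_log2fc, protein_log2fc). A slope near 1 means
    protein changes track RNA changes; a slope near 0 means they are damped.
    """
    grouped = {}
    for group, rna_lfc, protein_lfc in records:
        xs, ys = grouped.setdefault(group, ([], []))
        xs.append(rna_lfc)
        ys.append(protein_lfc)
    return {group: ols_slope(xs, ys) for group, (xs, ys) in grouped.items()}


# Made-up fold changes for illustration only.
records = [
    ("TB high", -1.0, -0.25), ("TB high", 0.0, 0.0), ("TB high", 1.0, 0.25),
    ("TB low", -1.0, -1.0), ("TB low", 0.0, 0.0), ("TB low", 1.0, 1.0),
]
slopes = slopes_by_group(records)
```

Under this reading, a shallower slope for the TB high group than for the TB low group is what the buffering interpretation predicts.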

      While the LGBM modeling shows modest predictive power of sequence features alone, the manuscript stops short of exploring what cellular factors might drive context dependence. Integrating public datasets on RNA-binding protein expression or mTOR pathway activity across samples could illuminate trans-acting determinants of buffering and move beyond correlative sequence analyses.

      We thank the reviewer for this suggestion. To investigate potential trans-acting determinants of buffering, we focused on 1,394 human RBPs as classified by Hentze et al. (2018), reasoning that some of these factors may facilitate translational buffering. Specifically, we examined correlations between the RNA expression of each RBP and the TE of all other genes across samples. P values were corrected using the Bonferroni procedure. For each RBP, we then performed a Fisher’s exact test to assess whether the number of significant correlations was enriched among buffered versus non-buffered genes.

      This analysis revealed that the expression levels of many RBPs are significantly enriched for either positive or negative correlations with the TE of buffered genes. In particular, we note that RNA expression of many buffered RBPs is enriched for negative correlations with the TE of other buffered transcripts. These results suggest that, rather than considering translational buffering in isolation for each transcript, buffering effects may be coordinated at the translational level and influenced by shared trans-acting factors such as RBPs. Network-based approaches have been valuable for RNA co-expression and are only now being applied to TE covariation. However, the correlative nature of these analyses limits causal inference. For example, although many ribosomal proteins appear to influence the buffering of other ribosomal proteins, they themselves may be regulated by a non-ribosomal RBP—so the apparent effects could reflect upstream regulatory influences. This analysis is now included as a supplementary figure (Sup. Fig. 5) of the revised manuscript.

      Legend: Scatter plot of the log odds ratio for the number of significant correlations (RNA abundance of RBPs vs. TE of genes) against the P value from Fisher's exact test. The vertical dashed line represents the threshold odds ratio, above which RBPs exhibit a higher number of significant correlations with buffered genes. P values were corrected using the Bonferroni procedure, and the horizontal dashed line represents the adjusted P value cutoff.
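The per-RBP enrichment test can be illustrated with a small standard-library sketch. It assumes a 2×2 table for each RBP (buffered or non-buffered genes, significant or non-significant correlation with TE) and a one-sided Fisher's exact test for enrichment among buffered genes. Whether the authors used a one-sided or two-sided test is not stated, so the sidedness, the function names and the toy counts are assumptions.

```python
from math import comb


def fisher_enrichment(sig_buffered, n_buffered, sig_other, n_other):
    """One-sided Fisher's exact test for enrichment among buffered genes.

    Returns (odds_ratio, p_value), where p_value is the hypergeometric
    probability of seeing at least `sig_buffered` significant correlations
    among the buffered genes given the table margins.
    """
    a, b = sig_buffered, n_buffered - sig_buffered
    c, d = sig_other, n_other - sig_other
    total_sig = a + c
    denominator = comb(n_buffered + n_other, total_sig)
    upper = min(n_buffered, total_sig)
    tail = sum(
        comb(n_buffered, x) * comb(n_other, total_sig - x)
        for x in range(a, upper + 1)
    )
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    return odds_ratio, tail / denominator


def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy counts for illustration only: 3 of 4 buffered genes and 1 of 4
# non-buffered genes show a significant correlation with one RBP.
odds, p_value = fisher_enrichment(3, 4, 1, 4)
adjusted = bonferroni([0.01, 0.5])
```

Exact integer binomial coefficients avoid floating-point underflow for large gene sets; the division to a float happens only once at the end.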

      Reviewer #2 (Significance (Required)):

      Overall, this manuscript leverages an unprecedented compendium of matched ribosome profiling and RNAseq datasets across human cell lines and mouse tissues, combined with improved TE estimation, to robustly catalog genes exhibiting translational buffering, a clear methodological and conceptual strength. The main limitations stem from heterogeneous sample sources, largely correlative analyses, and a lack of targeted mechanistic validation. Compared to prior yeast focused studies, it fills a key gap by demonstrating conservation of buffering in mammals and linking it to dosage sensitivity and protein stability, representing a conceptual advance in understanding post-transcriptional homeostasis and a methodological step forward in TE analysis. This work will interest researchers in RNA biology, gene expression regulation, systems biology, and cancer proteomics, as well as those studying dosage-sensitive pathways and translational control. My expertise is on translational control in cancer.

      We thank the reviewer for noting the broader significance of the work and for their constructive feedback.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Rao and colleagues present a comprehensive analysis of translational buffering in human and mouse by mining 1515 matched ribosome profiling and RNAseq datasets from diverse tissues and cell lines. They define translational buffering as genes whose TE is negatively correlated with mRNA abundance across conditions, and further identify candidates by comparing median absolute deviations of ribosome occupancy versus mRNA levels. The authors find a conserved set of buffered genes enriched for components of multiprotein complexes, demonstrate that buffered genes exhibit lower protein variability and greater dosage sensitivity, and propose two non-mutually exclusive mechanistic models (differential accessibility and initiation rate modulation). Finally, they perform complementary fractionation experiments in HEK293T cells to support these models.

      These findings propose a novel, conserved mechanism of translational buffering that tunes gene expression in mouse and human, showing how intrinsic sequence features and cellular context cooperate to stabilize protein output across diverse conditions. However, further evidence is required to fully support the authors' conclusions, particularly direct validation of the proposed models of buffering. Below are my main concerns:

      1. The choice of the top 250 genes by Spearman correlation and MAD ratio as "TB high" seems arbitrary. The authors should justify these cutoffs (via permutation analysis or FDR control) and show that conclusions are robust to different thresholds
      2. The modified compositional regression approach for TE and imputation of missing values are central to the study, but details are relegated to supplemental methods. The manuscript would benefit from a clear comparison of this method against standard log-ratio TE estimates, including sensitivity analyses to missing-data imputation strategies
      3. Human data are derived mainly from immortalized cell lines, whereas mouse data are from primary tissues. Pooling these heterogeneous sources may conflate cell type-specific regulation with intrinsic buffering. The authors should either stratify analyses by context or demonstrate buffering signatures remain consistent within more homogeneous subsets
      4. The HEK293T fractionation experiments offer preliminary support for both the "accessibility" and "initiation" models, but only slope analyses are shown. To validate these models, the authors should perform targeted reporter assays (dual luciferase constructs with 5′UTR swaps) or manipulations of initiation factors (eIF4E knockdown) to directly test how transcript abundance alters initiation rates versus pool entry
      5. The conclusion that buffering reduces protein variability relies on mass-spec comparisons, but ribosome occupancy does not always reflect functional protein output (due to elongation stalling or co-translational degradation). Incorporating orthogonal measures, such as pulse-labeling or western blots for key buffered versus non-buffered genes, would strengthen the link between buffering and proteome stability
      6. While the LGBM modeling shows modest predictive power of sequence features alone, the manuscript stops short of exploring what cellular factors might drive context dependence. Integrating public datasets on RNA-binding protein expression or mTOR pathway activity across samples could illuminate trans-acting determinants of buffering and move beyond correlative sequence analyses

      Significance

      Overall, this manuscript leverages an unprecedented compendium of matched ribosome profiling and RNAseq datasets across human cell lines and mouse tissues, combined with improved TE estimation, to robustly catalog genes exhibiting translational buffering, a clear methodological and conceptual strength. The main limitations stem from heterogeneous sample sources, largely correlative analyses, and a lack of targeted mechanistic validation. Compared to prior yeast focused studies, it fills a key gap by demonstrating conservation of buffering in mammals and linking it to dosage sensitivity and protein stability, representing a conceptual advance in understanding post-transcriptional homeostasis and a methodological step forward in TE analysis. This work will interest researchers in RNA biology, gene expression regulation, systems biology, and cancer proteomics, as well as those studying dosage-sensitive pathways and translational control. My expertise is on translational control in cancer.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. Although the authors' report provides a step forward in our understanding of translational buffering, this reviewer found a series of concerns in this paper. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      Major comments:

      1. This paper heavily relies on the reference 18. However, this paper was not properly stated (no page or journal number); the study in Bioinformatics is nowhere to be found on the website, despite being out in 2024 apparently. Either title is wrong (yet a biorxiv can be found). This reviewer guessed that the reference 18 may be accepted. However, without a proper reference, this paper could not be judged since nearly all the parts of this work have been based on the reference 18. Also, the Ribobase data used in this manuscript comes from this reference, so it had better be well defined, especially when another Ribobase data set seems to be available online: http://www.bioinf.uni-freiburg.de/~ribobase/index.html
      2. In the Discussion, the authors mentioned "TE is based on a compositional regression model (18) rather than the commonly applied approach of using a logarithmic ratio of ribosome occupancy to mRNA abundance." This important information should be mentioned early section of the manuscript. Related to this, there are other published methods for exploring change in translation efficiency (e.g., 10.1093/bioinformatics/btw585; 10.1093/nar/gkz223) that could also be suitable in this context. It is not entirely clear if their approach is better than before. Again, the improper reference to 18 made our assessment of this work difficult.
      3. The paper mainly relies on detecting a set of buffered genes using mRNA-TE correlation and MAD ratios (Ribo-Seq/RNA-Seq). While the concept seems sound, the authors should ensure that this method is reliable. Several controls could be used to confirm this. First, if any studies in humans or mice have described a set of genes as buffered, it would be worth checking for overlap between the authors' set of 'TB high' genes and the previously established list. Furthermore, the authors could use packages explicitly developed for translational buffering detection, such as anota2seq (https://academic.oup.com/nar/article/47/12/e70/5423604?login=true). Not all of the data used by the authors may be suitable for such packages, but the authors could at least partially use them on some of their datasets and see whether the buffered genes reported by these packages match their predictions.
      4. The threshold of 'TB high' or 'TB low' (top and bottom 250) is somewhat arbitrary. Why not top 100 or 500? The authors should provide a rationale for this choice. Also, they could include a numeric measure of buffering (the sum of the two rankings is probably suitable for this purpose). Several of the authors' explorations are suitable for numerical quantification (GO enrichment can be turned into GSEA, and the boxplot can be shown as correlations)
      5. Several of the statements of the authors in the Introduction or Discussion sections are not entirely true regarding the literature on the topics, or lack major papers on the topic, and therefore, they are a bit misleading. Among others, here are some:

      5-1 "In addition, genetic differences arising from aneuploidy, cell type differences or variability observed in the natural population can further determine the amplitude of variation (4-7). The effect of mRNA variation under these conditions is mostly reflected at the protein levels (2, 4-8).". Several recent or more ancient papers suggest that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level:

      DOI: 10.1038/s41586-024-07442-9 DOI: 10.1073/pnas.2319211121 DOI: 10.1016/j.cels.2017.08.013 DOI: 10.15252/msb.20177548

      5-2: The authors should also consider mentioning these studies and softening their initial statement. "Similarly, translational buffering of certain genes have been reported in mammalian cells, specifically under estrogen receptor alpha (ERα) depletion conditions (16).". Translational buffering has been deeply explored in mammalian tissues and even across several mammalian species in this study (DOI: 10.1038/s41586-020-2899-z). In this, the authors also provide a nice exploration of the gene characteristics that are associated with translational buffering. The authors should mention it and compare the study's findings to theirs ultimately.

      5-3: "Differences in species evaluated and statistical methods have resulted in conflicting interpretations (13, 28).". These conflicting results have been previously discussed in reviews on the topic that would be worth mentioning: DOI: 10.1016/j.cell.2016.03.014 DOI: 10.1038/s41576-020-0258-4

      6. In addition to the p-values stated in the main text, the authors should annotate their plots when they find significant differences between groups to greatly facilitate the visual interpretation of the graphs.
      7. Based on the data of Figure 4D, apparently, ribosome occupancy was not buffered even in high TB sets. The authors may argue that translational buffering may not cope with such a strong mRNA reduction. In that case, how big a difference in mRNA level does the buffering system adjust in protein synthesis? The authors should test gradual gene knockdown and/or overexpression and conduct Ribo-Seq/RNA-Seq to survey the buffering range.
      8. "differential transcript accessibility model" could not be functional if mRNA is reduced beyond the accessible pool (i.e., less than the threshold, all the mRNAs are translated without buffering). The authors should carefully reconsider this model and the effective range of mRNAs.

      Minor comments:

      1. Some figures are of poor quality as they seem to have points outside of the panel representations... Like Figure 3C, one point is out of the square, same for Figure 4E. Similarly, on figure 5F, some outliers seem to be clearly cut from the figure (maybe not, but then the author should put a larger space between the end of the figure and the max y points). Same for panel S2D and S6D, this does not sound so rigorous.
      2. There are several typos or weird sentences. Here are some (but maybe not all):

      2-1: [...]with lower sums corresponding to higher final ranks. "two rankings". Based on these final ranks[...]

      2-2: For each dataset, median absolute deviation (MAD) "i" protein abundance was calculated across samples

      2-3: [...]neighbor method implemented in the MatchIT package (38) Differences in protein[...] a point is missing here.

      2-4: Additionally a second dataset providing predictions of haploinsufficiency (pHaplo score) and triplosensitivity (pTriplo score) for all autosomal genes (25) was used to asses the distribution of these score"S" across buffered and non-buffered gene sets . There is a missing "s" at "score" and there is a space between the last word and the final point.

      3. In the "Lymphoblastoid cell line data analysis:" section, this reviewer wonders why the authors used a different method to calculate buffering compared to before.

      4. "Samples which had R2 less than 0.2 were removed as the residuals calculated for these samples could be unreliable". These samples for which the correspondence between RNA-Seq and Ribo-Seq is low wouldn't be the ones most impacted by translational buffering? Is it sure that the authors are not missing something here?

      5. For Figure 4B and 4C, the authors should provide statistical tests and p-values to confirm the observed trends.

      6. In Figure 2A, the "all genes" color doesn't correspond to the point color.

      7. "To understand if codon usage patterns are[...]". This comes slightly out of the blue. The authors could maybe explain why codon usage should be explored for translational buffering. The authors should cite recent key works in the fields: DOI: 10.1016/j.celrep.2023.113413 DOI: 10.1101/2023.11.27.568910

      8. "The change in each metric was calculated by subtracting the mean value in the control samples from that in the knockdown samples. This yielded the differential mRNA abundance and ribosome occupancy resulting from gene knockdown.". This looks statistically weak. The authors should consider using more robust methods like DESeq.

      9. "Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set, indicating that candidates in the buffered gene set are relatively well expressed due to the presence of a higher proportion of the codons observed in highly expressed genes". What do the authors mean by "relatively well expressed"? Abundantly expressed? This sentence and the causality under it is unclear and should be modified or better explained.

      10. The panel 4D is unclear. Is one point associated with one gene? Or is it the average of several genes? If it's one point for one gene, it is important to clearly state it because the number of cases is therefore quite low, especially for the TB high and low.

      11. In Figure 2J, GGU (Gly), AAG (Lys), and ACU (Arg) provide negative effects on prediction, although these were enriched in the high TB set (Figure 2E). This contradiction should be explained.

      12. The subtitle of "Translationally buffered genes exhibit variable association kinetics with the translational machinery in response to mRNA variation" sounds unfair to this reviewer. Since the authors did not work on kinetics directly, the use of this word is misleading.

      13. The explanation of Figure 5A "We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the "differential transcript accessibility model", mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the "initiation rate model", the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, the proportion of mRNA entering the translational pool would be comparable across buffered and non-buffered genes (Fig 5A)." is hard to understand. The authors should rewrite for a better understanding of the readers.

      Significance

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. A group of mRNAs whose expression variance is buffered at the translation level was comprehensively surveyed in humans and mice. The authors found a series of features in the translationally buffered genes, including high GC contents in the 5′ UTR, optimal codon usage, and mRNA length. The depletion or increase of one allele of the genes in the group may be particularly detrimental to cells. The authors' report provides a step forward in our understanding of translational buffering, appealing to the broad scientific community in basic and applied biology. However, this reviewer found a series of concerns in this paper, including clarity in the methods, experimental validation, and referencing of earlier works. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      SECTION A - Evidence, Reproducibility, and Clarity Summary The study investigates the neurodevelopmental impact of trisomy 21 on human cortical excitatory neurons derived from induced pluripotent stem cells (hiPSCs). Key findings include a modest reduction in spontaneous firing, a marked deficit in synchronized bursting, decreased neuronal connectivity, and altered ion channel expression-particularly a downregulation of voltage‐gated potassium channels and HCN1. These conclusions are supported by a combination of in vitro calcium imaging, electrophysiological recordings, viral monosynaptic tracing, RNA sequencing, and in vivo transplantation with two‐photon imaging.

      Major Comments • Convincing Nature of Key Conclusions: The study's conclusions are generally well supported by a diverse set of experimental approaches. However, certain claims regarding the intrinsic properties of the excitatory network would benefit from further qualification. In particular, the assertion that reduced synchronization is solely attributable to altered ion channel expression might be considered somewhat preliminary without additional corroborative experiments.

      1.1) We agree with the reviewer and now write in the abstract: 'Together, these findings demonstrate long-lasting impairments in human cortical excitatory neuron network function associated with Trisomy 21.' And in the Introduction: 'Collectively, the observed changes in ion channel expression, neuronal connectivity, and network activity synchronization may contribute to functional differences relevant to the cognitive and intellectual features associated with Down syndrome.'

      • One major limitation of the current experimental design is the reliance on predominantly excitatory neuronal cultures derived from hiPSCs. Although the authors convincingly demonstrate differences in network synchronization and connectivity between trisomic (TS21) and control neurons, the almost exclusive focus on excitatory cells limits the physiological relevance of the in vitro network. In the developing cortex, interneurons and astrocytes play crucial roles in modulating network excitability, synaptogenesis, and plasticity. Therefore, incorporating these cell types-either through co-culture systems or through directed differentiation protocols that yield a more heterogeneous neuronal population-could help to determine whether the observed deficits are intrinsic to excitatory neurons or are compounded by a lack of proper inhibitory regulation and glial support.

      1.2) Thank you for this thoughtful comment. We agree that interneurons and astrocytes are crucial for network function. To clarify, astrocytes are generated in this culture system, as we previously reported in our characterisation of the timecourse of network development using this approach (Kirwan et al., Development 2025). However, our primary goal was to first isolate and define the cell-autonomous defects intrinsic to TS21 excitatory neurons, minimizing the complexity introduced by additional neuronal types. This focused approach was chosen also because engineering a stable co-culture system with reproducible excitatory/inhibitory (E/I) proportions is a significant undertaking that extends beyond the scope of this initial investigation, and has proven challenging to date for the field. By establishing this foundational phenotype, our work complements prior studies on interneuron and glial contributions. Future studies building on this work will be essential to dissect the more complex, non-cell-autonomous effects within a heterogeneous network.

      Importantly, since our initial submission, two highly relevant preprints have emerged-including a notable study from the Geschwind laboratory at UCLA (Vuong et al., bioRxiv, 2025; Risgaard et al., bioRxiv, 2025), as well as our own complementary study Lattke et al, under revision, that highlight widespread transcriptional changes in excitatory cells of the human fetal DS cortex, providing strong validation for our central findings. This convergence of results from multiple groups underscores the timeliness and importance of our work.

      • Furthermore, the assessment of neuronal connectivity via pseudotyped rabies virus tracing, while innovative, has inherent limitations. The quantification of connectivity as a ratio of red-to-green fluorescence pixels may be influenced by differential viral infection efficiencies, variations in the expression levels of the TVA receptor, or even by the lower basal activity levels observed in TS21 cultures. Complementary approaches-such as electron microscopy for synaptic density analysis or functional connectivity measurements using multi-electrode arrays (MEAs)-could provide additional structural and functional insights that would validate the rabies tracing data.

      1.3) Thank you for this constructive feedback. While we cannot formally exclude that TS21 cells might express the TVA receptor at lower levels due to generalized gene dysregulation, we infected all WT and TS21 cultures in parallel using identical virus preparations and titers to minimize technical variability. Crucially, we also addressed the potential confound of differential basal activity by performing the rabies tracing under TTX incubation (see Suppl. Fig. 7), which blocks network activity and ensures that viral spread reflects structural connectivity alone.

      While complementary methods like EM or MEA could provide additional insight, they fall outside the scope of the current study. We are confident that our rigorous controls validate our use of the rabies tracing method to assess structural connectivity.

      • Qualification of Claims: Some conclusions, particularly those linking specific ion channel dysregulation (e.g., HCN1 loss) directly to network deficits, might be better presented as preliminary. The authors could temper their language to indicate that while the evidence is suggestive, the mechanistic link remains to be fully established.

      1.4) We have revised the text to more clearly indicate that the link between HCN1 dysregulation and network deficits is correlative and remains to be fully established. While our ex vivo recordings suggest altered Ih-like currents consistent with reduced HCN1 expression, we now present these findings as preliminary and hypothesis-generating, pending further functional validation. We write in the discussion: 'However, further targeted functional validation will be needed to confirm a causal link.'

      • Need for Additional Experiments: Additional experiments that could further consolidate the current findings include:

      o Inclusion of Inhibitory Neurons or Co-culture Systems: Incorporating interneurons or astrocytes would help determine whether the observed deficits are solely intrinsic to excitatory neurons.

      See 1.2

      o Alternative Connectivity Assessments: Complementing the rabies virus tracing with electron microscopy or multi-electrode array (MEA) recordings would add structural and functional validation of the connectivity differences.

      See 1.3

      o Extended Temporal Profiling: Monitoring network activity over a longer developmental window would clarify whether the observed deficits represent a delay or a permanent alteration in network maturation.

      1.5) In vivo we were able to track the cells for up to five months post-transplantation supporting the interpretation of a permanent alteration.

      • Reproducibility and Statistical Rigor: The methods and data presentation are largely clear, with adequate replication and appropriate statistical analyses. Nonetheless, a more detailed description of the experimental replicates, particularly regarding the viral tracing and in vivo transplantation studies, would enhance reproducibility. The availability of raw data and scripts for calcium imaging analysis would also further support independent verification.

      We thank the reviewer for these suggestions and we now provide a more detailed description of replicates. We also add the raw data.

      Minor Comments • Experimental Details: Minor revisions could include clarifying the infection efficiency and expression levels of the viral constructs used in connectivity assays to rule out technical variability.

      See 1.3

      • Literature Context: The authors reference prior studies appropriately; however, integrating a brief discussion comparing their findings with alternative DS models (e.g., organoids or other hiPSC-derived systems) would improve contextual clarity.

      We thank the reviewer for this helpful suggestion. We have now added a brief discussion comparing our findings with those reported in alternative Down syndrome models, including brain organoids and other hiPSC-derived systems. This addition helps to contextualize our results within the broader field and highlights the unique strengths and limitations of our in vitro and in vivo xenograft approach. We write: 'Our findings align with and extend previous studies using alternative Down syndrome models, such as brain organoids and other hiPSC-derived systems. Organoid models have provided valuable insights into early neurodevelopmental phenotypes in DS, including altered interneuron proportions (Xu et al Cell Stem Cell 2019) but also suggest that variability across isogenic lines can overshadow subtle trisomy 21 neurodevelopmental phenotypes (Czerminski et al Front in Neurosci 2023). However, these systems often lack the structural complexity, vascularization, and long-term maturation achievable in vivo. By using a xenotransplantation model, we were able to assess the maturation and functional properties of human neurons within a physiologically relevant environment over extended time frames, offering complementary insights into DS-associated circuit dysfunction (Huo et al Stem Cell Reports 2018; Real et al., 2018).'

      • Presentation and Clarity: Figures are generally clear, but the manuscript contains a minor labeling error. On page 13, the figure is erroneously labeled as "Fig6A", whereas, based on the context and corresponding data, it should be "Fig5A". I recommend that the authors correct this mistake to ensure consistency and avoid potential confusion for readers.

      Thank you for pointing this out. This has been corrected in the revised manuscript.

      Reviewer #1 (Significance (Required)):

      SECTION B - Significance • Nature and Significance of the Advance: The work offers a substantial conceptual advance by providing a mechanistic link between trisomy 21 and impaired neuronal network synchronization. Technically, the study integrates state-of-the-art imaging, electrophysiology, and transcriptomic profiling, thereby offering a multifaceted view of DS-related neural dysfunction. Clinically, the findings have the potential to inform future therapeutic strategies targeting network connectivity and ion channel function in Down syndrome.

      We thank the reviewer for this very supportive comment.

      • Context in the Existing Literature: The study builds on previous observations of altered network activity in DS patients and DS mouse models (e.g., altered EEG synchronization and reduced synaptic connectivity). It extends these findings to human-derived neuronal models, thus bridging a gap between clinical observations and molecular/cellular mechanisms. Relevant literature includes studies on DS neurodevelopment and the role of ion channels in synaptic maturation.

      • Target Audience: The reported findings will be of interest to researchers in neurodevelopmental disorders, Down syndrome, and ion channel physiology. Additionally, the study may attract the attention of those working on hiPSC-derived models of neurological diseases, as well as clinicians interested in the pathophysiology of DS.

      • Keywords and Field Contextualization: Keywords: Down syndrome, trisomy 21, neuronal connectivity, synchronized network activity, hiPSC-derived cortical neurons, ion channel dysregulation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary The manuscript by Peter et al. reports on the neuronal activity and connectivity of iPSC-derived human cortical neurons from Down syndrome (DS), which is caused by trisomy of human chromosome 21 (TS21).

      Major points: Although the manuscript is potentially interesting, the results appear somewhat preliminary and need to be corroborated by control experiments and quantifications of effects to fully sustain the conclusions.

      (1) The authors have not assessed the percentage of WT and TS21 cells that acquire a neuronal or glial identity in their cultures. Indeed, the alterations in network activity and connectivity observed in TS21 neurons could simply derive from a reduced number of neurons arising from TS21 iPSCs. Alternatively, the same alterations in network activity and connectivity could derive from a multitude of other factors, including deficits in neuronal development, neurite extension, or intrinsic electrophysiological properties. In the current version of the manuscript, none of these has been investigated.

      2.1) We thank the reviewer for this thoughtful comment. In response, we included an in vivo characterization of cell-type proportions at the same time points where we observed network activity defects using in vivo calcium imaging (see Supplementary Fig. 6).

      Previous work has identified several cellular and molecular phenotypes in human cells, postmortem tissue, and mouse models-including those mentioned by the reviewer. In this study, our focus was on investigating neural network activity, intrinsic electrophysiological properties both in vitro and in vivo, and preliminary bulk RNA sequencing. We have also independently measured cell proportions in the human fetal cortex and conducted a more extensive transcriptomic analysis of Ts21 versus control cells in a separate study (Lattke et al., under revision). We observed a reduction of RORB/FOXP1-expressing Layer 4 neurons in the human fetal cortex at midgestation, as well as increased GFAP+ cells, reduced progenitors and a non-significant reduction of Cux2+ cells in late-stage DS human cell transplants, along with a gene network dysregulation specifically affecting excitatory neurons (Lattke et al., under revision). Here, we provide complementary findings, demonstrating reduced excitatory neuron network connectivity in vitro and decreased neural network synchronised activity in both in vitro and in vivo models (see also 2.8). We agree with the reviewer that this could be for a number of reasons, either cell autonomous (channel expression and/or function) or non-autonomous (connectivity and/or network composition - as reflected in differences in proportions of SATB2+ neurons generated in TS21 cortical differentiations).

      (2) Electrophysiological properties of TS21 and WT neurons at day 53/54 in vitro indicate an extremely immature stage of development (i.e. RMP between -36 and -27 mV with most of the cells firing a single action potential after current injection) in the utilized culture conditions: This is far from ideal for in vitro neuronal-network studies. Finally, reduced activity of HCN1 channels should be confirmed by specific recordings isolating or blocking the related current.

      2.2) Thank you for this thoughtful comment. We have also conducted ex vivo electrophysiological recordings and found that the neurons exhibit relatively immature properties, consistent with the known slow developmental trajectory of human neuron cultures. In light of this and the absence of direct confirmatory evidence, we now refer to the observed reduction in HCN1 as preliminary.

      Main points highlighting the preliminary character of the study.

      1) In Figure 1 immunofluorescence images of the neuronal differentiation markers (Tbr1, Ctip2 and Tuj1) are shown. However, no quantification of the percentage of cells expressing these markers for WT and TS21 neurons is reported. On the other hand, simple inspection of the representative images clearly seems to indicate a difference between the two genotypes, with TS21 cultures showing a lower number of cells expressing neuronal markers. This quantification should be corroborated by a similar staining for an astrocyte marker (GFAP, but not S100b since it is triplicated in DS). This is an extremely important point since it is obvious that any change in the percentage of neurons (or the neuron/astrocyte ratio) in the cultures will strongly affect the resulting network activity (shown in Figure 2) and the connectivity (shown in Figure 4). If possible, the quantification should be done at the same time points as the calcium imaging experiments.

      2.3) See 2.1. We included an in vivo characterization of cell-type proportions at the same time points where we observed network activity defects using in vivo calcium imaging. (see Supplementary Fig. 6).

      2) In Figure 2 the authors show some calcium imaging traces of WT and TS21 cultures at different time points. However, they again do not show any quantification of neuronal activity. A power spectra analysis is shown in Supplementary Figure 2, but only for WT cultures, while in Supplementary Figure 3 a comparison between WT and Ts21 power spectra is done, but only at the 50 day time point, while differences in synchrony are assessed at 60 days. At minimum, the authors should include in main Figure 2 the quantification of the mean calcium event rate and mean event amplitude at the different time points and the power spectra analysis for both WT and TS21 cultures at the same timepoints.

      2.4) We thank the reviewer for this comment. We now add the power spectra analysis in the main Figure 2 and quantification of the mean calcium burst rate and mean event amplitude in SuppFig. 4.

      Of note, the synchronized neuronal activity is present in WT cultures at day 60, but totally lost at subsequent time-points (70 and 80 days). The results at these later time points are different from previous data from the same lab (Kirwan et al., 2015). How might these data be explained? It would be important to rule out any potential issues with the health of the culture that could explain the loss of neuronal activity. It would be beneficial to check cell viability at the different time points to exclude possible confounding factors. A propidium staining or an MTT assay would strongly improve the soundness of the calcium data.

      2.5) We thank the reviewer for this important observation. The difference from the findings reported in Kirwan et al., 2015 is due to the use of a different neuronal differentiation medium in the current study (BrainPhys versus N2B27). BrainPhys medium supports robust early network activity compared to N2B27 (onset before day 60 in BrainPhys, post-day 60 in N2B27), resulting in an earlier decline in synchrony at later stages (day 70-80 in BrainPhys, compared with day 90-100 in N2B27). Importantly, in our in vivo xenograft model, burst activity is sustained up to at least 5 months post-transplantation (mpt), indicating that the neurons retain the capacity for network activity over extended periods in a more physiological environment. We adapted the text accordingly.

      3) In Figure 3 there is no quantification of the number and/or density of transplanted neurons for WT and TS21, but only representative images. As above, inspection of the representative images seems to show a decrease in cells labeled by the Tbr1 neuronal marker for TS21 cells. Moreover, the in vivo calcium imaging of transplanted WT and TS21 cells lacks most of the quantification normally done in calcium imaging experiments. Are the event rate and event amplitude different between WT and TS21 neurons? The measure of neuronal synchrony by mean pixel correlation is not well explained, but it looks somewhat simplistic. Neuronal synchrony can be more precisely measured by cross-correlation analysis or spike time tiling coefficients on the traces from single-neuron ROIs rather than on all pixels in the field of view, as apparently was done here.

      2.6) We thank the reviewer for these valuable points. We now include quantification of the number and density of transplanted neurons for both WT and Ts21 grafts in Extended Data Figure 5 (see 2.1).

      Regarding the in vivo calcium imaging, we appreciate the reviewer's suggestion to include additional standard metrics. We have quantified the event rate in Real et al 2018. These analyses reveal that Ts21 neurons show a reduction in event rate.

      We agree that our initial description of the synchrony analysis using mean pixel correlation was not sufficiently detailed. We have now clarified this in the Methods and Results, and we acknowledge its limitations. Importantly, we note that the reduced synchronisation is a highly consistent phenotype, observed across at least six independent donor pairs, different differentiation protocols, and both in vitro (and in two independent labs) and in vivo settings. As suggested, future studies using ROI-based approaches-such as cross-correlation or spike-time tiling coefficients-would provide a more refined characterization of synchrony at the single-neuron level (Sintes et al, in preparation). We now include this point in the discussion.
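      For readers unfamiliar with the ROI-based alternative mentioned above, the following is a minimal illustrative sketch, not the authors' analysis. It assumes each neuron has already been segmented into an ROI with one fluorescence trace, and it summarises synchrony as the mean zero-lag Pearson correlation over all ROI pairs. The function names `pearson` and `mean_pairwise_correlation` and the toy traces are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Zero-lag Pearson correlation between two equal-length traces."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("traces must have the same length (at least 2 frames)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat trace has no defined correlation; fail loudly instead of guessing.
        raise ValueError("correlation is undefined for a constant trace")
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(traces):
    """Average the Pearson correlation over every distinct pair of ROI traces."""
    pairs = list(combinations(traces, 2))
    if not pairs:
        raise ValueError("need at least two ROI traces")
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Hypothetical toy traces: three ROIs that burst on the same frames
# (differing only in scale and baseline), and two ROIs active on alternating frames.
burst = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
synchronous = [burst, [2.0 * v for v in burst], [v + 0.5 for v in burst]]
alternating = [[0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0]]

sync_index = mean_pairwise_correlation(synchronous)
anti_index = mean_pairwise_correlation(alternating)
```

      In this toy case the index is close to 1 for the co-bursting ROIs and close to -1 for the alternating pair. A lagged cross-correlation or the spike time tiling coefficient would need additional steps (a lag window, or event detection and a coincidence window) that are not shown here.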

      4) The results on reduced neuronal connectivity in Figure 3 look very striking. However, these results should be accompanied by control experiments to verify the number of neuronal cells and neurite extension in WT and Ts21 cultures. These two parameters could indeed strongly influence the results. As the cultures appear to grow in clusters, bright-field images and TuJ1 staining of the cultures will also greatly help to understand the degree of morphological interconnection between the clusters.

      We now add Tuj1 staining in Supplementary figure 10.

      5) The authors performed RNA-seq experiments on day 50 cultures. Why do the authors not show the complete differential gene expression analysis, but only a small subset of genes? A comprehensive volcano plot and the complete list of identified genes with logFC and FDR values would be helpful. If possible, comparison of the present data (particularly on KCN and HCN expression changes) with published and publicly available expression datasets of other human or human Down syndrome iPSC-derived neurons or human Down syndrome brains will greatly increase the soundness of the present findings. In addition, the gene ontology (GO) results are mentioned in the text, but are not presented. Showing the complete GO analysis for both up and downregulated genes will help the reader to better understand the RNA-seq results. Notably, the results shown in Supplementary Figure on GRIN2A and GRIN2B expression (with values of 300-700 counts versus 2000-4000 counts, respectively) clearly indicate that in both WT and TS21 cultures the NMDA developmental switch has not occurred yet at the 50 days timepoint.

      We now show volcano plots in Supplementary Fig. 11.

      6) The measure of hyperpolarization-activated currents shown in Figure 5 lacks proper control experiments. First, the hyperpolarizing current in TS21 cells does not reach a steady state as it does in the controls. The two curves are therefore hard to compare. To exclude possible differences in activation kinetics, the authors should have prolonged the current injection period (1-2 seconds). Second, to ultimately prove that such currents are mediated by HCN channels in WT cells the authors should perform some control experiments with a specific HCN blocker. A good example of a suitable protocol, with also current blockers to exclude all other possible current contributions, is the one reported in Matt et al Cell. Mol. Life Sci. 68, 125-137 (2011).

      2.7) We thank the reviewer for this detailed and helpful comment. We agree that to definitively identify the recorded currents as Ih, it would be necessary to isolate them pharmacologically using specific HCN channel blockers and appropriate controls, such as those described in Matt et al., Cell. Mol. Life Sci. Unfortunately, because we no longer have access to the animals used in this study and cannot allocate the necessary time or resources, we are unable to perform the additional experiments at this stage.

      However, our goal here was to use electrophysiological recordings as an indication of altered HCN channel activity, which we then support with molecular evidence. We now emphasize this point more clearly in the revised manuscript.

      7) The manuscript lacks information on the statistical analysis used. Also, the number of samples is not clear. Were the dots shown in some graphs technical replicates from a single neuronal induction, independent neuronal inductions, or a mix of the two? Please clarify.

      We now clarify the numbers in the Figure legend.

      8) The method section lacks important information to guarantee reproducibility. Just a few examples:

      • Only electrophysiology methods for slices are reported, but not for in vitro cultures.

      We now clarify these details in the methods.

      • Details on Laminin coating are lacking. What concentration was used? Was poly-ornithine or poly-lysine used before Laminin coating?

      We now clarify these details in the methods.

      • How long before calcium imaging were the cells switched to BrainPhys medium?

      We now clarify these details in the methods.

      Minor point/typos etc.

      Introduction
      • Page 4 line 6: in the line "Trisomy 21 in humans commonly results in a range in developmental and morphological changes in the forebrain ..." "in" could be replaced by "of". We have fixed this.
      • Page 5 line 2: please remove "an" before the word "another". We have fixed this.
      • Page 5 line 2: please replace "ecitatory" with "excitatory". We have fixed this typo.

      Results
      • Page 10 line 25: The concept of "pixel-wise" appears for the first time in this section and could be better introduced to facilitate the understanding of the experiment.
      • In the "results" section, page 11 line 1 and 4, references are made to "Figure 4D" and "4F," but these figures do not appear to be present in the figure section. Upon reviewing the rest of the section, the data seem to refer to "Figure 3D" and "3E." We have fixed this.

      Discussion
      • Page 15 line 20: please replace "synchronised" with "synchronized". We have fixed this typo.
      • Page 16 line 11: please replace "T21" with "TS21". We have fixed this typo.

      Methods
      • Page 19 line 12: "Pens/Strep" has to be replaced by Pen/Strep. We have fixed this typo.
      • Page 20 line 20: "Tocris Biocience" has to be replaced by "Tocris Bioscience". We have fixed this typo.
      • Page 21 line 2: "Addegene" has to be replaced by "Addgene". We have fixed this typo.

      Figures
      • Figure 3: the schematic experimental design (Fig. 3A) could be enlarged to match the width of the images/graphs below. We have fixed this.
      • Figure 5: the reviewer suggests resizing/repositioning the graphs in Fig. 1A so that they match the width of those below. We have fixed this.
      • Figure S1D: In all the figures of the paper, the respective controls for the TS21 1 and TS21 2 lines are labelled as "WT1/WT2," while in these graphs, they are called "Ctrl1" and "Ctrl2." To ensure consistency throughout the paper, it is suggested to change the names in these graphs. We have fixed this.
      • Figure S4L: The graph is not very clear, especially regarding the significance reported at -50 pA, please modify the graphical visualization and/or add a legend in the caption. We have fixed this.

      Reviewer #2 (Significance (Required)):

      Nature and significance of the advance for the field. The results presented in the manuscript are potentially interesting and useful, but not completely novel (currents deregulation has already been highlighted in mouse models of Down Syndrome).

      2.8) We thank the reviewer for this comment. While we agree that current deregulation has been observed in mouse models of Down syndrome, the novelty and significance of our study lie in demonstrating these alterations directly in human neurons using both in vitro and in vivo xenograft models.

      This is a critical advance because the human cortex has distinct developmental and functional properties not fully recapitulated in mice. In fact, three recent studies have already highlighted significant defects mainly in excitatory neurons within the fetal human DS cortex (Vuong et al., bioRxiv, 2025; Risgaard et al., bioRxiv, 2025; Lattke et al, under revision). Our work builds directly on these observations by providing, for the first time, an electrophysiological and network-level characterization of these human-specific deficits.

      Our findings thus provide translationally relevant insight that is not merely confirmatory but extends previous work by grounding it in a human cellular context.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript by Peter et al. reports on the neuronal activity and connectivity of iPSC-derived human cortical neurons from Down syndrome (DS), which is caused by trisomy of human chromosome 21 (TS21).

      Major points:

      Although the manuscript is potentially interesting, the results appear somehow preliminary and need to be corroborated by control experiments and quantifications of effects to fully sustain the conclusions.

      (1) The authors have not assessed the percentage of WT and TS21 cells that acquire a neuronal or glia identity in their cultures. Indeed, the origin of alterations in network activity and connectivity observed in TS21 neurons could simply derive from reduced number of neurons arising from TS21 iPSC. Alternatively, the same alteration in network activity and connectivity could derive from a multitude of other factors including deficits in neuronal development, neurite extension, or intrinsic electrophysiological properties. In the current version of the manuscript, none of these has been investigated.

      (2) Electrophysiological properties of TS21 and WT neurons at day 53/54 in vitro indicate an extremely immature stage of development (i.e. RMP between -36 and -27 mV with most of the cells firing a single action potential after current injection) in the utilized culture conditions: This is far from ideal for in vitro neuronal-network studies. Finally, reduced activity of HCN1 channels should be confirmed by specific recordings isolating or blocking the related current.

      Main points highlighting the preliminary character of the study.

      1) In Figure 1, immunofluorescence images of the neuronal differentiation markers (Tbr1, Ctip2 and Tuj1) are shown. However, no quantification of the percentage of cells expressing these markers for WT and TS21 neurons is reported. On the other hand, simple inspection of the representative images clearly seems to indicate a difference between the two genotypes, with TS21 cultures showing a lower number of cells expressing neuronal markers. This quantification should be corroborated by a similar staining for an astrocyte marker (GFAP, but not S100b since it is triplicated in DS). This is an extremely important point since it is obvious that any change in the percentage of neurons (or the neuron/astrocyte ratio) in the cultures will strongly affect the resulting network activity (shown in Figure 2) and the connectivity (shown in Figure 4). If possible, the quantification should be done at the same time points as the calcium imaging experiments.

      2) In Figure 2 the authors show some calcium imaging traces of WT and TS21 cultures at different time points. However, they again do not show any quantification of neuronal activity. A power spectra analysis is shown in Supplementary Figure 2, but only for WT cultures, while in Supplementary Figure 3 a comparison between WT and TS21 power spectra is done, but only at the 50 day time point, while differences in synchrony are assessed at 60 days. At minimum, the authors should include in main Figure 2 the quantification of the mean calcium event rate and mean event amplitude at the different time points and the power spectra analysis for both WT and TS21 cultures at the same time points.

      Of note, the synchronized neuronal activity is present in WT cultures at day 60, but totally lost at subsequent time points (70 and 80 days). The results at these later time points differ from previous data from the same lab (Kirwan et al., 2015). How might these data be explained? It would be important to rule out any potential issues with the health of the culture that could explain the loss of neuronal activity. It would be beneficial to check cell viability at the different time points to exclude possible confounding factors. Propidium staining or an MTT assay would strongly improve the soundness of the calcium data.
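
      The quantification requested in point 2 (event rate, event amplitude and power spectra per culture and time point) could be sketched roughly as follows. This is only an illustrative outline, not the authors' pipeline: the function names, the median-plus-k-sigma threshold and the assumption that each trace is a baseline-corrected dF/F signal sampled at `fs` Hz are all hypothetical choices.

```python
import numpy as np

def event_stats(trace, fs, k=3.0):
    """Return (event rate in Hz, mean peak amplitude) for one dF/F trace.

    Assumption: an "event" is a contiguous run of samples above
    median + k * sigma, with sigma estimated robustly from the MAD.
    """
    trace = np.asarray(trace, dtype=float)
    baseline = np.median(trace)
    sigma = 1.4826 * np.median(np.abs(trace - baseline))
    above = trace > baseline + k * sigma
    # Pad with False so that events touching either end are still closed.
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # first sample of each event
    stops = np.flatnonzero(edges == -1)   # one past the last sample
    peaks = [trace[a:b].max() - baseline for a, b in zip(starts, stops)]
    rate = len(peaks) / (trace.size / fs)
    amplitude = float(np.mean(peaks)) if peaks else 0.0
    return rate, amplitude

def power_spectrum(trace, fs):
    """Return (frequencies in Hz, power) of the mean-subtracted trace."""
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    return freqs, power
```

      Applying both functions to every ROI trace of each genotype and time point would give the per-condition distributions that the reviewer asks to see in the main figure.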

      3) In Figure 3 there is no quantification of the number and/or density of transplanted neurons for WT and TS21, but only representative images. As above, inspection of the representative images seems to show a decrease in cells labeled by the Tbr1 neuronal marker for TS21 cells. Moreover, the in vivo calcium imaging of transplanted WT and TS21 cells lacks most of the quantification normally done in calcium imaging experiments. Are the event rate and event amplitude different between WT and TS21 neurons ? The measure of neuronal synchrony by mean pixel correlation is not well explained, but it looks somehow simplistic. Neuronal synchrony can be more precisely measured by cross-correlation analysis or spike time tiling coefficients on the traces from single-neuron ROI rather than on all pixels in the field of view, as apparently was done here.
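
      The alternative suggested in point 3, measuring synchrony on single-neuron ROI traces rather than on all pixels, might look like the sketch below. It implements a mean pairwise normalized cross-correlation over a small lag window; it is a hypothetical illustration (function names and the `max_lag` parameter are assumptions), it assumes no trace is constant, and it does not implement the spike time tiling coefficient, which needs detected event times.

```python
import numpy as np

def max_xcorr(x, y, max_lag=0):
    """Peak normalized cross-correlation of two traces within +/- max_lag samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = (x - x.mean()) / x.std()  # assumes x is not constant
    y = (y - y.mean()) / y.std()  # assumes y is not constant
    n = x.size
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.dot(x[lag:], y[:n - lag]) / n
        else:
            c = np.dot(x[:n + lag], y[-lag:]) / n
        best = max(best, c)
    return best

def synchrony_index(traces, max_lag=0):
    """Mean of max_xcorr over all distinct ROI pairs (rows of `traces`)."""
    traces = np.asarray(traces, dtype=float)
    n_rois = traces.shape[0]
    values = [
        max_xcorr(traces[i], traces[j], max_lag)
        for i in range(n_rois)
        for j in range(i + 1, n_rois)
    ]
    return float(np.mean(values))
```

      With `max_lag=0` this reduces to the mean pairwise Pearson correlation between ROIs; a larger window tolerates small timing offsets between neurons.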

      4) The results on reduced neuronal connectivity in Figure 3 look very striking. However, these results should be accompanied by control experiments to verify the number of neuronal cells and neurite extension in WT and Ts21 cultures. These two parameters could indeed strongly influence the results. As the cultures appear to grow in clusters, bright-field images and TuJ1 staining of the cultures will also greatly help to understand the degree of morphological interconnection between the clusters.

      5) The authors performed RNA-seq experiments on day 50 cultures. Why do the authors not show the complete differential gene expression analysis, but only a small subset of genes? A comprehensive volcano plot and the complete list of identified genes with logFC and FDR values would be helpful. If possible, comparison of the present data (particularly on KCN and HCN expression changes) with published and publicly available expression datasets of other human or human Down syndrome iPSC-derived neurons or human Down syndrome brains would greatly increase the soundness of the present findings. In addition, the gene ontology (GO) results are mentioned in the text, but are not presented. Showing the complete GO analysis for both up- and downregulated genes would help the reader to better understand the RNA-seq results. Notably, the results shown in Supplementary Figure on GRIN2A and GRIN2B expression (with values of 300-700 counts versus 2000-4000 counts, respectively) clearly indicate that in both WT and TS21 cultures the NMDA developmental switch has not occurred yet at the 50 days time point.
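
      For readers unfamiliar with the FDR values mentioned in point 5, the sketch below shows the standard Benjamini-Hochberg adjustment that turns per-gene p-values into FDR-adjusted values of the kind plotted against logFC in a volcano plot. It is a generic illustration; the manuscript's actual differential expression software and correction method are not stated in this review, so nothing here should be read as the authors' procedure.

```python
import numpy as np

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Raw adjustment: p_(i) * m / i for the i-th smallest p-value.
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted
```

      Genes would then typically be called significant when the adjusted value falls below a chosen FDR threshold and the absolute logFC exceeds a chosen cutoff.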

      6) The measurement of hyperpolarization-activated currents shown in Figure 5 lacks proper control experiments. First, the hyperpolarizing current in TS21 cells does not reach a steady state as it does in the controls. The two curves are therefore hard to compare. To exclude possible differences in activation kinetics, the authors should have prolonged the current injection period (1-2 seconds). Second, to ultimately prove that such currents are mediated by HCN channels in WT cells, the authors should perform some control experiments with a specific HCN blocker. A good example of a suitable protocol, which also uses current blockers to exclude all other possible current contributions, is the one reported in Matt et al. Cell. Mol. Life Sci. 68, 125-137 (2011).

      7) The manuscript lacks information on the statistical analysis used. Also, the numerosity of samples is not clear. Were the dots shown in some graph technical replicates from a single neuronal induction or were all independent neuronal inductions or a mix of the two ? Please clarify.

      8) The method section lacks important information to guarantee reproducibility. Just a few examples:

      • Only electrophysiology methods for slice are reported, but not for in vitro culture.
      • Details on Laminin coating are lacking. What concentration was used? Was poly-ornithine or poly-lysine used before Laminin coating?
      • How long were cells switched to BrainPhys medium before calcium imaging?

      Minor point/typos etc.

      Introduction

      • Page 4 line 6: in the line "Trisomy 21 in humans commonly results in a range in developmental and morphological changes in the forebrain ..." "in" could be replaced by "of".
      • Page 5 line 2: please remove "an" before the word "another".
      • Page 5 line 2: please replace "ecitatory" with "excitatory"

      Results

      • Page 10 line 25: The concept of "pixel-wise" appears for the first time in this section and could be better introduced to facilitate the understanding of the experiment.
      • In the "results" section, page 11 line 1 and 4, references are made to "Figure 4D" and "4F," but these figures do not appear to be present in the figure section. Upon reviewing the rest of the section, the data seem to refer to "Figure 3D" and "3E."

      Discussion

      • Page 15 line 20: please replace "synchronised" with "synchronized".
      • Page 16 line 11: please replace "T21" with "TS21".

      Methods

      • Page 19 line 12: "Pens/Strep" has to be replaced by Pen/Strep.
      • Page 20 line 20: "Tocris Biocience" has to be replaced by "Tocris Bioscience".
      • Page 21 line 2: "Addegene" has to be replaced by "Addgene".

      Figures

      • Figure 3: the schematic experimental design (Fig. 3A) could be enlarged to match the width of the images/graphs below.
      • Figure 5: the reviewer suggests resizing/repositioning the graphs in Fig. 1A so that they match the width of those below.
      • Figure S1D: In all the figures of the paper, the respective controls for the TS21 1 and TS21 2 lines are labelled as "WT1/WT2," while in these graphs, they are called "Ctrl1" and "Ctrl2." To ensure consistency throughout the paper, it is suggested to change the names in these graphs.
      • Figure S4L: The graph is not very clear, especially regarding the significance reported at -50 pA, please modify the graphical visualization and/or add a legend in the caption.

      Significance

      Nature and significance of the advance for the field. The results presented in the manuscript are potentially interesting and useful, but not completely novel (currents deregulation has already been highlighted in mouse models of Down Syndrome).

      Work in the context of the existing literature. This work follows the line of evidence that characterizes Down Syndrome in human neurons (Huo, H.-Q. et al. Stem Cell Rep. 10, 1251-1266 (2018); Briggs, J. A. et al. Etiology. Stem Cells 31, 467-478 (2013)), both in vitro and in xenotransplanted mice, by corroborating some important findings already found in animal models (Stern, S., Segal, M. & Moses, E. EBioMedicine 2, 1048-1062 (2015); Cramer, N. P., Xu, X., F. Haydar, T. & Galdzicki, Z. Physiol. Rep. 3, e12655 (2015); Stern, S., Keren, R., Kim, Y. & Moses, E. http://biorxiv.org/lookup/doi/10.1101/467522 (2018) doi:10.1101/467522).

      Audience. Scientists in the field of pre-clinical biomedical research, especially those working on neurodevelopmental disorders and iPSC-based non-animal models.

      Field of expertise. In vitro electrophysiology, Neurodevelopmental disorders, Down Syndrome, iPS cells.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The study investigates the neurodevelopmental impact of trisomy 21 on human cortical excitatory neurons derived from induced pluripotent stem cells (hiPSCs). Key findings include a modest reduction in spontaneous firing, a marked deficit in synchronized bursting, decreased neuronal connectivity, and altered ion channel expression, particularly a downregulation of voltage-gated potassium channels and HCN1. These conclusions are supported by a combination of in vitro calcium imaging, electrophysiological recordings, viral monosynaptic tracing, RNA sequencing, and in vivo transplantation with two-photon imaging.

      Major Comments

      • Convincing Nature of Key Conclusions: The study's conclusions are generally well supported by a diverse set of experimental approaches. However, certain claims regarding the intrinsic properties of the excitatory network would benefit from further qualification. In particular, the assertion that reduced synchronization is solely attributable to altered ion channel expression might be considered somewhat preliminary without additional corroborative experiments.
      • One major limitation of the current experimental design is the reliance on predominantly excitatory neuronal cultures derived from hiPSCs. Although the authors convincingly demonstrate differences in network synchronization and connectivity between trisomic (TS21) and control neurons, the almost exclusive focus on excitatory cells limits the physiological relevance of the in vitro network. In the developing cortex, interneurons and astrocytes play crucial roles in modulating network excitability, synaptogenesis, and plasticity. Therefore, incorporating these cell types (either through co-culture systems or through directed differentiation protocols that yield a more heterogeneous neuronal population) could help to determine whether the observed deficits are intrinsic to excitatory neurons or are compounded by a lack of proper inhibitory regulation and glial support.
      • Furthermore, the assessment of neuronal connectivity via pseudotyped rabies virus tracing, while innovative, has inherent limitations. The quantification of connectivity as a ratio of red-to-green fluorescence pixels may be influenced by differential viral infection efficiencies, variations in the expression levels of the TVA receptor, or even by the lower basal activity levels observed in TS21 cultures. Complementary approaches, such as electron microscopy for synaptic density analysis or functional connectivity measurements using multi-electrode arrays (MEAs), could provide additional structural and functional insights that would validate the rabies tracing data.
      • Qualification of Claims: Some conclusions, particularly those linking specific ion channel dysregulation (e.g., HCN1 loss) directly to network deficits, might be better presented as preliminary. The authors could temper their language to indicate that while the evidence is suggestive, the mechanistic link remains to be fully established.
      • Need for Additional Experiments: Additional experiments that could further consolidate the current findings include:
        • Inclusion of Inhibitory Neurons or Co-culture Systems: Incorporating interneurons or astrocytes would help determine whether the observed deficits are solely intrinsic to excitatory neurons.
        • Alternative Connectivity Assessments: Complementing the rabies virus tracing with electron microscopy or multi-electrode array (MEA) recordings would add structural and functional validation of the connectivity differences.
        • Extended Temporal Profiling: Monitoring network activity over a longer developmental window would clarify whether the observed deficits represent a delay or a permanent alteration in network maturation.
      • Reproducibility and Statistical Rigor: The methods and data presentation are largely clear, with adequate replication and appropriate statistical analyses. Nonetheless, a more detailed description of the experimental replicates, particularly regarding the viral tracing and in vivo transplantation studies, would enhance reproducibility. The availability of raw data and scripts for calcium imaging analysis would also further support independent verification.

      Minor Comments

      • Experimental Details:

      Minor revisions could include clarifying the infection efficiency and expression levels of the viral constructs used in connectivity assays to rule out technical variability.

      • Literature Context:

      The authors reference prior studies appropriately; however, integrating a brief discussion comparing their findings with alternative DS models (e.g., organoids or other hiPSC-derived systems) would improve contextual clarity.

      • Presentation and Clarity:

      Figures are generally clear, but the manuscript contains a minor labeling error. On page 13, the figure is erroneously labeled as "Fig6A", whereas, based on the context and corresponding data, it should be "Fig5A". I recommend that the authors correct this mistake to ensure consistency and avoid potential confusion for readers.

      Significance

      • Nature and Significance of the Advance:

      The work offers a substantial conceptual advance by providing a mechanistic link between trisomy 21 and impaired neuronal network synchronization. Technically, the study integrates state-of-the-art imaging, electrophysiology, and transcriptomic profiling, thereby offering a multifaceted view of DS-related neural dysfunction. Clinically, the findings have the potential to inform future therapeutic strategies targeting network connectivity and ion channel function in Down syndrome.

      • Context in the Existing Literature:

      The study builds on previous observations of altered network activity in DS patients and DS mouse models (e.g., altered EEG synchronization and reduced synaptic connectivity). It extends these findings to human-derived neuronal models, thus bridging a gap between clinical observations and molecular/cellular mechanisms. Relevant literature includes studies on DS neurodevelopment and the role of ion channels in synaptic maturation.

      • Target Audience:

      The reported findings will be of interest to researchers in neurodevelopmental disorders, Down syndrome, and ion channel physiology. Additionally, the study may attract the attention of those working on hiPSC-derived models of neurological diseases, as well as clinicians interested in the pathophysiology of DS.

      • Keywords and Field Contextualization:

      Keywords: Down syndrome, trisomy 21, neuronal connectivity, synchronized network activity, hiPSC-derived cortical neurons, ion channel dysregulation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the referees for taking time to review our manuscript. These reviews are positive, highlighting the novelty of our findings. The majority of comments are cosmetic, and we have added data in response to some technical points. We feel that some of the additional experiments proposed would not add significant methodological depth, and cross-commenting suggests that our referees agree. At present we are attempting antibody staining to quantify Tk peptide retention in the midgut, as per suggestion by reviewer #2.

      We enclose our point-by-point response to each referee's points, below.



      Reviewer #1

      • Can the authors state in the figure legends the numbers of flies used for each lifespan and whether replicates have been done?
      • We have incorporated the requested information into legends for lifespan experiments.

      • Do the interventions shorten lifespan relative to the axenic cohort? Or do they prevent lifespan extension by axenic conditions? Both statements are valid, and the authors need to be consistent in which one they use to avoid confusing the reader.

      • We read these statements differently. The only experiment in which a genetic intervention prevented lifespan extension by axenic conditions is neuronal TkR86C knockdown (Figure 6B-C). Otherwise, microbiota shortened lifespan relative to axenic conditions, and genetic knockdowns blocked this effect (e.g. see lines 131-133). We have ensured that the framing is consistent throughout, with text edited at lines 198-199, 298-299, 311-312, 345-347, 408-409, 424-425, 450, 497-503.

      • TkRNAi consistently reduces lipid levels in axenic flies (Figs 2E, 3D), essentially phenocopying the loss of lipid stores seen in control conventionally reared (CR) flies relative to control axenic. This suggests that the previously reported role of Tk in lipid storage - demonstrated through increased lipid levels in TkRNAi flies (Song et al (2014) Cell Rep 9(1): 40) - is dependent on the microbiota. In the absence of the microbiota TkRNAi reduces lipid levels. The lack of acknowledgement of this in the text is confusing.

      • We have added text at lines 219-222 to address this point. We agree that this effect is hard to interpret biologically, since expressing RNAi in axenics has no additional effect on Tk expression (Figure S7). Consequently we can only interpret this unexpected effect as a possible off-target effect of RU feeding on TAG, specific to axenic flies. However, this possibility does not void our conclusion, because an off-target diminution of TAG cannot explain why CR flies accumulate TAG following TkRNAi. We hope that our added text clarifies.

      • I have struggled to follow the authors' logic in ablating the IPCs and feel a clear statement on what they expected the outcome to be would help the reader.

      • We have added the requested statement at lines 423-424, explaining that we expected the IPC ablation to render flies constitutively long-lived and non-responsive to A. pomorum.

      • Can the authors clarify their logic in concluding a role for insulin signalling, and qualify this conclusion with appropriate consideration of alternative hypotheses?

      • We have added our logic at lines 449-454. In brief, we infer involvement of insulin signalling because FoxO mutant lifespan does not respond to TkRNAi, and the mutation diminishes the lifespan-shortening effect of A. pomorum. However, we cannot state that the effects are direct because we do not have data that mechanistically connects Tk/TkR99D signalling directly in insulin-producing cells. The current evidence is most consistent with insulin signalling priming responses to microbiota/Tk/TkR99D, as per the newly-added text.

      • Typographical errors

      • We have remedied the highlighted errors, at lines 128-140.

      • I'd encourage the authors to provide lifespan plots that enable comparison between all conditions

      • We have plotted our figures in faceted boxes, because the number of survival curves that would need to be presented on the same axis (e.g. 16 for Figure 5) would not be intelligible. However, we have ensured that axes on faceted plots are equivalent and have grid lines for comparison. Moreover, our approach using statistical coefficients (EMMs) enables direct quantitative comparison of the differences among conditions.

      Reviewer #2

      • Not…essential for publication…is it possible to look at Tk protein levels?
      • We have acquired a small amount of anti-TK antibody and we will attempt to immunostain guts associated with A. pomorum and L. brevis. We are also attempting the equivalent experiment in mouse colon reared with/without a defined microbiota. These experiments are ongoing, but we note that the referee feels that the manuscript is a publishable unit whether these stainings succeed or not.

      • it would be good to show that the bacterial levels are not impacted [by TkRNAi]

      • We have quantified CFUs in CR flies upon ubiquitous TkRNAi (Figure S5), finding that the RNAi does not affect bacterial load. New text at lines 138-139 articulates this point.

      • The effect of Tk RNAi on TAG is opposite in CR and Ax or CR and Ap flies, and the knockdown shows an effect in either case (Figure 2E, Figure 3D). Why is this?

      • As per response to Reviewer #1, we have added text at lines 219-222 to address this point.

      • Is it possible to perform at least one lifespan repeat with the other Tk RNAi line mentioned?

      • We have added another experiment showing longevity upon knockdown in conventional flies, using an independent TkRNAi line (Figure S3).

      • Is it possible that this driver is simply not resulting in an efficient KD of the receptor? I would be inclined to check this

      • This comment relates to Figure 7G. We do see an effect of the knockdown in this experiment, so we believe that the knockdown is effective. However the direction of response is not consistent with our hypothesis so the experiment is not informative about the role of these cells. We therefore feel there is little to be gained by testing efficacy of knockdown, which would also be technically challenging because the cells are a small population in a larger tissue which expresses the same transcripts elsewhere (i.e. necessitating FISH).

      • Would it be possible to use antibodies for acetylated histones?

      • The comment relates to Figure 4C-E. The proposed studies would be a significant amount of work because, to our knowledge, the specific histone marks which drive activation in TK+ cells remain unknown. On the other hand, we do not see how this information would enrich the present story, rather such experiments would appear to be the beginning of something new. We therefore agree with Reviewer #1 (in cross-commenting) that this additional work is not justified.

      Reviewer #3

      • In Line 243, the manuscript states that the reporter activity was not increased in the posterior midgut. However, based on the presented results in Fig4E, there is seemingly no apparent regional specificity. A more detailed explanation is necessary.
      • We thank the reviewer sincerely for their keen eye, which has highlighted an error in the previous version of the figure. In revisiting this figure we have noticed, to our dismay, that the figures for GFP quantification were actually re-plots of the figures for (ac)K quantification. This error led to the discrepancy between statistics and graphics, which thankfully the reviewer noticed. We have revised the figure to remedy our error, and the statistics now match the boxplots and results text.

      • Fig1C uses Adh for normalization. Given the high variability of the result, the authors should (1) check whether Adh expression levels changed via bacterial association

      • We selected Adh on the basis of our RNAseq analysis, which showed it was not different between AX and CV guts, whereas many commonly-used “housekeeping” genes were. We have now added a plot to demonstrate (Figure S2).

      • The statement in Line 82 that EEs express 14 peptide hormones should be supported with an appropriate reference

      • We have added the requested reference (Hung et al, 2020) at line 86.

      • Tk+ EEC activity should be assessed directly, rather than relying solely on transcript levels. Approaches such as CaLexA or GCaMP could be used.

      • We agree with reviewers 1-2 (in cross-commenting) that this proposal is non-trivial and not justified by the additional insight that would be gained. As described above, we are attempting to immunostain Tk, which if successful will provide a third line of evidence for regulation of Tk+ cells. However we note that we already have the strongest possible evidence for a role of these cells via genetic analysis (Figure 5).

      • While the difficulty of maintaining lifelong axenic conditions is understandable, it may still be feasible to assess the induction of Tk (ie. Tk transcription or EE activity upregulation) by the microbiome on males.

      • As the reviewer recognises, maintaining axenic experiments for months on end is not trivial. Given the tendency for males either to simply mirror female responses to lifespan-extending interventions, or to not respond at all, we made the decision in our work to only study females. We have instead emphasised in the manuscript that results are from female flies.

      • TkR86C, in addition to TkR99D, may be involved in the A. pomorum-lifespan interaction. Consider revising the title to refer more generally to the "tachykinin receptor" rather than only TkR99D.

      • We disagree with this interpretation: the results do not show that TkR86C-RNAi recapitulates the effect of enteric Tk-RNAi. A potentially interesting interaction is apparent, but the data do not support a causal role for TkR86C. A causal role is supported only for TkR99D, knockdown of which recapitulates the longevity of axenic flies and TkRNAi flies. We therefore feel that our current title is justified by the data, and a more generic version would misrepresent our findings.

      • The difference between "aging" and "lifespan" should also be addressed.

      • The smurf phenotype is a well-established metric of healthspan. Moreover, lifespan is the leading aggregate measure of ageing. We therefore feel that the use of “ageing” in the title is appropriate.

      • If feasible, assessing foxo activation would add mechanistic depth. This could be done by monitoring foxo nuclear localization or measuring the expression levels of downstream target genes.

      • Foxo nuclear localisation has already been shown in axenic flies (Shin et al, 2011). We have added text and citation at lines 402-403.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      Marcu et al. demonstrate a gut-neuron axis that is required for the lifespan-shortening effects mediated by gut bacteria. They show that the presence of commensal bacteria, particularly Acetobacter pomorum, promotes Tk expression in the gut, which then binds to neuronal tachykinin receptors to shorten lifespan. Tk has also recently been reported to extend lifespan via adipokinetic hormone (Akh) signaling (Ahrentløv et al., Nat Metab 7, 2025), but the mechanism here appears distinct. The lifespan shortening by Ap via Tk seems to be partially dependent on foxo and independent of both insulin signaling and Akh-mediated lipid mobilization. Although the detailed mechanistic link to lifespan is not fully resolved, the experiments and their results clearly show the involvement of the molecules tested. This work adds a valuable dimension to our growing understanding of how gut bacteria influence host longevity. However, there are some points that should be addressed.

      1. Tk+ EEC activity should be assessed directly, rather than relying solely on transcript levels. Approaches such as CaLexA or GCaMP could be used.
      2. In Line 243, the manuscript states that the reporter activity was not increased in the posterior midgut. However, based on the results presented in Fig4E, there is seemingly no apparent regional specificity. A more detailed explanation is necessary.
      3. If feasible, assessing foxo activation would add mechanistic depth. This could be done by monitoring foxo nuclear localization or measuring the expression levels of downstream target genes.
      4. Fig1C uses Adh for normalization. Given the high variability of the result, the authors should (1) check whether Adh expression levels changed via bacterial association and/or (2) compare the results using different genes as an internal standard.
      5. While the difficulty of maintaining lifelong axenic conditions is understandable, it may still be feasible to assess the induction of Tk (i.e. Tk transcription or EE activity upregulation) by the microbiome in males.
      6. We also had some concerns regarding the wording of the title. Fig6B and C suggest that TkR86C, in addition to TkR99D, may be involved in the A. pomorum-lifespan interaction. Consider revising the title to refer more generally to the "tachykinin receptor" rather than only TkR99D. The difference between "aging" and "lifespan" should also be addressed. While the study shows a role for Tk in lifespan, assessment of aging phenotypes (e.g. climbing assay, ISC proliferation) beyond the smurf assay is required to make conclusions about aging.
      7. The statement in Line 82 that EEs express 14 peptide hormones should be supported with an appropriate reference, if available.

      Referees cross-commenting

      I agree with the other reviewers that the study has been done very well and hence that additional experiments, such as calcium imaging, are not mandatory for publication. However, I still believe that testing whether Ap elevates Tk in males would greatly increase the generality of the finding, whatever the outcome. Too many studies use only females.

      Significance

      General assessment

      The main strength of this study is the careful and extensive lifespan analyses, which convincingly demonstrate the role of gut microbiota in regulating longevity. The authors clarify an important aspect of how microbial factors contribute to lifespan control. The main limitation is that the study primarily confirms the involvement of previously reported signaling pathways, without identifying novel molecular players or previously unrecognized mechanisms of lifespan regulation.

      Advance

      The lifespan-shortening effect of Acetobacter pomorum (Ap) has been reported previously, as has the lifespan-shortening effect of Tachykinin (Tk). However, this study is the first to link these two factors mechanistically, which represents a significant and original contribution to the field. The advance is primarily mechanistic, providing new insight into how microbial cues converge on host signaling pathways to influence ageing.

      Audience

      This work will be of particular interest to a specialized audience of basic researchers in ageing biology. It will also attract interest from microbiome researchers who are investigating host-microbe interactions and their physiological consequences. The findings will be useful as a conceptual framework for future mechanistic studies in this area.

      Field of expertise

      Drosophila ageing, lifespan, microbiome, metabolism

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The main finding of this work is that microbiota impacts lifespan through regulating the expression of a gut hormone (Tk), which in turn acts on its receptor expressed on neurons. This conclusion is robust and based on a number of experimental observations, carefully using techniques in fly genetics and physiology: 1) microbiota regulates Tk expression, 2) lifespan reduction by microbiota is absent when Tk is knocked down in gut (specifically in the EEs), 3) Tk knockdown extends lifespan and this is recapitulated by knockdown of a Tk receptor in neurons. These key conclusions are very convincing. Additional data are presented detailing the relationship between Tk and insulin/IGF signalling and Akh in this context. These are two other important endocrine signalling pathways in flies. The presentation and analysis of the data are excellent.

      There are only a few experiments or edits that I would suggest as important to confirm or refine the conclusions of this manuscript. These are:

      1. When comparing the effects of microbiota (or single bacterial species) in different genetic backgrounds or experimental conditions, I think it would be good to show that the bacterial levels are not impacted by the other intervention(s). For example, the lifespan results observed in Figure 2A are consistent with Tk acting downstream of the microbes but also with Tk RNAi having an impact on the microbiota itself. I think this simple, additional control could be done for a few key experiments. Similarly, the authors could compare the two bacterial species to see if the differences in their effects come from different ability to colonise the flies.
      2. The effect of Tk RNAi on TAG is opposite in CR and Ax or CR and Ap flies, and the knockdown shows an effect in either case (Figure 2E, Figure 3D). Why is this? Better clarification is required.
      3. With respect to insulin signalling, all the experiments bar one indicate that insulin is mediating the effects of Tk. The one experiment that does not is using dilpGS to knock down TkR99D. Is it possible that this driver is simply not resulting in an efficient KD of the receptor? I would be inclined to check this, but as a minimum I would be a bit more cautious with the interpretation of these data.
      4. Is it possible to perform at least one lifespan repeat with the other Tk RNAi line mentioned? This would further clarify that there are no off-target effects that can account for the phenotypes.

      There are a few other experiments that I could suggest as I think they could enrich the current manuscript, but I do not believe they are essential for publication:

      5. The manuscript could be extended with a little more biochemical/cell biology analysis. For example, is it possible to look at Tk protein levels, Tk levels in circulation, or even TkR receptor activation or activation of its downstream signalling pathways? Comparing Ax and CR or Ap and CR one would expect to find differences consistent with the model proposed. This would add depth to the genetic analysis already conducted. Similarly, for insulin signalling - would it be possible to use some readout of the pathway activity and compare between Ax and CR or Ap and CR?
      6. The authors use a pan-acetyl-K antibody but are specifically interested in acetylated histones. Would it be possible to use antibodies for acetylated histones? This would have the added benefit that one can confirm the changes are not in the levels of histones themselves.
      7. I think the presentation of the results could be tightened a bit, with fewer sections and one figure per section.

      Referees cross-commenting

      Reviewer 1

      I generally agree with this reviewer but for

      "I'm convinced by the data showing that FOXO is required for TkRNAi to prevent lifespan shortening by Ap, but FOXO doesn't only respond to insulin signalling and can't be taken by itself to indicate a role for insulin signalling which the authors appear to do here."

      To the best of my knowledge, Foxo has only been shown to be required for lifespan extension/modulation by a reduction in insulin-like signalling. I.e. it does respond to other pathways but this is the only one where Foxo activity is known to modulate lifespan.

      Reviewer 3

      I agree with reviewer 1 that the point raised under (1) does not appear strictly required for the conclusions of the manuscript.

      Both reviewers 1 and 3:

      I have a different take on the results of experiments where IPCs are manipulated. To me, Figure 7D and E show that ablating the IPCs removes the difference between Ax and Ap, i.e. the IPCs are involved and insulin-like signalling is likely involved. The fact that RNAi against the TkR99D receptor does not have the same effect does not matter (the sensing could happen in different neurons). Similarly, dilp expression is only a minor readout of what is happening with insulin-like signalling - dilps are controlled at the level of secretion.

      However, I would be happy for the authors to present different arguments and make a reasonable conclusion, which may differ from mine. But I think the arguments I present above should be taken into account.

      Significance

      The main contribution of this manuscript is the identification of a mechanism that links the microbiota to lifespan. This is very exciting and topical for several reasons:

      1) The microbiota is very important for overall health but it is still unclear how. Studying the interaction between microbiota and health is an emerging, growing field, and one that has attracted a lot of interest, but one that is often lacking in mechanistic insight. Identifying mechanisms provides opportunities for therapies. The main impact of this study comes from using the fruit fly to identify a mechanism.

      2) It is very interesting that the authors focus on an endocrine mechanism, especially with the clear clinical relevance of gut hormones to human health recently demonstrated with new, effective therapies (e.g. Wegovy).

      3) Tk is emerging as an important fly hormone and this study adds a new and interesting dimension by placing Tk between microbiota and lifespan.

      I think the manuscript will be of great interest to researchers in ageing, human and animal physiology and in gut endocrinology and gut function.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this study the authors use a Drosophila model to demonstrate that Tachykinin (Tk) expression is regulated by the microbiota. In Drosophila, conventionally reared (CR) flies are typically shorter lived than those raised without a microbiota (axenic). Here, knockdown of Tk expression is found to prevent lifespan shortening by the microbiota and the reduction of lipid stores typically seen in CR flies when compared to axenic counterparts. It does so without reducing food intake or fecundity, reductions in which are often seen as necessary trade-offs for lifespan extension. Further, the strength of the interaction between Tk and the microbiota is found to be bacteria specific and is stronger in Acetobacter pomorum (Ap) monoassociated flies compared to Levilactobacillus brevis (Lb) monoassociation. The impact on lipid storage was also only apparent in Ap-flies. Building on these findings the authors show that gut specific knockdown is largely sufficient to explain these phenotypes. Knockdown of the Tk receptor, TkR99D, in neurons recapitulates the lifespan phenotype of intestinal Tk knockdown, supporting a model whereby Tk from the gut signals to TkR99D expressing neurons to regulate lifespan. In addition, the authors show that FOXO may have a role in lifespan regulation by the Tk-microbiota interaction. However, they rule out a role for insulin producing cells or Akh-producing cells, suggesting the microbiota-Tk interaction regulates lifespan through other, yet unidentified, mechanisms.

      Major comments:

      Overall, I find the key conclusions of the paper convincing. The authors present an extensive amount of experimental work, and their conclusions are well founded in the data. In particular, the impact of TkRNAi on lifespan and lipid levels, the central finding in this study, has been demonstrated multiple times in different experiments and using different genetic tools. As a result, I don't feel that additional experimental work is necessary to support the current conclusions. However, I find it hard to assess the robustness of the lifespan data from the other manipulations used (TkR99DRNAi, TkRNAi in dFoxo mutants etc.) because information on the population size and whether these experiments have been replicated is lacking. Can the authors state in the figure legends the numbers of flies used for each lifespan and whether replicates have been done? For all other data it is clear how many replicates have been done, and the methods give enough detail for all experiments to be reproduced.

      Minor comments:

      While I feel the conclusions of this study are well supported by the data, I found this to be a complex read and in places hard to follow. I feel some work is necessary in the writing to help the reader follow the authors' logic. Below I describe some of the issues that confused me and provide some suggestions that I hope the authors will find helpful.

      Survival curves

      The authors state that the lifespan difference between CR and axenic flies disappears with TkRNAi because TkRNAi CR flies are longer lived, rather than because TkRNAi axenic flies are shorter lived. Is this consistent in every TkRNAi experiment? It's hard for the reader to assess this because the relevant lifespan curves are presented on separate plots. I'd encourage the authors to provide lifespan plots that enable comparison between all conditions. For example, in figures 2 and 6 the reader wants to directly compare between RU- and RU+ but can't easily do so. Additional plots could be made available in the supplementary figures showing the comparisons that are not easy to make on the main figures.

      Consistent framing of the data

      Do the interventions shorten lifespan relative to the axenic cohort? Or do they prevent lifespan extension by axenic conditions? Both statements are valid, and the authors need to be consistent in which one they use to avoid confusing the reader. For example, line 325 says TkR86CRNAi prevents lifespan extension in axenic flies. Given the framing in the previous sections, it might be clearer to say that TkR86CRNAi shortens the lifespan of axenic flies to that of CR flies in contrast to TkRNAi and TkR99DRNAi which don't.

      The impact of TkRNAi on lipid levels in axenic flies

      TkRNAi consistently reduces lipid levels in axenic flies (Figs 2E, 3D), essentially phenocopying the loss of lipid stores seen in control conventionally reared (CR) flies relative to control axenic. This suggests that the previously reported role of Tk in lipid storage - demonstrated through increased lipid levels in TkRNAi flies (Song et al (2014) Cell Rep 9(1): 40) - is dependent on the microbiota. In the absence of the microbiota TkRNAi reduces lipid levels. The lack of acknowledgement of this in the text is confusing for the reader because it is inconsistent with the microbiota driving both higher Tk expression and higher lipid storage. If the microbiota increases Tk expression and this results in reduced lipid storage, why does reduced Tk expression also result in reduced lipid storage in axenic flies? This could further highlight the unique impact that the interaction between TkRNAi and the microbiota has on lipid storage, given it reverses both the impact of the microbiota alone and TkRNAi alone. I feel this aspect of the data should be given more attention in the text both for clarity and because it may be telling us something important about the function of Tk. The current framing around pleiotropic effects is valid, and the impact of Tk on lipid storage is clearly independent of its impact on lifespan and so is not central to this study. However, I feel a short additional paragraph to acknowledge this nuance of the data is needed. It can be made clear in the text that further exploration is beyond the scope of the current study.

      Role of insulin signalling and insulin producing cells

      I'm convinced by the data showing that FOXO is required for TkRNAi to prevent lifespan shortening by Ap, but FOXO doesn't only respond to insulin signalling and can't be taken by itself to indicate a role for insulin signalling, which the authors appear to do here.

      I would expect ablation of IPCs to have the opposite effect to foxo mutation and to increase FOXO activity throughout the organism due to a reduction in Dilp levels and so reduced insulin signalling. So, I have struggled to follow the authors' logic in ablating the IPCs and feel a clear statement on what they expected the outcome to be would help the reader. They find that TkRNAi still prevents lifespan shortening by Ap when IPCs are ablated and that TkR99DRNAi in IPCs also doesn't block lifespan shortening by Ap despite reducing the expression of dilp3 and dilp5. To me these data rule out a role for insulin signalling despite the requirement for FOXO, and yet the authors conclude that insulin signalling is involved in the response to Ap and TkRNAi, although not obligately (lines 420 - 422 and 511 - 512). Can the authors clarify their logic in concluding a role for insulin signalling, and qualify this conclusion with appropriate consideration of alternative hypotheses? The potential involvement of other signalling inputs to FOXO activity, e.g. immune signalling and JNK, should be acknowledged and warrants some discussion.

      Typographical errors:

      Incomplete sentence, lines 121 to 122 - starting "Cox proportional hazards.... and posthoc tests (Fig 2b)".

      Line 123 "EMMs" - define abbreviation on first use

      References to Fig 2b (first given on line 122) should be capitalised to Fig 2B for consistency.

      Lines 231 and 317 - the phrase "steady state (microbiota independent) expression" in reference to flyATLAS 2 data could be misleading. The term "microbiota independent" could suggest that expression levels have been shown not to be regulated by the microbiota and this is not the case. The authors should change this to simply state they are referring to steady state expression in conventionally reared flies.

      Referees cross-commenting

      Below are brief comments on the revision suggestions that reviewers 2 and 3 have requested.

      Reviewer 2

      1. I agree that confirmation that TkRNAi doesn't impact microbial levels could be helpful and would be straightforward for the authors to do. However, I don't feel it's essential to support the central claims of the paper.
      2. I agree.
      3. I don't feel that any of these experiments supports a role for insulin signalling, so I don't feel that this additional control is needed.
      4. It would be a good addition to have lifespan data from a separate knockdown line for corroboration. However, this has already been done in several different genetic backgrounds through crosses with different driver lines in multiple tissues, so I feel it's unnecessary given the time and resources that lifespan experiments take. There's also the caveat that different RNAi lines can knock down to different extents, so that would have to be assessed as well, and if there's a difference it may mean that ultimately not much can be concluded from this additional experiment.
      5. A good suggestion, but not straightforward and depends on the availability of the necessary tools, or possibly the generation of new tools. One for a follow up study.
      6. I feel this is not important enough to the central findings of the study to warrant the extra work.
      7. I agree.

      Reviewer 3

      1. Imaging calcium signalling is not straightforward unless a lab already has the tools available and optimised. If Tk+ EEs show changes in calcium signalling I'm not convinced that this tells us anything specific to the Tk-microbiota interaction. The point is the role of Tk itself, not the broader activity of the cells that express it.
      2. I agree this needs clarification.
      3. I agree that this would add depth, if feasible, but feel it's not essential to support the current conclusions.
      4. This is a minor point and given the RT-qPCR data and the RNAseq data corroborate each other I'm convinced that Tk levels are elevated.
      5. I feel exploring this in males is opening an additional line of enquiry beyond the scope of the current study. Either the phenotypes are the same - in which case what is added? - or they are different but there's no scope to assess why. A good suggestion for a follow up study.
      6. No comment.
      7. Agreed.

      One final comment. It's true that FOXO has only been shown to regulate lifespan in the context of insulin signalling. However, as far as I'm aware it hasn't been shown not to regulate lifespan downstream of its other activators; this simply hasn't been explored due to the historical focus on insulin signalling in this field. In the context of host-microbiota interactions, considering other pathways that activate FOXO, such as immune and JNK signals, would make sense.

      Reviewed by Dr Rebecca Clark, Department of Biosciences, Durham University

      Significance

      Overall, I find the key conclusions of the paper convincing. The authors present an extensive amount of experimental work, and their conclusions are well founded in the data. We have known that the microbiota influence lifespan for some time but the mechanisms by which they do so have remained elusive. This study identifies one such mechanism and as a result opens several avenues for further research. The Tk-microbiota interaction is shown to be important for both lifespan and lipid homeostasis, although it's clear these are independent phenotypes. The fact that the outcome of the Tk-microbiota interaction depends on the bacterial species is of particular interest because it supports the idea that manipulation of the microbiota, or specific aspects of the host-microbiota interaction, may have therapeutic potential.

      These findings will be of interest to a broad readership spanning host-microbiota interactions and their influence on host health. They move forward the study of microbial regulation of host longevity and have relevance to our understanding of microbial regulation of host lipid homeostasis. They will also be of significant interest to those studying the mechanisms of action and physiological roles of Tachykinins.

      Field of expertise: Drosophila, gut, ageing, microbiota, innate immunity

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for providing thoughtful and constructive feedback, which will help us improve the clarity and rigor of the paper. On balance, the reviews were positive. Reviewer 1 mentioned that “This is a strong manuscript with few problems and all important findings well justified, indeed this is a nicely polished…..high-quality manuscript,” and that “this paper makes a major breakthrough, showing that cell autonomous defects in hTSCs are very likely at the heart of the pathology observed in GIN-prone murine mutants.” Reviewer 3 stated that “The study is well designed, and the manuscript is very well written. The conclusions are supported by the evidence presented.” Reviewer 2 was less enthusiastic, with the main concern being that “The paper is mostly descriptive and often quite confusing leaving one not much closer to understanding the mechanistic basis for the interesting sex-biased semi-lethal phenotype.” Reviewer 2 also felt that figure titles/section headers overstated the results, and recommended improving some technical aspects and tempering conclusions. We think the proposed edits address most issues raised by the reviewers, either by rewriting or by adding data as described below.

      In response to reviewer #1 comments:

      Major comments:

      • I am confused as to the basis of the sex-skewing phenomenon. Is the problem that lack of maternally loaded WT Mcm4 worsens the phenotype, or is the issue that Mcm4C3/C3 dams are less able to retain pregnancies, perhaps being a more inflammatory environment? Also, while there is quite consistent evidence for reduced viability of Mcm4C3/C3McmGt/+ progeny, especially for female progeny, how confident can we be that the genotype of the dam vs. sire is important? Notably on a Ddx58 background, the progeny of the Mcm4C3/C3 sire included seven live male Mcm4C3/C3McmGt/+ but no female.

      Regarding the first point (sex skewing only when female is C3/C3), we also suspected either: 1) the maternal uterine environment, or 2) reduced oocyte quality. Although not reported in this manuscript, we tested #1 by performing embryo transfer experiments. Transferring 2-cell stage embryos from sex-skewing mating to WT females did not rescue the sex-bias. We then examined oocytes from C3/C3 females. We found evidence for compromised mitochondria and transcriptome disruption. However, we are not sure why this happens (poor follicle support? Oocyte intrinsic phenomenon?). We are reserving these results and additional experiments for another paper, especially since this one mainly deals with GIN and placenta development. If the reviewers feel strongly that the embryo transfer data is crucial, we can include it.

      Regarding how confident we are that the genotype of the dam vs. sire is important, this stems from our previous paper by McNairn et al 2019 (the percentage of female C3/C3 M2/+ from sex-skewing mating is 20% compared to 60% from the reciprocal mating), which was quite dramatic. Consistent with this, MCM levels were significantly reduced in the placentae only when the dam was C3/C3 and the sire C3/+ M2/+, but not in the reciprocal cross. The reviewer makes a good observation about the Ddx58 cross; we can only hypothesize that the mutation somehow sensitizes females in this scenario and will make mention of it in the revision. We also realize that we neglected to write in Methods that the Ddx58 allele was coisogenic in the C3H background.

      • I'm not sure what Supplementary Figure 6 is showing (faster differentiation of C3 but less TGC?). Regardless, it's hard to draw too much conclusion from one not-very-pretty Western blot. This figure requires both additional replicates and a better explanation of how it fits with the other conclusions of the paper.

      We hypothesized that the JZ defect observed in the semi-lethal genotype placentas could arise either from impaired maintenance of the progenitor pool or from reduced capacity of mutant trophoblast progenitors to differentiate into the JZ lineage. The blot in Supplementary Figure 6 was intended as a qualitative demonstration that mutant trophoblast stem cells can differentiate into JZ lineages. We recognize that the figure is not definitive and will revise the text to clarify its purpose. Replicates of the Western blot will be performed as suggested.

      • Supplementary Figure 7F-G is puzzling. Half of the mESCs have gamma-H2AX at all times, including most in S or G2 phase? In Figure S7E, do the quadrants correspond to being negative or positive for gamma-H2AX? At the very least, IF images showing clear gamma-H2AX foci would be much more convincing.

      The gates for γH2AX FACS analysis were established using negative controls lacking primary antibody. As reported previously, embryonic stem cells display high basal levels of γH2AX staining (Chuykin et al., Cell Cycle 2008; Turinetto et al., Stem Cells 2012; Ahuja et al., Nat Comm 2016), which likely explains the broad signal observed across cell cycle phases. Regardless, we will provide immunofluorescence staining of γH2AX and foci counts in our revision.

      • The methods section is well detailed, but it would be ideal to clarify how many replicates each Western Blot or flow cytometry experiment is representative of.

      Thanks for the suggestion. We will update this for Fig4 and Fig5.

      Minor comments:

      • Is it possible that cGAS-STING and RIG pathways act redundantly to cause inflammation and lethality, or that other innate immune components are involved? I don't expect the authors to make compound mutants to test this but at least this possibility should be discussed textually.

      We appreciate the reviewer’s point, and had the same suspicion. Supporting this, new RNA-seq analysis of Tmem173 KO placentas, which we will add, revealed elevated inflammatory gene expression compared to C3/C3 M2/+ controls, consistent with potential redundancy or feedback regulation. We will update the supplementary figures to reflect this.

      In response to reviewer #2 comments:

      Major comments:

      A major concern throughout the paper is that conclusions are often overstating their data. The title of figure 2 is "placentae with replication stress have smaller junctional and labyrinth zones". However, there is no measure of replication stress in this figure, just a histological evaluation of the placentae from the different mutants. The title of figure 3 is "Impact of GIN on LZ is less than JZ," but there is no measure of GIN, but instead measurement of number of cells in cell cycle and some bulk RNA-seq analysis. Title of figure 4 is "TSCs with increased genomic instability exhibit abnormal phenotypes." Again there is no measure of GIN, but instead staining of derived TSCs for proliferation, cell death, and a TSC marker. Title of figure 5 is "DNA damage responses and G2/M checkpoint activation drive premature TSC differentiation." However, there does not appear to be a difference in gH2AX between the two mutant genotypes. Checkpoint proteins might be up, but need quantification and reproduction. > 4C is the only marker of differentiation. Importantly, all the analyses here are associations, not connections, so cannot use the word "drive". Similar issues can be raised with a number of the supplementary figures.

      The Chaos3 (chromosome aberrations occurring spontaneously 3) model is a well-established system of intrinsic chronic replication stress and GIN. It is characterized by ~20 fold elevation of blood micronuclei (Shima et al., Nature 2007), a hallmark of GIN (Soxena et al., Mol Cell 2022); a destabilized MCM2-7 helicase prone to replication fork collapse (Bai et al., PLoS Genet 2016); and increased mitotic chromosome abnormalities and decreased dormant origins (Kawabata et al., Mol Cell 2011; Chuang et al., Nucleic Acid Res 2012) that are known to cause GIN and replication stress (Ibarra et al., PNAS 2008). Also, in our previous work (McNairn et al Nature 2019), we showed that placentae from C3/C3 dams exhibit significantly elevated γH2AX as well as reduced MCM2 and MCM4 protein levels. In our current study, we also observe elevated γH2AX in mutant TSCs (C3/C3 and C3/C3 M2/+), consistent with genomic instability. Nevertheless, we acknowledge that in TSCs, we did not formally demonstrate replication stress (RS), so where appropriate, we will revise figure titles, for example to say “cells/placentae with a GIN or RS genotype.”

      We acknowledge the reviewer's concern regarding western blots. We will provide quantification and statistics in our revision.

      1) A deeper analysis of the cell lines is likely to be the most fruitful path to reveal interesting mechanisms. It is very surprising that there is no phenotype in ESCs. Authors should check for increased apoptosis. Maybe the phenotypic cells are lost. Or do ESCs use different MCMs/mechanisms of DNA replication or are they better able to handle replication stress and GIN? How many passages were the TSCs and ESCs cultured for? Does GIN (i.e. aneuploidy, CNVs) develop in TSCs and ESCs with passaging? How do the MCM mutations impact the molecular identity of the ESC and TSC cells including their heterogeneity in the population.

      We assessed apoptosis using cleaved caspase 3 flow cytometry in mutant ESCs and observed no difference compared to controls (we will add this data as Supplementary Fig. 7).

      We believe there are intrinsic differences in TSCs and ESCs in their ability to respond to and counteract replication stress and DNA damage. ESCs are known to license more replication origins than somatic cells at a higher rate, which protects them from short G1-induced replication stress (Ahuja et al., Nat Comm 2016; Ge et al., Stem Cell Rep 2015; Matson et al., eLife 2017). Human placental cells physiologically exhibit high levels of mutation rate and chromosomal instability in vivo (Coorens et al., Nature 2021). Supporting this, Wang, D., et al (Nat Comm 2025) reported that several cell cycle and DDR regulators are differentially expressed in human TSCs vs human pluripotent stem cells. Whether such transcriptional differences directly contribute to functional outcomes remains to be determined.

      All experiments in this study were conducted using early-passage ESCs and TSCs (i.e. Finally, we showed that close to 90% mutant ESCs are KLF4+ (a naive pluripotency marker) whereas EOMES+ cells were significantly reduced in TSCs carrying the GIN genotype (Fig. 4E–F and Supplementary Fig. 7), highlighting lineage-specific differences.

      Minor Comments:

      1) There is a lack of quantification and repeats for all Westerns. At minimum there should be three repeats for each experiment, quantification including normalization to a reference protein, and stats confirming any proposed differences between conditions.

      We will update our revision with quantification and statistics for western blots.

      2) I would recommend moving the results in supp table 1 to figure 1. While negative, they are the newer results. The results shown in current figure 1 are essentially a reproduction of their previous work.

      The placental observations presented in Fig. 1 are new. In particular, the placental and embryonic weight measurements graphed in Fig. 1B and C have not been published by our group. Fig. 1A reproduces our previous observation on embryo viability in GIN mutants (McNairn et al., Nature 2019), while the schematic was provided for better flow and readability given the complex mating schemes. We are agnostic about Suppl. Table 1; it could become a new Table 1 in the main section, depending on the journal.

      In response to reviewer #3 comments:

      Major Comments

      While the inclusion of bulk RNAseq data of whole placental tissue is appreciated, the interpretation of the results is somewhat problematic, as it is acknowledged that the cell type composition of the placentas is drastically different between groups. Making conclusions based upon GSEA analysis of two different groups with drastically different cell type composition is somewhat misleading, as based on the results, it is a direct reflection of the cell types present. It would be more helpful to perform cell type deconvolution of the RNAseq data to estimate the proportion of each cell type within the bulk samples and compare that to what is seen histologically and not dive too deeply into the pathways since the results could just be a reflection of the cell types e.g. angiogenesis pathways from more endothelial cells. Additionally, the RNAseq data can be leveraged to look at expression of inflammatory genes by sex, which may show interesting patterns based on the other results.

      We agree that the representation of cell types in the placenta is problematic especially for underrepresented genes. We propose to use the BayesPrism tool (Chu et al., Nat Cancer 2022) to deconvolute bulk RNA-seq for better representation of transcriptional changes in the placenta.

      Section: GIN impairs trophoblast stem cell establishment and maintenance. To support the assertion in the first paragraph, beyond measuring apoptosis, it would be helpful at this stage to look at RNA expression levels indicative of the activation of DNA damage checkpoint genes

      We have performed RNA-seq on mutant ESC and TSCs and are in the process of data analysis. We will update these results in the revision.

      Please include additional methodological details in the methods section on the statistical analysis done for differential expression analysis. Specifically, what type of normalization was used, if lowly expressed genes were filtered out and at what cutoff, what statistical model was used (did you include covariates?), what comparisons were made? Did you stratify by sex? What cutoff was used for statistical significance? Did you perform multiple testing correction?

      We will update RNA-Seq data analysis methods in our full revision.

      2. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1 comments:

      • Supplementary Table 1. would be enhanced greatly showing comparable tables for Mcm4C3/C3 x Mcm4C3/+McmGt/+ in mice without the Tmem173 or Ddx58 mutations. It is fine to recycle data from McNairn 2019 here, as long as the source is indicated, but a comparison is needed.

      Thanks for pointing this out. We have incorporated this suggestion in Supp. Table 1.

      • In Figure S3E-F, is the box above each graph supposed to show the genotype of the dam?

      Yes. Thanks for pointing this out. We have added a description in the figure legend to make it clear.

      • "Indeed, the placenta and embryo weights of E13.5 Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ animals were significantly improved vs. Mcm4C3/C3 Mcm2Gt/+ animals, rendering them similar to Mcm4C3/C3 littermates (Fig. 6A-C). The JZ (but not LZ) area in Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ placentae also increased to the level of Mcm4C3/C3 littermates (Fig. 6D-H)." There are two problems here. First, the figure calls are wrong. Second, the description of the data is not quite right, it looks like the C3/C3 and C3/C3 M2/+ M3/+ LZs are a similar size to each and are statistically indistinguishable.

      Thanks for catching this. We have updated these in the main text.

      Reviewer #2 comments:

      Minor comment

      • Need to review citations to figures. For example, no citations are made to figure 4a and 4c.

      Thanks for catching this. We have updated the text.

      Reviewer #3 comments:

      Define the first use of >4C DNA content to help readers understand this potentially unfamiliar term.

      We have edited this part to indicate cells with more than 4C DNA content for better clarity.

      iDEP tool - please include citation to manuscript instead of link

      We have updated this citation.

      Check citations. Some citations to BioRxiv that are now published e.g. 13.

      We have updated this citation.

      3. Description of analyses that authors prefer not to carry out

      Reviewer 2

      2) Along similar lines, most of the in vivo phenotypic analyses are performed at E13.5, long after defects are likely beginning to express themselves especially given that they see phenotypes in the TSCs, which represent the polar TE of a E4.5. To understand the primary defects of the in vivo phenotype, they should be looking much earlier. Supplemental figure 5 is a start but represents a rather superficial analysis.

      The peri-implantation period, namely E4.5, represents a “black box” of embryonic development given that this is a critical stage for implantation. Aside from being an extremely difficult stage to analyze technically, we don’t think it is essential to the conclusions (or doable in a timely manner), especially given the use of TSCs. If we complete EdU studies on E6.5 embryos, we will include them.

      3) Fig. 6 would benefit from evidence that MCM3 mutant is rescuing MCM4 levels in the chromatin fraction of cells and the DNA damage phenotype.

      The genetic evidence presented is strong, and although we didn’t do the suggested experiment, we feel that our previous studies (McNairn et al., Nature 2019 and Chuang et al., PLoS Genet 2010) on the effects of MCM3 as a nuclear export factor (as it is in yeast (Liku et al., Mol Biol Cell 2005)) are a reasonable basis for not repeating such experiments. Furthermore, we are no longer maintaining the Mcm3 line and it would take over a year to reconstitute and rebreed triple mutants.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      This manuscript examines chronic replication stress-mediated genomic instability in placental development and concludes that it disrupts placental development in mice. The study is well designed and the manuscript is very well written. The conclusions are supported by the evidence presented. The manuscript would be improved by addressing the comments below.

      Major Comments:

      • While the inclusion of bulk RNAseq data of whole placental tissue is appreciated, the interpretation of the results is somewhat problematic, as it is acknowledged that the cell type composition of the placentas is drastically different between groups. Making conclusions based upon GSEA analysis of two different groups with drastically different cell type composition is somewhat misleading, as based on the results, it is a direct reflection of the cell types present. It would be more helpful to perform cell type deconvolution of the RNAseq data to estimate the proportion of each cell type within the bulk samples and compare that to what is seen histologically and not dive too deeply into the pathways since the results could just be a reflection of the cell types e.g. angiogenesis pathways from more endothelial cells. Additionally, the RNAseq data can be leveraged to look at expression of inflammatory genes by sex, which may show interesting patterns based on the other results.

      • Section: GIN impairs trophoblast stem cell establishment and maintenance. To support the assertion in the first paragraph, beyond measuring apoptosis, it would be helpful at this stage to look at RNA expression levels indicative of the activation of DNA damage checkpoint genes

      Minor Comments:

      • Define the first use of >4C DNA content to help readers understand this potentially unfamiliar term.

      • Please include additional methodological details in the methods section on the statistical analysis done for differential expression analysis. Specifically, what type of normalization was used, if lowly expressed genes were filtered out and at what cutoff, what statistical model was used (did you include covariates?), what comparisons were made? Did you stratify by sex? What cutoff was used for statistical significance? Did you perform multiple testing correction?

      • iDEP tool - please include citation to manuscript instead of link

      • Check citations. Some citations to BioRxiv that are now published e.g. 13.
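      As an illustration of the level of detail requested in the minor comment on differential expression methods, the following is a minimal sketch of two of the steps named there: filtering out lowly expressed genes and correcting for multiple testing. It assumes counts-per-million (CPM) normalization, a hypothetical cutoff of 1 CPM in at least 2 samples, and Benjamini-Hochberg correction. None of these choices are taken from the manuscript, and the function names are illustrative only.

```python
def counts_per_million(counts):
    """Convert raw counts (gene -> list of per-sample counts) to CPM."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total raw counts in each sample.
    lib_sizes = [sum(c[i] for c in counts.values()) for i in range(n_samples)]
    return {
        gene: [c[i] / lib_sizes[i] * 1e6 for i in range(n_samples)]
        for gene, c in counts.items()
    }


def filter_low_expression(cpm_table, min_cpm=1.0, min_samples=2):
    """Keep genes with CPM >= min_cpm in at least min_samples samples.

    The default cutoffs are assumptions for illustration.
    """
    return {
        gene: values
        for gene, values in cpm_table.items()
        if sum(v >= min_cpm for v in values) >= min_samples
    }


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted
```

      Reporting the equivalent of `min_cpm`, `min_samples`, the correction method, and the significance cutoff applied to the adjusted p-values would address most of the questions raised in that comment.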

      Significance

      The manuscript concludes that replication-stress induced genomic instability impairs placental development in mice. This is a significant advance in the field, as it mechanistically links genomic instability to placental development with further study needed in human trophoblast to establish clinical relevance. Strengths of this manuscript include solid study design, interpretation and presentation (both writing and figures). Weakness of the manuscript reside primarily in the RNAseq analysis results, methods and interpretation. The manuscript is of interest to audiences with interests in genome maintenance, development and placental biology. To contextualize this reviewer's point of view, this review is based on expertise in genomics, computational biology and placental biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      The manuscript, "Chronic replication stress-mediated genomic instability disrupts placenta development in mice" by Munisha et al follows up a 2019 paper in Nature by the same group where they show that mutations to the MCM genes lead to a sex-skewed semi-lethal phenotype starting after embryonic day 9.5 and extending to birth. In the paper, they hypothesized that the semi-lethality is secondary to genomic instability (GIN) driven inflammation due to activation of the innate immune pathways sensing cytoplasmic DNA. In this paper, they start by disproving that hypothesis and then go on to present data arguing lethality is due to a placental development defect rather than inflammation. The paper is mostly descriptive and often quite confusing leaving one not much closer to understanding the mechanistic basis for the interesting sex-biased semi-lethal phenotype that was described in their original paper. The most interesting aspect of the paper is the derivation of TSC and ESCs and initial analysis suggesting that the TSCs are more sensitive to the MCM mutations, but the analysis is rather shallow. Importantly it is unclear how the phenotype explains the sex-skewing of the phenotype. Are the TSC phenotypes sex-skewed and if so why? Also, why are the JZ and especially GlyTCs most affected?

      A major concern throughout the paper is that conclusions are often overstating their data. The title of figure 2 is "placentae with replication stress have smaller junctional and labyrinth zones". However, there is no measure of replication stress in this figure, just a histological evaluation of the placentae from the different mutants. The title of figure 3 is "Impact of GIN on LZ is less than JZ," but there is no measure of GIN, but instead measurement of number of cells in cell cycle and some bulk RNA-seq analysis. Title of figure 4 is "TSCs with increased genomic instability exhibit abnormal phenotypes." Again there is no measure of GIN, but instead staining of derived TSCs for proliferation, cell death, and a TSC marker. Title of figure 5 is "DNA damage responses and G2/M checkpoint activation drive premature TSC differentiation." However, there does not appear to be a difference in gH2AX between the two mutant genotypes. Checkpoint proteins might be up, but need quantification and reproduction. > 4C is the only marker of differentiation. Importantly, all the analyses here are associations, not connections, so cannot use the word "drive". Similar issues can be raised with a number of the supplementary figures.

      Major Comments:

      1) A deeper analysis of the cell lines is likely to be the most fruitful path to reveal interesting mechanisms. It is very surprising that there is no phenotype in ESCs. Authors should check for increased apoptosis. Maybe the phenotypic cells are lost. Or do ESCs use different MCMs/mechanisms of DNA replication or are they better able to handle replication stress and GIN? How many passages were the TSCs and ESCs cultured for? Does GIN (i.e. aneuploidy, CNVs) develop in TSCs and ESCs with passaging? How do the MCM mutations impact the molecular identity of the ESC and TSC cells including their heterogeneity in the population.

      2) Along similar lines, most of the in vivo phenotypic analyses are performed at E13.5, long after defects are likely beginning to express themselves especially given that they see phenotypes in the TSCs, which represent the polar TE of a E4.5. To understand the primary defects of the in vivo phenotype, they should be looking much earlier. Supplemental figure 5 is a start but represents a rather superficial analysis.

      3) Fig. 6 would benefit from evidence that MCM3 mutant is rescuing MCM4 levels in the chromatin fraction of cells and the DNA damage phenotype.

      Minor Comments:

      1) There is a lack of quantification and repeats for all Westerns. At minimum there should be three repeats for each experiment, quantification including normalization to a reference protein, and stats confirming any proposed differences between conditions.

      2) I would recommend moving the results in supp table 1 to figure 1. While negative, they are the newer results. The results shown in current figure 1 are essentially a reproduction of their previous work.

      3) Need to review citations to figures. For example, no citations are made to figure 4a and 4c.

      Significance

      As is, the study does not provide much new insight or understanding of how the MCM mutants are driving the sex-skewed semi-lethal phenotype. It would likely take much effort (months) to reach such a goal. However, without such effort, it is unclear what the significance of the story is. It does make the observation that the placenta appears to be impacted more severely and earlier than the embryo, and that within the placenta, certain zones and cell types are more vulnerable. The reasons for these differential impacts are unclear though.

      If the authors choose not to dig deeper as suggested in the major comments, then at a minimum it would be important to soften their conclusions as raised in the summary and at least perform experiments/edits proposed in minor comments.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In a previous paper (McNairn et al. 2019 "Female-biased embryonic death from inflammation induced by genomic instability" Nature), the Schimenti lab demonstrated that mouse embryos with hypomorphic mutations of the heterohexameric minichromosome maintenance complex, mutations that cause increased genomic instability (GIN), show reduced embryonic viability, with greater loss of female embryos and some parent-of-origin effect. Treatment with immunosuppressants, including ibuprofen and testosterone, partially rescued the observed lethality.

      In this new manuscript, the Schimenti lab demonstrates that these GIN-prone mutants feature smaller placentas with fewer cells. Mutations that interfere with the ability of the innate immune system to respond to micronuclei (a consequence of GIN) have no protective effect. Munisha and colleagues then demonstrate that MCM-mutant TSCs are harder to derive and show elevated apoptosis and a greater propensity for differentiation. The mutant TSCs show CHK1 phosphorylation, P53 phosphorylation and higher P21 levels, all consistent with a response to DNA damage. Downstream of this, they also show loss and inhibition of CDK1, which is already established to cause G2/M arrest (generally) and endoreduplication (specifically in trophoblast). The authors advance a model in which GIN results in loss of the TSC pool by apoptosis, cell cycle arrest and premature differentiation, resulting in smaller placentas and particularly fewer junctional zone cells. How this causes inflammation is less clear, but inflammation appears to be a downstream effect rather than cause of poor placentation.

      Major comments:

      This is a strong manuscript with few problems and all important findings well justified, indeed this is a nicely polished manuscript for something just entering peer review. There are a few unclear points textually and a couple places in the supplementary figures where better data quality would help, but generally it is a high-quality manuscript.

      • I am confused as to the basis of the sex-skewing phenomenon. Is the problem that lack of maternally loaded WT Mcm4 worsens the phenotype, or is the issue that Mcm4C3/C3 dams are less able to retain pregnancies, perhaps being a more inflammatory environment? Also, while there is quite consistent evidence for reduced viability of Mcm4C3/C3McmGt/+ progeny, especially for female progeny, how confident can we be that the genotype of the dam vs. sire is important? Notably on a Ddx58 background, the progeny of the Mcm4C3/C3 sire included seven live male Mcm4C3/C3McmGt/+ but no female.

      • I'm not sure what Supplementary Figure 6 is showing (faster differentiation of C3 but less TGC?). Regardless, it's hard to draw too much conclusion from one not-very-pretty Western blot. This figure requires both additional replicates and a better explanation of how it fits with the other conclusions of the paper.

      • Supplementary Figure 7F-G is puzzling. Half of the mESCs have gamma-H2AX at all times, including most in S or G2 phase? In Figure S7E, do the quadrants correspond to being negative or positive for gamma-H2AX? At very least, IF images showing clear gamma-H2AX foci would be much more convincing.

      • The methods section is well detailed, but it would be ideal to clarify how many replicates each Western Blot or flow cytometry experiment is representative of.

      The required additional experiments re: Supplementary Figure 6 and 7 could be conducted in a couple of months.

      Minor comments:

      • Supplementary Table 1. would be enhanced greatly showing comparable tables for Mcm4C3/C3 x Mcm4C3/+McmGt/+ in mice without the Tmem173 or Ddx58 mutations. It is fine to recycle data from McNairn 2019 here, as long as the source is indicated, but a comparison is needed.

      • Is it possible that cGAS-STING and RIG pathways act redundantly to cause inflammation and lethality, or that other innate immune components are involved? I don't expect the authors to make compound mutants to test this but at least this possibility should be discussed textually.

      • In Figure S3E-F, is the box above each graph supposed to show the genotype of the dam?

      • "Indeed, the placenta and embryo weights of E13.5 Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ animals were significantly improved vs. Mcm4C3/C3 Mcm2Gt/+ animals, rendering them similar to Mcm4C3/C3 littermates (Fig. 6A-C). The JZ (but not LZ) area in Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ placentae also increased to the level of Mcm4C3/C3 littermates (Fig. 6D-H)." There are two problems here. First, the figure calls are wrong. Second, the description of the data is not quite right, it looks like the C3/C3 and C3/C3 M2/+ M3/+ LZs are a similar size to each and are statistically indistinguishable.

      Significance

      I partially discussed the above in the summary, but this paper makes a major breakthrough, showing that cell autonomous defects in hTSCs are very likely at the heart of the pathology observed in GIN-prone murine mutants.

      Some questions go unsolved. Why are TSCs more prone to die in response to GIN than mESCs, particularly in light of the general observation that karyotypic abnormality is more common in placental lineage? How does the placental abnormality give rise to inflammation? No manuscript can answer every question, and I think this is a mature manuscript that can be published in a good journal with limited modifications.

      I am an expert on gene regulation in placental development, with somewhat less expertise in the DNA damage field. The placenta field will find this paper interesting, as will the DNA damage field. There are also ramifications for cancer research. The question of why some cells tolerate high levels of DNA damage and others die is very relevant to cancer.

  3. Sep 2025
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      SUMMARY

      In this study, Fernandes and colleagues addressed the question of the role of micro-RNAs in regulating the coupling between organ growth and developmental timing. Using Drosophila, they identified the conserved micro-RNA miR-184 as a regulator of the developmental transition between juvenile larval stages and metamorphosis. This transition is under the control of the steroid hormone Ecdysone, and has been shown to be modulated in case of abnormal tissue growth to adjust the duration of larval growth in response to developmental perturbations. The relaxin-like hormone Dilp8 has been identified as a key secreted factor involved in this coupling. Here, the authors show that miR-184 is involved in the regulation of Dilp8 expression both in physiological conditions and upon growth perturbation. They propose that this function is carried out in imaginal tissues, where miR-184 levels are modulated by tissue stress. While several factors have already been involved in triggering sharp dilp8 induction at the transcriptional level, this study adds another level of complexity to the regulation of Dilp8 by proposing that its expression is fine-tuned post-transcriptionally through repression by miR-184.

      MAJOR COMMENTS

      Overall, the manuscript is well organized, and the logic of the experimental plan well presented. The results are clear, and I appreciate the quality of the pupariation curves. However, I believe that two main conclusions of the paper are not fully supported by the results presented in the figures: the direct regulation of dilp8 3'UTR by miR-184, and the specificity of this regulation in imaginal discs. Here I develop these two aspects in more detail.

      Comment 1) The strategy of the 3'UTR sensor is not fully optimized. Indeed, in most experiments, qRT-PCR is used to assess dilp8 expression levels, although it reflects both transcriptional and post-transcriptional regulation. Importantly, to show that post-transcriptional regulation is involved in the response to tissue damage, the levels of the 3'UTR sensor should be analyzed in discs expressing RAcs (showing at the same time that the response is cell-autonomous in the discs). The expected upregulation of the sensor should be prevented by simultaneous expression of miR-184. This approach would shed light on the relative contribution of transcriptional versus post-transcriptional regulation of dilp8 in response to growth perturbation.

      Response: We thank the reviewer for this comment. We agree that qRT-PCR does not distinguish between transcriptional and post-transcriptional changes of dilp8 levels in response to changes in miR-184 levels and tissue damage. In addition to the qRT-PCR data, we have looked at the dilp8-3’UTR-GFP reporter in response to overexpression of miR-184 in the wing disc using the patched-Gal4 driver, which shows downregulation of the GFP reporter in the ptc domain (Fig 4C-D’). This suggests that dilp8 mRNA is a direct target of miR-184 by post-transcriptional regulation through its 3’UTR. Further, to confirm the specificity of the effect of miR-184 on dilp8-3’UTR, we generated a dilp8-3’UTR mutant in which the single target site for miR-184 was mutated. We show that the mutated dilp8-3’UTR reporter doesn’t show any regulation in response to miR-184 overexpression in the ptc domain of the wing disc (Fig. 4E, E’, F, F’). This experiment confirms the specificity of the dilp8-3’UTR regulation by miR-184.

      As suggested by the reviewer, we analysed dilp8-3’UTR-GFP reporter expression by overexpressing RicinA using the ptcGAL4 driver in the wing imaginal disc (Fig. S6F-G’). We observed a slight but consistent increase in dilp8-3’UTR-GFP reporter expression, indicating post-transcriptional regulation of dilp8 expression in response to tissue damage. However, the increase of reporter GFP levels observed in this experiment in response to tissue damage (Fig. S6F-G’) is milder than expected based on the qRT-PCR results (Fig S6A and B). We have added this new data to the manuscript (Fig. S6F-G’).

      We propose the following reasons to explain this result:

      a) both transcriptional and post-transcriptional regulation of dilp8 mRNA in response to developmental perturbations

      b) the data on 3’UTR reporter GFP is specifically from the ptc domain expression of RicinA, whereas for dilp8 transcript levels we have expressed RicinA in all larval imaginal tissues, or in the entire wing imaginal disc, which could be one of the reasons for the stronger effect seen on dilp8 mRNA levels

      c) we are not certain whether the tubulin-promoter-driven dilp8-3’UTR GFP reporter reflects post-transcriptional regulation of dilp8 by miR-184 as efficiently as qRT-PCR does. Because the tubulin promoter drives the reporter-GFP-3’UTR at very high levels, a majority of this reporter-GFP mRNA may not be relieved from degradation by the moderate suppression of miR-184 in response to RicinA overexpression.

      Thus, our experiments suggest that dilp8 levels are regulated post-transcriptionally by miR-184, which contributes to pupariation delays in response to tissue damage. In support of this, we could rescue pupariation delays and dilp8 induction caused by RicinA expression using overexpression of miR-184 (Figs 5B, C). Thus, we confirm that the effect of post-transcriptional regulation by miR-184 during developmental perturbations also contributes to dilp8 induction and pupariation delays. Unfortunately, due to experimental limitations we could not perform simultaneous expression of RicinA and miR-184 to evaluate the rescue of dilp8-3’UTR-GFP sensor expression. The levels of dilp8-3’UTR sensor GFP are reduced efficiently by miR-184 overexpression (Fig 4D), which prevented us from attempting the rescue of the moderate increase of dilp8-3’UTR GFP levels in response to RicinA.

      Comment 2) In my opinion, the use of a 3'UTR sensor is not sufficient to conclude that the regulation by miR-184 is direct, as miR-184 could also regulate an intermediate factor that acts on dilp8 post-transcriptional regulation. To solve this issue, a common strategy is to generate a 3'UTR sensor with mutated binding sites that should abolish the regulation by miR-184. This mutated 3'UTR might also respond differently to tissue damage, which would strongly support the conclusions of the study.

      Response: We couldn’t agree more with the reviewer; this comment is addressed in the response to comment 1. We have confirmed the specificity of regulation of dilp8-3’UTR by miR-184 using a target-site-mutated dilp8-3’UTR (new figures added to the manuscript: Fig. 4E, E’, F, F’). We tested whether the changes in dilp8 mRNA levels in response to tissue damage are post-transcriptional and mediated by miR-184. We observe a slight but consistent increase of dilp8-3’UTR GFP reporter levels in the ptc domain of the wing disc in response to RicinA expression, suggesting a role for miR-184-mediated post-transcriptional regulation of dilp8. However, we have not yet tested the mutated dilp8-3’UTR GFP reporter in response to tissue damage.

      Comment 3) Concerning the tissue-specific regulation of Dilp8 by miR-184, these results need to be strengthened. Indeed, this comes mostly from phenotypes observed with rn-GAL4. Although this is a classical tool for driving expression in imaginal discs, rn-GAL4 also drives strong expression in other tissues that could contribute to triggering a delay, such as the CNS and part of the gut (proventriculus). In our hands, some growth phenotypes in the wing obtained with rn-GAL4 could be fully reverted by blocking GAL4 in the CNS indicating that the phenotype was not wing-specific. Importantly, miR-184 seems to be highly expressed in the CNS according to FlyBase, reinforcing the possibility that it plays a role in this organ. Here I propose approaches to confirm that miR-184 mediated regulation of dilp8 and developmental timing indeed occur in the discs:

      - Another driver with fewer secondary expression sites could be used (pdmR11F02-GAL4), or rn-GAL4 could be combined with an elav-GAL80 to prevent expression in most neurons.
      - The authors could identify the source of Dilp8 upregulation in miR-184 mutants using tissue-specific qRT-PCR instead of whole larvae expression like in Fig 4A-B.
      - This tissue-specific upregulation could be functionally tested using a rescue experiment, in which the delay observed in miR-184 mutants could be rescued by disc-specific downregulation of Dilp8 (using pdm2-GAL4 for instance).

      Response: We are thankful to the reviewer, and agree that it is important to show that the effects we see using rn-GAL4 are specific to the imaginal discs and not due to an effect in the CNS. We tested this by expressing the miR-184 sponge in the CNS. Although miR-184 is highly expressed in the larval CNS, pan-neuronal downregulation of miR-184 using elav-GAL4 had no effect on the timing of pupariation. We have added this as supplementary data (Figure S4). Therefore, we believe that the miR-184 downregulation phenotype in the rn-GAL4 background can be mainly attributed to its role in the imaginal discs. In addition, as suggested by the reviewer, we have demonstrated that downregulation of miR-184 in the imaginal discs using the rn-GAL4 driver leads to an increase in dilp8 expression (Fig S5B). This confirms that dilp8 mRNA levels increase in the imaginal discs when miR-184 is blocked.

      OPTIONAL: Because it is known that dilp8 is strongly regulated at the transcriptional level, the relative input from post-transcriptional upregulation is an important question arising from this study. Although it might be a more long-term approach, I believe that generating a Dilp8 mutant lacking its 3'UTR or, even better, with mutated miR-184 binding sites, would shed light on the role of this regulation for the response to growth perturbation and/or developmental stability (fluctuating asymmetry).

      Response: We thank the reviewer for the suggestion. This would have been an interesting experiment to carry out especially in the context of fluctuating asymmetry.

      MINOR COMMENTS

      1. I think that a number of results could be moved to SI as they are either controls, or reproduce published data without bringing novelty. For instance, results in Fig 5A-D are similar to data published by Sanchez et al, as stated in the text. Fig6A as well.

      Response: We thank the reviewer for this suggestion. Fig. 5A-D and F have been moved to Fig. S6A-E. We have also moved data from Fig. 6 to Fig. 5; as a result, Fig. 6A-D has become Fig. 5B-D.

      Fig 6D is quite mysterious, as it suggests that basal JNK activation regulates miR-184, which is different from a context of tissue damage. I think that this result could be removed. Alternatively, if the authors want to dig in that direction, more experiments should be provided, such as bskDN expression in an RAcs context and the effects on miR-184 levels and the 3'UTR sensor (since transcript levels are already published).

      Response: We would like to clarify that our experiments suggest that endogenous JNK signalling negatively regulates miR-184, as blocking basal JNK signalling using bskDN increased the levels of miR-184 (changed to Fig 5D). Enhanced JNK signalling has been reported to be involved in tissue damage responses, and we propose that the RicinA-mediated increase in JNK signalling leads to the reduction of miR-184 (changed to Fig 5A, S6D-E). However, we are not strongly implying this, as we did not co-express RicinA and bskDN to show that JNK signalling is responsible for the drop in miR-184 levels in response to tissue damage. We thank the reviewer for seeking this explanation; we have rewritten the results section to improve clarity.

      The references related to Dilp8 should be checked in more detail in the intro and discussion. About Dilp8 and developmental stability: remove the ref to Colombani et al 2012, instead put Boone et al 2016 and add Blanco-Obregon et al 2022 (in addition to Garelli et al 2012, who initially identified this phenotype). About Lgr3 as the receptor for Dilp8: add Colombani et al, Current Biology 2015, and cite here Vallejo et al 2015, Garelli et al 2015. Among the important transcriptional regulators of Dilp8, Xrp1 could be mentioned (Boulan et al 2019, Destefanis et al 2022) as it plays a complementary function to JNK depending on the type of tissue stress.

      Response: We apologise for the glaring errors in citing the appropriate references. We thank the reviewer for correcting this for us. We have made the necessary changes to the text.

      Significance

      GENERAL ASSESSMENT This study provides convincing data showing that the conserved microRNA miR-184 plays a role in regulating developmental timing in Drosophila through modulating the levels of Dilp8, a key factor in the coupling between tissue growth and developmental transitions. The results are convincing, but the general conclusions of the paper need to be strengthened regarding the direct regulation of dilp8 by miR-184 and the tissue-specificity of this interaction.

      ADVANCE Dilp8 is a key factor that modulates growth and timing in response to developmental perturbations and contributes to developmental precision in physiological conditions. As such, its regulation has been studied by different groups in the last decade, leading to the identification of several inputs for its transcriptional regulation. Here, the authors uncover a post-transcriptional regulation by miR-184, adding another level of regulation of Dilp8 that contributes to ensuring proper regulation of developmental timing, and opening the possibility that miR-184 might play similar roles in other species.

      AUDIENCE This study is of interest for researchers in the field of basic science, with a focus on developmental timing, tissue damage and biological function of microRNAs.

      REVIEWER EXPERTISE Drosophila, growth control, developmental timing, Dilp8.

      Reviewer #2

      Evidence, reproducibility and clarity

      Drosophila has helped to characterize the mechanisms that coordinate tissue growth with developmental timing. The insulin/relaxin-like peptide Dilp8 has been identified as a key factor that communicates the abnormal growth status of larval imaginal discs to neuroendocrine neurons responsible for regulating the timing of metamorphosis. Dilp8, derived from imaginal discs, targets four Lgr3-positive neurons in the central nervous system, activating cyclic-AMP signaling in an Lgr3-dependent manner. This signaling pathway reduces the production of the molting hormone, ecdysone, delaying the onset of metamorphosis. Simultaneously, the growth rates of healthy imaginal tissues slow down, enabling the development of proportionate individuals.

      In this manuscript, "miR-184 modulates dilp8 to control developmental timing during normal growth conditions and in response to developmental perturbations" by Dr. Varghese and colleagues, the authors identify a new post-transcriptional regulator of Dilp8. The authors show that miR-184 plays a pivotal role in tissue damage responses by inducing dilp8 expression, which in turn delays pupariation to allow sufficient time for damage repair mechanisms to take effect.

      Major points:

      Comment 1) In most of the experiments for percentage of pupariation, the 50% pupariation in control is around 110 hours AED in figures 1, 2 and 3. In figures 5 and 6 using the UAS Ricin, the controls are more around 90 hours AED. Why this discrepancy?

      Response: We thank the reviewer for asking for this clarification. The former experiments (Figs 1-3) were carried out at 25 °C, while the latter experiments with a cold-sensitive version of RicinA (UAS-RAcs), Figs 5 and 6 (now changed to Figs. 5 and S6 as suggested by reviewer #1), were carried out at 29 °C (permissive temperature). This difference in temperature has led to alterations in pupariation timing. We apologise for not having mentioned this in the text; we have now corrected the methods section to indicate this clearly.

      Comment 2) What is the mechanism behind the expression of miR-184 in stress conditions? Is miR-184 also implicated in other conditions giving rise to a developmental delay (X-ray irradiation or animals bearing rasV12, scrib-/- tumors)?

      Response: We thank the reviewer for these questions.

      a) In response to developmental perturbations by RicinA, we believe that activation of JNK signalling controls miR-184 expression. We propose this as our experiments show that imaginal disc damage leads to enhancement of JNK signalling and an increase in dilp8 mRNA levels (as reported earlier by Colombani et al 2012; Sánchez et al 2019), and a simultaneous reduction of miR-184 (Figs. S6A, D, E). We have also performed new experiments to show that, in response to RicinA expression in the wing disc, there is a moderate increase in dilp8-3’UTR-GFP sensor expression (Figs. S6F-G’), indicating post-transcriptional regulation of dilp8 expression in response to tissue stress. We also show that RicinA-induced dilp8 expression and pupariation delay can be rescued by increasing miR-184 levels (Fig 5B and C), suggesting that the reduction of miR-184 in response to tissue damage contributes to the damage responses. In a separate experiment we show that blocking the endogenous JNK pathway by the expression of bskDN enhances miR-184 levels, suggesting that miR-184 is under the regulation of JNK signalling (Fig 5D). Hence, we speculate that during tissue stress, activation of JNK signalling leads to a reduction of miR-184 levels, which contributes to regulating the levels of dilp8 post-transcriptionally and results in pupariation delays. The text has been modified to explain this better.

      b) In a previous paper by Shu et al., 2017 (https://doi.org/10.18632/oncotarget.22226), decreased expression of miR-184 was observed in a lglRNAi; RasV12 tumor background. Apart from this, various studies have shown that dilp8 levels increase in response to tumour, radiation stress, apoptosis, and tissue damage (Yeom et al 2021, Ray et al 2019, Demay et al 2014, Katsuyama et al 2015, Colombani et al 2012, Garelli et al 2012). Whether the regulation of dilp8 by miR-184 occurs in these backgrounds is yet to be tested. We have now discussed this possibility in the manuscript.

      Comment 3) dilp8 mutant animals have also been shown to be more resistant to starvation or desiccation (https://doi.org/10.3389/fendo.2020.00461). Is miR-184 implicated in this response?

      Response: We thank the reviewer for this question. In our earlier experiments, miR-184 was shown to be regulated by nutrition in the larval stages, and lack of miR-184 led to enhanced larval death in response to diet restriction (Fernandes et al., 2022). miR-184 was also shown to play a role in the insulin-producing cells (IPCs) in regulating lifespan (Fernandes & Varghese, 2022). In the current work, we propose that miR-184 acts upstream of dilp8 in response to stress stimuli. Hence, it is possible that miR-184 is involved in responses to starvation and desiccation stress in adult female flies by regulating dilp8 levels post-transcriptionally. However, whether the miR-184 regulation of dilp8 plays a role in resistance to starvation or desiccation in adult females has not yet been tested, as this was not within the scope of the current study. We have now added this reference in the discussion section.

      Comment 4) dilp8 expression has been also shown to be regulated by Xrp1 in response to ribosome stress (https://doi.org/10.1016/j.devcel.2019.03.016). This paper should be included in the manuscript. Is it possible that the expression levels of miR184 are regulated by Xrp1?

      Response: We thank the reviewer for the suggestion and have incorporated the reference into the paper. During ribosome stress in the larval imaginal discs, the stress-response transcription factor Xrp1 acts through dilp8 in regulating systemic growth. We agree with the reviewer that it is possible that expression of miR-184 is regulated by Xrp1. We have not yet explored this possibility. We have now added this to the discussion section.

      Minor points:

      1. Does the overexpression of miR184 induce an increased fluctuating asymmetry?

      Response: We thank the reviewer for asking this question. The role of dilp8 in fluctuating asymmetry is only observed in the dilp8 hypomorphic mutant background. To replicate this, we would have to overexpress miR-184 in either the whole larvae or the wing discs. Unfortunately, overexpression of miR-184 in the wing discs (using rn-GAL4) leads to pupal lethality, whereas overexpression of miR-184 in the whole larvae leads to embryonic lethality. We were therefore not able to conclude from our experiments whether miR-184 overexpression induces increased fluctuating asymmetry.

      2. There are 2 references to Colombani et al. (2012 for Dilp8 and 2015 for Lgr3). Can you double check that they are used accordingly?

      Response: We thank the reviewer for pointing these errors out and we have incorporated these changes into the paper.

      Significance

      Altogether, the paper presents compelling lines of evidence supporting the proposed model. The experiments are well designed and are convincing. The paper is interesting and relevant for a broad audience.

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      This is an interesting study demonstrating an interaction between miR-184 and the Drosophila insulin-like peptide 8 (dilp8) in the tissue damage response. The authors show that Dilp8 activity is negatively regulated by miR-184, apparently through direct interaction between miR-184 and the dilp8-3'UTR, which leads to lower dilp8 mRNA transcript levels, via an undetermined mechanism, supposedly its degradation? Furthermore, the authors show that during aberrant tissue growth, miR-184 levels are very slightly downregulated (see comment below), and based on other experiments, imply causation of this with the increased dilp8 mRNA levels that occur in these tissues, again via an unclear mechanism: upregulation or stabilization of dilp8 mRNA. The authors present evidence that the JNK pathway, which had been known to be critical for dilp8 mRNA upregulation upon tissue damage, does so via miR-184.

      Major Comments:

      Comment 1: The data showing the direct regulation of dilp8-3'UTR by miR-184 are not very strong and would require more controls to strengthen the claim, as described below.

      Response: We have performed new experiments to validate that dilp8-3’UTR is regulated by miR-184. Please see the detailed responses to comments 10-12 below.

      Comment 2: The miR-184 effects are also very small (less than 2-fold reduction with tissue damage; or less than 2-fold induction with JNK-pathway inhibition via bskDN). These two points are the weakest part of the manuscript and model.

      Response: We agree with the reviewers on this point. The reduction in miR-184 levels in response to RicinA expression is modest (25–30%), and the induction of miR-184 in response to bskDN expression is less than two-fold (Figs. 5A and D). In contrast, dilp8 transcript levels increase several-fold in response to RicinA expression (Fig. 5C, S6A and B). Since we measure dilp8 transcript levels by qPCR, we detect both transcriptional and post-transcriptional contributions to dilp8 regulation. In addition, we have performed a new experiment to check the post-transcriptional regulation of dilp8 in response to tissue damage. Though the change in the dilp8-3′UTR GFP reporter upon RicinA expression in the ptc domain of the wing disc is mild (Figs. S6F-G’), this strongly suggests a post-transcriptional outcome of the reduction of miR-184 levels on dilp8. Hence, we propose that tissue damage induces strong transcriptional activation of dilp8, while the reduction of miR-184, despite its smaller magnitude, contributes to dilp8 upregulation via post-transcriptional regulation. In support of this, our experiments demonstrate direct regulation of the dilp8-3′UTR by miR-184 (Figs. 4C-F’), and show strong dilp8 mRNA upregulation in miR-184-deficient conditions (Fig. 4A and B), suggesting a role for miR-184 in maintaining dilp8 levels. We also show that RicinA-induced effects on dilp8 and pupariation delay are reversed by co-expression of miR-184 (Fig. 5C). We do not claim that regulation by miR-184 is the sole mechanism driving dilp8 induction during tissue damage, but suggest that miR-184-mediated post-transcriptional regulation acts in a complementary manner to transcriptional responses. Furthermore, we believe that the mild effect of JNK signaling on miR-184 (as shown by the bskDN experiment) is sufficient for the moderate reduction of miR-184 in response to tissue damage.

      Comment 3: Regarding the expression levels, it does not help that the authors show bar graphs with standard errors of the mean instead of the actual data points to allow reliable appreciation of the data dispersion.

      Response: We have modified our figures and have performed statistical analysis according to the suggestions of the reviewers, please see responses to comments 1-9, and 13-19.
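Editorial illustration of the point raised in Comment 3: the standard error of the mean (SEM) is the standard deviation divided by the square root of the sample size, so a bar with an SEM whisker can hide how widely the individual measurements are scattered. The sketch below uses two invented sets of relative expression values (hypothetical numbers, not data from the manuscript) that share the same mean but differ strongly in dispersion.

```python
import math
import statistics


def summarize(values):
    """Return n, mean, sample SD and SEM for a list of measurements."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1)
    sem = sd / math.sqrt(n)         # SEM shrinks as n grows; SD does not
    return {"n": n, "mean": mean, "sd": sd, "sem": sem}


# Hypothetical relative expression values, invented for illustration only.
tight = [1.9, 2.0, 2.1, 2.0]
spread = [0.5, 3.5, 1.0, 3.0]

tight_summary = summarize(tight)
spread_summary = summarize(spread)

for name, s in (("tight", tight_summary), ("spread", spread_summary)):
    print(f"{name}: mean={s['mean']:.2f} sd={s['sd']:.2f} sem={s['sem']:.2f}")
```

Both groups have the same mean, so only the individual points (or the SD) reveal that the second group is far more variable; this is why plotting the data points alongside any summary bar is usually the safer choice.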

      Comment 4: It is difficult to understand how minute changes in miR-184 levels can lead to over an order of magnitude differences (in some cases) in dilp8 mRNA levels considering that it is a stoichiometric relationship. Maybe ?miR-184-Dicer1? complexes are highly stable and re-used for multiple dilp8 transcripts - the authors could discuss how they understand this occurring in their manuscript.

      Along the same lines, the discussion is also rather weak regarding the mechanism of control of dilp8 mRNA levels by miR-184. Please discuss, eg, the evidence for induction of mRNA degradation by microRNAs with this UTR binding profile (imperfect UTR binding, Fig S4) and, if appropriate, how other possible regulatory models (direct and indirect) could explain the findings.

      Response: We accept the reviewers' comment that a 25-30% reduction of miR-184 is low in comparison to the many-fold increase in dilp8 levels. We believe that both post-transcriptional and transcriptional changes are responsible for the induction of dilp8 in response to tissue damage. However, our experiments suggest a role for post-transcriptional regulation by miR-184, as the pupariation delay is rescued by miR-184 overexpression (please also see the response to the previous comment). We are not ruling out transcriptional regulation of dilp8 mRNA; rather, we suggest that both transcriptional and post-transcriptional mechanisms are responsible for the changes in dilp8. Moreover, we have not performed absolute measurements of miR-184 in the imaginal discs (what we show is a comparison between control and RicinA expression). Hence we do not have an exact estimate of how many miR-184 molecules are lost, or of whether that number is equal to or greater than the number of dilp8 mRNA molecules that are gained, since our dilp8 mRNA measurements are likewise relative rather than absolute. As the reviewer suggests, it is possible that miR-184-RISC is stable enough to act on multiple dilp8 molecules one after the other, in which case the relationship between miR-184 and dilp8 is not 1:1. We have included this in the manuscript. It is also known that imperfect 3’UTR binding, as seen for most animal microRNAs, leads to translational repression and mRNA deadenylation, which eventually results in mRNA degradation.

      Comment 5: We suggest the authors carefully revise their citations to cite appropriate work that supports the claims, and also to avoid missing the seminal studies that report the claims they cite.

      Response: We apologise for the errors in citing the key references. We are grateful to the reviewers for correcting this for us. We have made changes to the text to include and correct the references.

      We have the suggestions below which we hope will help the authors improve their manuscript. If the authors address these points raised above, we believe the manuscript should be a valuable contribution to the field, and help in the understanding of how tissues respond to growth aberrations and the regulation of transcript levels by microRNAs.

      Detailed Comments:

      Comment 1. Results 1st paragraph: please describe the screen in more detail. As written, one only discovers it was a miRNA loss-of-function screen when reading the legend of Table S1. Please show the original data of the screen - with dispersion if possible.

      Response: We thank the reviewers for these suggestions, we have now included the data from the screen with SEM, and p-values.

      Comment 2. Results 1st paragraph, Fourth line, "While several miRNAs caused delays in pupariation by 12 hours or more..". Please correct, as actually loss of miRNAs caused delays.

      Response: We thank the reviewer for pointing out this error, we have corrected the text accordingly.

      Comment 3. Results (Figure 1) - It says that data from three independent experiments are shown. However there is no dispersion in the data. Could the authors please explain this? Are the results of the three experiments summed and presented as one? or is this one of the three?

      Response: We thank the reviewers for these suggestions and have plotted data with the SEM values.

      Comment 4. It is reported in the legend of Figure S2 that LogRank test was performed to determine statistical significance. However, no statistical data is presented. Please show the results.

      Response: We thank the reviewers for these suggestions to improve the data presentation; we have incorporated the p-value as suggested.

      Comment 5. Fig2A and B. Please show the data points in the bar graphs (as in Figure 2C), or choose another data representation. Please consider redoing statistical analysis with a simple t-test. It is not clear to me why ANOVA was used to compare two samples. Please state that data are normalized also to control (tub-GAL4>UAS-scramble). Please state the h post-hatching from which the RNA samples were collected (as in Fig 2C for 20HE quantification).

      Response: We thank the reviewers for these suggestions to improve the data presentation; we have incorporated all changes as suggested. Similar changes have been incorporated into the rest of the figures of the manuscript as well. Hours post-hatching information for each figure is now added to the figure legends.
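Editorial illustration of the statistical point in Comment 5: for exactly two groups, a one-way ANOVA and a pooled-variance (Student's) t-test are the same test, because the ANOVA F statistic equals the square of the t statistic. The t-test is simply the more direct choice and additionally reports the direction of the difference. The sketch below checks this identity on invented expression values (hypothetical numbers, not data from the manuscript); the function names are illustrative and do not come from any statistics package.

```python
import statistics


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples, equal variances assumed."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical normalized expression values, invented for illustration only.
control = [1.00, 0.92, 1.08, 1.01]
mutant = [2.4, 3.1, 2.7, 2.9]

t_stat = pooled_t_statistic(control, mutant)
f_stat = one_way_anova_f(control, mutant)
print(f"t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}, F = {f_stat:.3f}")
```

The identity holds only for the equal-variance t-test; Welch's unequal-variance t-test, which is often preferred for small samples, is not equivalent to a standard one-way ANOVA.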

      Comment 6. Fig2C. Fig legend states the bar graphs are "absolute values". Please specify if the bar represents the average, median or something else.

      Response: We thank the reviewer for pointing this out, we have made the suggested changes.

      Comment 7. Throughout the manuscript: please use GAL4 in capital letters or at least standardize it throughout the ms. Currently there are GAL4s and Gal4s.. eg compare Fig 2 and 3 legends.

      Response: We thank the reviewer for pointing this out, we have incorporated all changes as recommended.

      Comment 8. FigS3A and B. Please revise as Fig2A and B above. and apply the same criteria in the respective figure legend.

      Response: We thank the reviewer for pointing this out; we have made the changes as recommended.

      Comment 9. Fig. 4 - please indicate on the figures what is whole larvae and what is wing imaginal discs. This will facilitate understanding of the figure.

      Response: We thank the reviewers for these suggestions and have included this information in all the figures.

      Comment 10. Fig 4 - Data - Authors do not show that rn-GAL4>miR-184-sponge causes up regulation of dilp8 mRNA levels, hence the model is weakened. Doing this experiment would significantly strengthen the study whatever the result is.

      Response: We thank the reviewer for pointing this out and we have included this in the manuscript (Fig S5B).

      Comment 11. The dilp8-3'UTR experiment is weak especially because its generation is not sufficiently well described in the manuscript. "The dilp8 3'UTR-GFP reporter line was created as described in (Vargheese & Cohen, 2007)" is not sufficient. Please describe the construct generation in sufficient detail so that the experiments can be reproduced by others.

      Response: We thank the reviewer for pointing this out, and we have elaborated in the methods section on how we generated the dilp8 3'UTR-GFP reporter and dilp8 3'UTR mutant GFP reporter lines. The plasmid was originally created in Steve Cohen’s lab at EMBL by modifying the pCasper4 plasmid to introduce a tubulin promoter, EGFP and a multiple cloning site, which allows one to clone 3’UTRs of target genes into this plasmid. NotI and XhoI sites were used to clone the dilp8-3’UTR and mut-3’UTR. We hope this explains our strategy sufficiently.

      Comment 12. Making assumptions, if the construct is as described in Vargheese & Cohen, 2007 and contains all of the dilp8 3'UTR - it should be a Tubulin-driven GFP gene with a dilp8-3'UTR "Tub-GFP-(dilp8 3'UTR)". In this case the authors need to rule out the alternative interpretation of the result in Fig. 4D by showing that the expression of miR-184 does not down regulate Tub-GFP expression itself. The best scenario would be to have a mutated dilp8 3'UTR for the miR-184 recognition site. This experiment would significantly strengthen the study and model.

      Response: We thank the reviewer for pointing this out. We agree with the reviewers that this experiment is needed to prove direct regulation of the dilp8-3’UTR by miR-184. We have mutated the sequences complementary to the seed region of miR-184 in the dilp8-3’UTR, and demonstrated that overexpression of miR-184 does not regulate the mutated tub-GFP-(dilp8 3'UTR) expression. This confirms that the dilp8 gene is a direct target of miR-184. This data is added to the manuscript as Figs 4E-F’.

      Comment 13. Figure 4C-D please separate dilp8 from 3'UTR with a space or hyphen.

      Response: We thank the reviewer for pointing this out and have separated dilp8 from 3’UTR with a hyphen.

      Comment 14. Figure 4E. Please name the dilp8 allele as MI00727 as it is not a KO, but rather a hypomorphic mutation (fully WT dilp8 transcripts are still generated, albeit at a much lower level).

      Response: We thank the reviewer for pointing this out and we have made the necessary changes.

      Comment 15. Figure 6D: please add UAS to bskDN/+. All figures have rn-GAL4 alone or with UAS-GFP as control. This finding would be strengthened with this other control, especially because the size effect is small. This being said, a general comment for all experiments is that hemi-controls are generally missing for all figures. eg, in Fig 3. One would typically include controls such as A. Phm>+ and +>miR.184; B. aug21>+ and +>miR.184; C. ptth>+ and +>miR.184; D. rn>+ and +>miR.184

      Response: We thank the reviewer for pointing this out. We have added UAS to bskDN (now Fig 5D) and have also added the rn-GAL4/+ control. We have also performed various hemi-control experiments as suggested by the reviewer, to the best of our capabilities. We have added a separate graph with the hemi-controls as Reviewer Response Figure 1.

      Comment 16. Figure 7: Are IPCs necessary for the model? If not, I suggest removing them and placing the Lgr3 neuron cell bodies much more anterior in this scheme. Their cell bodies are as anterior and rostral as it gets, approximately where the IPCs are depicted in this type of view of the CNS.

      Response: We thank the reviewer for pointing this out and have removed IPCs from the figure; this figure is now labelled as Fig. 6.

      Comment 17. Table S1 - It would be preferable to see the data of these experiments, but if the authors prefer to show this data in a table, please at least add the dispersion analyses (eg standard deviation, OR median ± quartiles, OR confidence intervals), N of animals analysed, and statistics against controls.

      Response: We thank the reviewer for pointing this out, we have added the number of larvae analysed, SEM values and statistics against the control condition.

      Comment 18. In all figures with pupariation time: please also indicate significant findings in the graphs (with an asterisk, for instance) and adjust figure legends accordingly. This could facilitate understanding the data.

      Response: Thanks for the suggestion. We have incorporated this information into figure legends.

      Comment 19. Please revise Figure legends for punctuation.

      Response: We have rectified all the errors in punctuation. We thank the reviewers for suggesting this.

      Comment 20.

      a) Abstract:

      Line 10: What is the evidence to call Dilp8 a "paracrine" factor?

      Response: We thank the reviewer for pointing this out, we have changed the text to ‘secreted factor’.

      b) Introduction:

      4th paragraph, 3rd sentence "Dilp8... buffers developmental noise and delays pupariation..." Buffering of developmental noise was first shown in Garelli et al., Science 2012, so this publication should be cited. 4th paragraph, 5th sentence: please include Jaszczak et al., Genetics 2016. This paper was published together with the 2015 papers, just a matter of timing that it got a 2016 date. Moreover, I do not think Katsuyama et al., 2015 is well cited to back up the statement in this sentence, hence I recommend removing that citation in this sentence.

      Response: We thank the reviewer for pointing this out and have made necessary changes.

      c) 6th paragraph: 5th line "targeting dilp8" : please specify if you mean the gene or the mRNA, or both. Same for line 7.

      Response: We thank the reviewer for pointing this out and have made necessary changes.

      d) Results Page 10, 1st paragraph, 1st sentence: the works cited are not the appropriate studies that demonstrated what is being stated. This was shown in Garelli et al., Science 2012 and Colombani et al., Science 2012. Results Page 10, 1st paragraph, line 11: Please also cite Colombani et al., Science 2012, who first showed that JNK is required for dilp8 regulation.

      Response: We thank the reviewer for pointing this out and apologise for this oversight. We have made the necessary changes to the manuscript.

      e) Discussion, 2nd paragraph, line 4: again, please indicate the rationale for using "paracrine" to describe Dilp8's activities. The current widely accepted model is that Dilp8 acts on interneurons in the brain (eg, reviewed in Juarez-Carreno et al., Cell Stress, 2018; Gontijo and Garelli, Mech Dev, 2018; Mirth and Shingleton, Front Cell Dev Biol, 2019; Texada et al., Genetics 2020; Boulan and Leopold, 2021). In order to reach the brain, Dilp8 has to be secreted from the discs and travel to the brain. This is as endocrine a mechanism as it gets for a small larva, considering that some discs can be on the opposite side of the larva (eg, genital discs). While this does not exclude that Dilp8 could also act paracrinally, the only evidence that I am aware of comes from other contexts such as during transdetermination, where Dilp8 has been proposed to work in an autocrine or paracrine fashion via Drl in imaginal discs (Nemoto et al., Genes to Cells, 2023); however, this is not cited appropriately in this manuscript and is less related to the Lgr3-dependent pathway being studied here.

      Response: We fully agree with the reviewer and appreciate the clarification. We have made the necessary changes to the text.

      f) Discussion Page 13, 1st paragraph: This claim is supported by data presented in Garelli et al., Science 2012, not the other two papers. Garelli et al., 2015 shows that the Lgr3 receptor also participates in buffering developmental noise. Other studies have corroborated the Garelli et al., 2012 finding (eg, Colombani et al., Curr Biol 2015; Boone et al., Nat Commun 2016; Blanco-Obregon et al., Nat Commun 2022). Many other studies have shown that Dilp8 promotes developmental stability under tissue stress and challenges.

      Discussion Page 12, 3rd paragraph, 2nd sentence: "The Lgr3 neurons directly interact with ... PTTH ...and insulin-producing neurons" Please cite Colombani et al., 2015 and Vallejo et al., Science 2015. Vallejo et al., propose that circuit with insulin-producing neurons. In the 3rd sentence, only Jaszczak et al., 2016 is cited, whereas this claim/model comes from many studies, such as Halme et al., Curr Biol, 2010; Hackney et al., PLoS One 2012; Garelli et al. Science 2012; Colombani et al., Science, 2012; and the Lgr3 papers from 2015). Jaszczak et al., actually propose that Lgr3 is also required in the ring gland in addition to neurons.

      Discussion page 14 last paragraph,10 line, "In Aedes aegypti ....regulates ilp8 (Ling et al., 2017)". As far as I understand mosquitoes do not have a dilp8 orthologue (see for instance Gontijo and Gontijo, Mech Dev 2018; and Jan Veenstra's work). ilp nomenclature (numbering) does not follow that of Drosophila, so ilp8 is probably a typical Insulin/IGF-like peptide and is NOT an orthologue of Dilp8, a relaxin, so this citation needs to be removed or placed into the broader context of microRNA regulation of ilps.

      Response: We are really sorry for the numerous glaring errors in the references. We thank the reviewers for correcting this for us. We have made necessary changes to the text.

      Thank you for the opportunity to review your interesting work,

      Alisson Gontijo and Rebeca Zanini

      Reviewer #3 (Significance (Required)):

      If the authors address these points raised above, we believe the manuscript should be a valuable contribution to the field, and help in the understanding of how tissues respond to growth aberrations and the regulation of transcript levels by microRNAs.

      Authors' concluding response:

      We thank all the reviewers for the overall positive comments and suggestions that we believe have helped us to improve our manuscript. We have incorporated all the changes suggested, especially regarding errors in citing key references. We have performed most of the experimental suggestions. Also, we have modified the way in which graphs are presented, including statistical tests as suggested by the reviewers. Several controls have been performed to strengthen the manuscript further. We believe that this review process aided in significantly improving this manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This is an interesting study demonstrating an interaction between miR-184 and the Drosophila insulin-like peptide 8 (dilp8) in the tissue damage response. The authors show that Dilp8 activity is negatively regulated by miR-184, apparently through direct interaction between miR-184 and the dilp8-3'UTR, which leads to lower dilp8 mRNA transcript levels, via an undetermined mechanism, supposedly its degradation? Furthermore, the authors show that during aberrant tissue growth, miR-184 levels are very slightly downregulated (see comment below), and based on other experiments, imply causation of this with the increased dilp8 mRNA levels that occur in these tissues, again via an unclear mechanism: upregulation or stabilization of dilp8 mRNA. The authors present evidence that the JNK pathway, which had been known to be critical for dilp8 mRNA upregulation upon tissue damage, does so via miR-184. The data showing the direct regulation of dilp8-3'UTR by miR-184 are not very strong and would require more controls to strengthen the claim, as described below. The miR-184 effects are also very small (less than 2-fold reduction with tissue damage; or less than 2-fold induction with JNK-pathway inhibition via bsk-DN). These two points are the weakest part of the manuscript and model. Regarding the expression levels, it does not help that the authors show bar graphs with standard errors of the mean instead of the actual datapoints to allow reliable appreciation of the data dispersion. It is difficult to understand how minute changes in miR-184 levels can lead to over an order of magnitude differences (in some cases) in dilp8 mRNA levels considering that it is a stoichiometric relationship. Maybe ?miR-184-Dicer1? complexes are highly stable and re-used for multiple dilp8 transcripts - the authors could discuss how they understand this occurring in their manuscript. 
On the same line, discussion is also rather weak on what regards the mechanism of control of dilp8 mRNA levels by miR-184. Please discuss eg, the evidence for mRNA degradation induction by microRNAs with this UTR binding profile (imperfect UTR binding Fig S4) and-if appropriate-how other possible regulatory models (direct and indirect) could explain the findings. We suggest the authors carefully revise their citations to cite appropriate work that supports the claims, and also to avoid missing the seminal studies that report the claims they cite. We have the suggestions below which we hope will help the authors improve their manuscript. If the authors address these points raised above, we believe the manuscript should be a valuable contribution to the field, and help in the understanding of how tissues respond to growth aberrations and the regulation of transcript levels by microRNAs.

      Comments:

      Results 1st paragraph: please describe the screen in more detail. As written, one only discovers it was a miRNA loss-of-function screen when reading the legend of Table S1. Please show the original data of the screen - with dispersion if possible.

      Results 1st paragraph, Fourth line, "While several miRNAs caused delays in pupariation by 12 hours or more..". Please correct, as actually loss of miRNAs caused delays.

      Results (Figure 1) - It says that data from three independent experiments are shown. However there is no dispersion in the data. Could the authors please explain this? Are the results of the three experiments summed and presented as one? or is this one of the three?

      It is reported in the legend of Figure S2 that LogRank test was performed to determine statistical significance. However, no statistical data is presented. Please show the results.

      Fig2A and B. Please show the data points in the bar graphs (as in Figure. 2C), or choose another data representation. Please consider redoing statistical analysis with a simple t-test. It is not clear to me why ANOVA was used to compare two samples. Please state that data are normalized also to control (tub-GAL4>UAS-scramble). Please state the h post-hatching from which the RNA samples were collected (as in Fig 2C for 20HE quantification).
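      The point about ANOVA versus a t-test can be illustrated with a minimal sketch. For exactly two groups, the one-way ANOVA F statistic equals the square of the pooled-variance (Student) t statistic, so the two tests give the same p-value and the t-test is simply the more conventional choice. The expression values and the names `control` and `mutant` below are hypothetical and are not taken from the manuscript.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def anova_f(*groups):
    """One-way ANOVA F statistic for any number of groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical normalized expression values (illustration only).
control = [1.00, 1.10, 0.95, 1.05]
mutant = [2.10, 1.80, 2.40, 2.00]

t = student_t(control, mutant)
f = anova_f(control, mutant)
print(f"t = {t:.3f}, t^2 = {t * t:.3f}, F = {f:.3f}")
```

      With more than two groups the equivalence no longer holds, which is the situation ANOVA is designed for.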

      Fig2C. Fig legend states the bar graphs are "absolute values". Please specify if the bar represents the average, median or something else.

      Throughout the manuscript: please use GAL4 in capital letters or at least standardize it throughout the ms. Currently there are GAL4s and Gal4s.. eg compare Fig 2 and 3 legends.

      FigS3A and B. Please revise as Fig2A and B above. and apply the same criteria in the respective figure legend.

      Fig. 4 - please indicate on the figures what is whole larvae and what is wing imaginal discs. This will facilitate understanding of the figure.

      Fig 4 - Data - Authors do not show that rn-GAL4>miR-184-sponge causes up regulation of dilp8 mRNA levels, hence the model is weakened. Doing this experiment would significantly strengthen the study whatever the result is.

      The dilp8-3'UTR experiment is weak especially because its generation is not sufficiently well described in the manuscript. "The dilp8 3'UTR-GFP reporter line was created as described in (Vargheese & Cohen, 2007)" is not sufficient. Please describe the construct generation in sufficient detail so that the experiments can be reproduced by others.

      Making assumptions, if the construct is as described in Vargheese & Cohen, 2007 and contains all of the dilp8 3'UTR - it should be a Tubulin-driven GFP gene with a dilp8-3'UTR "Tub-GFP-(dilp8 3'UTR)". In this case the authors need to rule out the alternative interpretation of the result in Fig. 4D by showing that the expression of miR-184 does not down regulate Tub-GFP expression itself. The best scenario would be to have a mutated dilp8 3'UTR for the miR-184 recognition site. This experiment would significantly strengthen the study and model.

      Figure 4C-D please separate dilp8 from 3'UTR with a space or hyphen.

      Figure 4E. Please name the dilp8 allele as MI00727 as it is not a KO, but rather a hypomorphic mutation (fully WT dilp8 transcripts are still generated, albeit at a much lower level).

      Figure 6D: please add UAS to bskDN/+. All figures have rn-GAL4 alone or with UAS-GFP as control. This finding would be strengthened with this other control, especially because the size effect is small. This being said a general comment for all experiments is that hemi-controls are generally missing for all figures. eg, in Fig 3. One would typically include controls such as A. Phm>+ and +>miR.184; B. aug21>+ and +>miR.184; C. ptth>+ and +>miR.184; D. rn>+ and +>miR.184

      Figure 7: Are IPCs necessary for the model? If not, I suggest removing them and placing the Lgr3 neuron cell bodies much more anterior in this scheme. Their cell bodies are as anterior and rostral as it gets, approximately where the IPCs are depicted in this type of view of the CNS.

      Table S1- It would be preferable to see the data of these experiments, but if the authors prefer to show this data in a table, please at least add the dispersion analyses (eg standard deviation.. OR median+-quartiles OR Confidence intervals..), N of animals analysed, and statistics against controls.
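      As a sketch of the dispersion measures requested for Table S1, the following computes N, mean, standard deviation, median and quartiles for one genotype using only the Python standard library. The pupariation times and the names `summarize` and `pupariation_h` are hypothetical, chosen for illustration rather than taken from the manuscript.

```python
from statistics import mean, median, quantiles, stdev


def summarize(times_h):
    """Return N, mean, SD, median and quartiles for a list of times (hours)."""
    q1, _, q3 = quantiles(times_h, n=4)  # cut points at 25%, 50%, 75%
    return {
        "n": len(times_h),
        "mean": mean(times_h),
        "sd": stdev(times_h),  # sample standard deviation
        "median": median(times_h),
        "q1": q1,
        "q3": q3,
    }


# Hypothetical pupariation times in hours after egg deposition.
pupariation_h = [108, 112, 110, 115, 109, 111, 120, 113]
summary = summarize(pupariation_h)
for key, value in summary.items():
    print(f"{key}: {value}")
```

      Reporting these values per genotype, alongside the same summary for the control, would give readers the dispersion and sample size the comment asks for.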

      In all figures with pupariation time: please also indicate significant findings in the graphs (with an asterisk, for instance) and adjust figure legends accordingly. This could facilitate understanding the data.

      Please revise Figure legends for punctuation.

      Abstract: Line 10: What is the evidence to call Dilp8 a "paracrine" factor?

      Introduction:

      4th paragraph, 3rd sentence " Dilp8... buffers developmental noise and delays pupariation..." Buffering of developmental noise was first shown in Garelli et al., Science 2012, so this publication should be cited.

      4th paragraph, 5th sentence: please include Jaszczak et al., Genetics 2016. This paper was published together with the 2015 papers, just a matter of timing that it got a 2016 date. Moreover, I do not think Katsuyama et al., 2015 is well cited to back up the statement in this sentence, hence I recommend removing that citation in this sentence.

      6th paragraph: 5th line "targeting dilp8" : please specify if you mean the gene or the mRNA, or both. Same for line 7.

      Results Page 10, 1st paragraph, 1st sentence: the works cited are not the appropriate studies that demonstrated what is being stated. This was shown in Garelli et al., Science 2012 and Colombani et al., Science 2012.

      Results Page 10, 1st paragraph, line 11: Please also cite Colombani et al., Science 2012, who first showed that JNK is required for dilp8 regulation.

      Discussion, 2nd paragraph, line 4: again, please indicate the rationale for using "paracrine" to describe Dilp8's activities. The current widely accepted model is that Dilp8 acts on interneurons in the brain (eg, reviewed in Juarez-Carreno et al., Cell Stress, 2018; Gontijo and Garelli, Mech Dev, 2018; Mirth and Shingleton, Front Cell Dev Biol, 2019; Texada et al., Genetics 2020; Boulan and Leopold, 2021). In order to reach the brain, Dilp8 has to be secreted from the discs and travel to the brain. This is as endocrine a mechanism as it gets for a small larva, considering that some discs can be on the opposite side of the larva (eg, genital discs). While this does not exclude that Dilp8 could also act paracrinally, the only evidence that I am aware of comes from other contexts such as during transdetermination (where Dilp8 has been proposed to work in an autocrine or paracrine fashion, via Drl in imaginal discs (Nemoto et al., Genes to Cells, 2023)); however, this is not cited appropriately in this manuscript and is less related to the Lgr3-dependent pathway being studied here.

      Discussion Page 13, 1st paragraph: This claim is supported by data presented in Garelli et al., Science 2012, not the other two papers. Garelli et al., 2015 shows that the Lgr3 receptor also participates in buffering developmental noise. Other studies have corroborated the Garelli et al., 2012 finding (eg, Colombani et al., Curr Biol 2015; Boone et al., Nat Commun 2016; Blanco-Obregon et al., Nat Commun 2022). Many other studies have shown that Dilp8 promotes developmental stability under tissue stress and challenges.

      Discussion Page 12, 3rd paragraph, 2nd sentence: "The Lgr3 neurons directly interact with ... PTTH ...and insulin-producing neurons" Please cite Colombani et al., 2015 and Vallejo et al., Science 2015. Vallejo et al., propose that circuit with insulin-producing neurons. In the 3rd sentence, only Jaszczak et al., 2016 is cited, whereas this claim/model comes from many studies, such as Halme et al., Curr Biol, 2010; Hackney et al., PLoS One 2012; Garelli et al. Science 2012; Colombani et al., Science, 2012; and the Lgr3 papers from 2015). Jaszczak et al., actually propose that Lgr3 is also required in the ring gland in addition to neurons.

      Discussion page 14 last paragraph,10 line, "In Aedes aegypti ....regulates ilp8 (Ling et al., 2017)". As far as I understand mosquitoes do not have a dilp8 orthologue (see for instance Gontijo and Gontijo, Mech Dev 2018; and Jan Veenstra's work). ilp nomenclature (numbering) does not follow that of Drosophila, so ilp8 is probably a typical Insulin/IGF-like peptide and is NOT an orthologue of Dilp8, a relaxin, so this citation needs to be removed or placed into the broader context of microRNA regulation of ilps.

      Thank you for the opportunity to review your interesting work, Alisson Gontijo and Rebeca Zanini

      Significance

      If the authors address these points raised above, we believe the manuscript should be a valuable contribution to the field, and help in the understanding of how tissues respond to growth aberrations and the regulation of transcript levels by microRNAs.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Drosophila has helped to characterize the mechanisms that coordinate tissue growth with developmental timing. The insulin/relaxin-like peptide Dilp8 has been identified as a key factor that communicates the abnormal growth status of larval imaginal discs to neuroendocrine neurons responsible for regulating the timing of metamorphosis. Dilp8, derived from imaginal discs, targets four Lgr3-positive neurons in the central nervous system, activating cyclic-AMP signaling in an Lgr3-dependent manner. This signaling pathway reduces the production of the molting hormone, ecdysone, delaying the onset of metamorphosis. Simultaneously, the growth rates of healthy imaginal tissues slow down, enabling the development of proportionate individuals. In this manuscript "miR-184 modulates dilp8 to control developmental timing during normal growth conditions and in response to developmental perturbations" by Dr. Varghese and colleagues, the authors identify a new post transcriptional regulator of Dilp8. The authors show that miR-184 plays a pivotal role in tissue damage responses by inducing dilp8 expression, which in turn delays pupariation to allow sufficient time for damage repair mechanisms to take effect.

      Major points:

      • In most of the experiments for percentage of pupariation, the 50% pupariation in control is around 110 hours AED in figures 1, 2 and 3. In figures 5 and 6 using the UAS Ricin, the controls are more around 90 hours AED. Why this discrepancy?
      • What is the mechanism behind the expression of miR-184 in stress conditions? Is miR-184 also implicated in other conditions giving rise to a developmental delay (X-ray irradiation or animals bearing rasV12, scrib-/- tumors)?
      • dilp8 mutant animals have also been shown to be more resistant to starvation or desiccation (https://doi.org/10.3389/fendo.2020.00461 ). Is miR-184 implicated in this response?
      • dilp8 expression has also been shown to be regulated by Xrp1 in response to ribosome stress (https://doi.org/10.1016/j.devcel.2019.03.016). This paper should be included in the manuscript. Is it possible that the expression levels of miR-184 are regulated by Xrp1?

      Minor points:

      • Does the overexpression of miR-184 induce an increased fluctuating asymmetry?
      • There are 2 Colombani et al. references (2012 for Dilp8 and 2015 for Lgr3). Can you double-check that they are used accordingly?

      Significance

      Altogether, the paper presents compelling lines of evidence supporting the proposed model. The experiments are well designed and convincing. The paper is interesting and relevant for a broad audience.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      In this study, Fernandes and colleagues addressed the question of the role of micro-RNAs in regulating the coupling between organ growth and developmental timing. Using Drosophila, they identified the conserved micro-RNA miR-184 as a regulator of the developmental transition between juvenile larval stages and metamorphosis. This transition is under the control of the steroid hormone Ecdysone, and has been shown to be modulated in case of abnormal tissue growth to adjust the duration of larval growth in response to developmental perturbations. The relaxin-like hormone Dilp8 has been identified as a key secreted factor involved in this coupling. Here, the authors show that miR-184 is involved in the regulation of Dilp8 expression both in physiological conditions and upon growth perturbation. They propose that this function is carried out in imaginal tissues, where miR-184 levels are modulated by tissue stress. While several factors have already been implicated in triggering sharp dilp8 induction at the transcriptional level, this study adds another level of complexity to the regulation of Dilp8 by proposing that its expression is fine-tuned post-transcriptionally through repression by miR-184.

      Major Comments

      Overall, the manuscript is well organized, and the logic of the experimental plan is well presented. The results are clear, and I appreciate the quality of the pupariation curves. However, I believe that two main conclusions of the paper are not fully supported by the results presented in the figures: the direct regulation of dilp8 3'UTR by miR-184, and the specificity of this regulation in imaginal discs. Here I develop these two aspects in more detail.

      1. The strategy of the 3'UTR sensor is not fully optimized. Indeed, in most experiments, qRT-PCR is used to assess dilp8 expression levels, although it reflects both transcriptional and post-transcriptional regulation. Importantly, to show that post-transcriptional regulation is involved in the response to tissue damage, the levels of the 3'UTR sensor should be analyzed in discs expressing RAcs (showing at the same time that the response is cell-autonomous in the discs). The expected upregulation of the sensor should be prevented by simultaneous expression of miR-184. This approach would shed light on the relative contribution of transcriptional versus post-transcriptional regulation of dilp8 in response to growth perturbation.

      2. In my opinion, the use of a 3'UTR sensor is not sufficient to conclude that the regulation by miR-184 is direct, as miR-184 could also regulate an intermediate factor that acts on dilp8 post-transcriptional regulation. To solve this issue, a common strategy is to generate a 3'UTR sensor with mutated binding sites that should abolish the regulation by miR-184. This mutated 3'UTR might also respond differently to tissue damage, which would strongly support the conclusions of the study.

      3. Concerning the tissue-specific regulation of Dilp8 by miR-184, these results need to be strengthened. Indeed, this comes mostly from phenotypes observed with rn-GAL4. Although this is a classical tool for driving expression in imaginal discs, rn-GAL4 also drives strong expression in other tissues that could contribute to triggering a delay, such as the CNS and part of the gut (proventriculus). In our hands, some growth phenotypes in the wing obtained with rn-GAL4 could be fully reverted by blocking GAL4 in the CNS, indicating that the phenotype was not wing-specific. Importantly, miR-184 seems to be highly expressed in the CNS according to FlyBase, reinforcing the possibility that it plays a role in this organ. Here I propose approaches to confirm that miR-184-mediated regulation of dilp8 and developmental timing indeed occurs in the discs:
         - Another driver with fewer secondary expression sites could be used (pdmR11F02-GAL4), or rn-GAL4 could be combined with an elav-GAL80 to prevent expression in most neurons.
         - The authors could identify the source of Dilp8 upregulation in miR-184 mutants using tissue-specific qRT-PCR instead of whole-larva expression as in Fig 4A-B.
         - This tissue-specific upregulation could be functionally tested using a rescue experiment, in which the delay observed in miR-184 mutants could be rescued by disc-specific downregulation of Dilp8 (using pdm2-GAL4 for instance).

      Optional: Because it is known that dilp8 is strongly regulated at the transcriptional level, the relative input from post-transcriptional upregulation is an important question arising from this study. Although it might be a more long-term approach, I believe that generating a Dilp8 mutant lacking its 3'UTR or, even better, with mutated miR-184 binding sites, would shed light on the role of this regulation for the response to growth perturbation and/or developmental stability (fluctuating asymmetry).

      Minor Comments

      • I think that a number of results could be moved to SI as they are either controls, or reproduce published data without bringing novelty. For instance, results in Fig 5A-D are similar to data published by Sanchez et al, as stated in the text. Fig6A as well.
      • Fig 6D is quite mysterious, as it suggests that basal JNK activation regulates miR-184, which is different from a context of tissue damage. I think that this result could be removed. Alternatively, if the authors want to dig in that direction, more experiments should be provided, such as bskDN expression in an RAcs context and the effects on miR-184 levels and the 3'UTR sensor (since transcript levels are already published).
      • The references related to Dilp8 should be checked in more detail in the intro and discussion. About Dilp8 and developmental stability: remove the ref to Colombani et al 2012, instead put Boone et al 2016 and add Blanco-Obregon et al 2022 (in addition to Garelli et al 2012, who initially identified this phenotype). About Lgr3 as the receptor for Dilp8: add Colombani et al, Current Biology 2015, and cite here Vallejo et al 2015, Garelli et al 2015. Among the important transcriptional regulators of Dilp8, Xrp1 could be mentioned (Boulan et al 2019, Destefanis et al 2022) as it plays a complementary function to JNK depending on the type of tissue stress.

      Significance

      General Assessment

      This study provides convincing data showing that the conserved microRNA miR-184 plays a role in regulating developmental timing in Drosophila through modulating the levels of Dilp8, a key factor in the coupling between tissue growth and developmental transitions. The results are convincing, but the general conclusions of the paper need to be strengthened regarding the direct regulation of dilp8 by miR-184 and the tissue-specificity of this interaction.

      Advance

      Dilp8 is a key factor that modulates growth and timing in response to developmental perturbations and contributes to developmental precision in physiological conditions. As such, its regulation has been studied by different groups in the last decade, leading to the identification of several inputs for its transcriptional regulation. Here, the authors uncover a post-transcriptional regulation by miR-184, adding another level of regulation of Dilp8 that contributes to ensuring proper regulation of developmental timing, and opening the possibility that miR-184 might play similar roles in other species.

      Audience

      This study is of interest for researchers in the field of basic science, with a focus on developmental timing, tissue damage and biological function of microRNAs.

      Reviewer expertise

      Drosophila, growth control, developmental timing, Dilp8.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the reviewer for their positive comments regarding the research article titled "The Ketogenic Diet Metabolite 1 β-Hydroxybutyrate Promotes Mitochondrial Elongation via Deacetylation and Improves Autism-like Behaviour in Zebrafish" by Uddin GM and colleagues. We appreciate your input, and we will address these comments as indicated below with specific responses to each point raised by reviewers.

      The main changes in the updated manuscript are as follows:

      We have revised the introduction to now incorporate additional background information on mitochondria, NAD, and mitochondrial dynamics and function. This addition aims to provide readers with a broader understanding of the mitochondrial context in relation to our study.

      Furthermore, we recognize that previous studies have explored mitochondrial function in the context of the ketogenic diet. While our specific investigation centered on mitochondrial morphology, we acknowledge the importance of comprehensively investigating mitochondrial function. To this end, we have added new data showing how BHB impacts mitochondrial oxidative phosphorylation in HeLa cells (Sup Fig 2), and how both BHB and NMN impact oxygen consumption/glycolysis in zebrafish (Fig 7).

      We have also added new behaviour analysis of the zebrafish (Fig 6), and have re-framed the discussion around neurodevelopment generally, rather than ASD specifically.

      Finally, we have now included a section in our manuscript that discusses the limitations of our study. These limitations can be further investigated to explore and characterize the full mechanistic potential behind the effects of the ketogenic diet and/or NMN on mitochondrial dynamics.

      2. Point-by-point description of the revisions

      This section is mandatory. *Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript.*

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Uddin GM and colleagues presented a research article entitled 'The Ketogenic Diet Metabolite 1 β-Hydroxybutyrate Promotes Mitochondrial Elongation via Deacetylation and Improves Autism-like Behaviour in Zebrafish'. The roles of the ketogenic diet (KD) and NAD+ precursors in health promotion and longevity, as well as in the alleviation of a broad range of diseases, are evident. However, their roles in autism are not well studied, which is the novelty of the current study. Addressing the questions below will improve the quality of the paper.

      Major concerns 1. In the introduction section, a broad overview of the roles of ketogenic diet (KD) in neurodegenerative disease (and ageing, if possible) should be provided. E.g., the authors should summarize exciting progress on the use of KD to treat Alzheimer's disease in animal models (PMID: 23276384). *

      Response: Thank you for your valuable suggestion. While it is true that the KD appears to be beneficial in neurodegenerative (and other disease) models, our focus in this paper is looking at neurodevelopment, rather than all potential benefits of the KD. Nonetheless, we have addressed this comment by incorporating a brief overview of the roles of the KD in neurodegenerative diseases, including Alzheimer's disease (AD), in the introduction section of the manuscript. Specifically, we have summarized the exciting progress made in utilizing KD to treat AD in animal models, as highlighted in the suggested study. This addition helps to provide a better overview of the potential therapeutic effects of KD in neurodegenerative diseases and strengthens the introduction section of the manuscript.

      • Roles of high fat diet to treat diseases could be extended to rare premature ageing diseases. In such scenario, high fat and NAD+ boosting shared some joint mechanisms (PMID: 25440059 ). *

      Response: This information and the reference are now added to the discussion.

      *In the introduction, a more detailed introduction of NAD+ and its roles in mitochondrial homeostasis (especially mitophagy and the mitochondrial fusion-fission balance) should be included (PMID: 24813611; PMID: 30742114; PMID: 31577933). *

      Response: Although our paper focused primarily on mitochondrial fission and fusion, we have incorporated a new paragraph in the introduction to provide a more detailed introduction detailing NAD+ and its roles in mitochondrial homeostasis, specifically highlighting mitophagy. We have included the suggested references.

      • Regarding the statement that KD increases NAD+, was this due to increased generation (to check protein levels and activities of different NAD+ synthetic enzymes, such as iNAMPT, NMNAT1-3, and NRK) and/or reduced consumption (in addition to reduced glycolysis, does KD inhibit the activities of CD38 and PARPs? In this paper, Sirtuins' activities are increased)? Detailed exploration of the activities of these proteins will unveil a clear molecular mechanism for how KD affects/regulates NAD+. *

      Response: Thank you for the comment. We agree that exploring the detailed mechanism of how the ketogenic diet (KD) affects NAD+ is an interesting question that will have important implications once answered. However, fully elucidating the mechanism of action would require a more comprehensive investigation, which is beyond the scope of this current project. We have now added this as a future direction in the manuscript.

      *Fig. 1: in the NAD+ field, the NR/NMN concentrations normally used are high, e.g., 500 µM to 2-5 mM (as the NAD+ levels in cells are high). In addition to using 50 µM, the authors are strongly encouraged to perform a dose-dependent study (50 µM, 500 µM, 1, 2, 5 mM) and assess changes in mitochondrial function and parameters. Under these conditions, NAD+ levels should also be checked. *

      Response: We have added new supplemental data showing the initial dose response of the effects of BHB and NMN on mitochondrial morphology, which led us to choosing the relevant doses for the remainder of the paper. Our objective was not to investigate the broad impacts of different NMN concentrations on mitochondrial function and parameters, or NAD+ levels. As such, we have only focused on doses where we see effects on mitochondrial morphology.

      *Fig. 2: a comprehensive characterization of mitochondrial fusion-fission should be performed. In addition to the protein evaluated, changes on other key fusion-fission proteins, like Bax, Bak, Mfn-1, Mfn-2, etc should be performed (PMID: 17035996; PMID: 24813611). *

      Response: We agree that looking at other key proteins involved in mediating mitochondrial fission and fusion could provide additional insight. Indeed, given the changes in global acetylation that we see, it is expected that some other proteins may also be regulated in this way. However, there are at least a dozen proteins involved in mediating mitochondrial fusion and fission, not to mention many more proteins that regulate these proteins. Unfortunately, it is not feasible to analyze all the proteins involved in mitochondrial fusion-fission. Moreover, looking only at protein levels doesn't necessarily inform about the activity of any protein. Instead, in this paper we concentrated on investigating known links between protein acetylation and mitochondrial dynamics, particularly focusing on the proteins that have known links to acetylation (i.e., DRP1, OPA1, MFNs). We have added a note in the discussion acknowledging that other means of regulation could also be occurring in parallel.

      *Figs. 1-5 focused on mitochondrial morphology; whether KD and NMN changed mitochondrial function should also be explored, for example by using Seahorse to measure ECAR and OCR. *

      Response: Although our question was focused on morphology, we agree that mitochondrial function is important. We have added new data showing that BHB increases basal oxygen consumption in HeLa cells (Sup Fig 2), as well as new data showing that BHB and NMN influence oxygen consumption and glycolysis in our zebrafish model (Fig 7).

      • Fig. 6: NR/NMN used in animal studies (via gavage or in drinking water in mice, and on plate for worms and flies) are normally high (e.g., in drinking water for mice could be 4-12 mM; for worms and flies are normally 1-5 mM); for zebrafish, while they are swimming in water, this reviewer concerned whether it was true that 50 µM of NMN was sufficient to show the benefit presented.*

      Response: Our data show that these doses are indeed sufficient. We did look at some higher doses for NMN, but these were toxic, leading to poor survival and were not studied further.

      *Minor concerns 1. Line 26: For 'a growing list of neurological disorders, including autism spectrum disorder (ASD)', please add AD in. *

      Response: Line 26 is part of the abstract, which we feel should be focused more on the main message of the paper, which does not involve AD. As addressed above, we have added AD as an example in the introduction.

      *Line 57: For 'with side effects such as gastrointestinal disturbances, nausea/vomiting, diarrhea, constipation, and hypertriglyceridemia being reported', rate of frequency shall be provided if any. *

      Response: We have modified the statement to indicate the relative percent of patients suffering the various side effects.

      *Reviewer #1 (Significance (Required)):

      The novelty of the current study was to investigate effects of KD and NAD+ on autism. This investigation was not performed before and thus is the novelty.

      Weakness, effects of KD and NAD+/NMN on mitochondrial function were not well-investigated and should be done. Introduction was not well done, many key information in the fields were not provided which may mislead the readers an over-evaluation of the novelty of the current study.*

      Response: As outlined above, we have edited the introduction to include additional information requested by the reviewer. Moreover, our focus in this manuscript was to look at the mechanisms underlying changes in mitochondrial morphology, not mitochondrial function per se, though this is clearly important and related. Nonetheless, as discussed above, we have also added new data showing how BHB impacts mitochondrial function.

      *My expertise lies in NAD+, mitochondria, and brain health.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The study examined the effect of beta-hydroxybutyrate and nicotinamide nucleotide on mitochondrial morphology and the molecular pathways which mitigate this effect as well as the effect of these treatments on behavior in zebrafish. The study is well done and well written. The only thing I think that could be improved are the bar in the graph some the significant comparisons. It is sometimes difficult to see which groups are being compared.*

      Response: We're happy to adjust how the data are displayed in the relevant bar graphs, but it is not clear exactly what changes the reviewer would like. To some degree this will depend on the specific guidelines of the final journal where we hope the manuscript will be published. As such, we have not made changes at this point.

      ***Referees cross-commenting**

      The other reviewers do have some fair comments. Multiple doses would be helpful and showing bioenergetic data would complement the morphological measurements. Additionally, behavioral assays showing changes in social behavior in the Zebrafish would provide a stronger link to ASD. *

      Response: As discussed above, we have added new information on doses and mitochondrial bioenergetics. With respect to behaviour, we have added thigmotaxis data and reworked the discussion around behaviour and neurodevelopment so that it is less specific to ASD.

      *Reviewer #2 (Significance (Required)):

      As beta-hydroxybutyrate is an important substrate for the ketogenic diet, this study helps explain the potential mechanisms in which the ketogenic diet may enhance mitochondrial function.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this paper, Uddin and colleagues have investigated components of the ketogenic diet to understand changes in both mitochondrial morphology and protein expression, and zebrafish locomotor behaviour. They investigate whether beta-hydroxybutyrate (BHB) or nicotinamide mononucleotide (NMN) application can alter human mitochondria in HeLa cell lines, and also rescue a locomotion defect in shank3b+/- zebrafish larvae that have previously been proposed as a model for autism. This study is strengthened by showing data from two species; however, the link between the HeLa cell line data and larval zebrafish is not strong. The study would be improved by assessing zebrafish mitochondrial changes after drug application, and testing more than one concentration of BHB and NMN in the behavioural assay. This is an interesting study, and it is nicely written and presented. I have made some comments to strengthen the study below.

      Major comments My expertise is in modelling some aspects of autism in zebrafish. To this end I have focussed on the zebrafish part of this manuscript more fully. I have several comments related to the zebrafish experiments. 1. The changes in mitochondrial morphology, peroxisome number and mitochondrial protein levels were measured in HeLA cells and not comparable data is shown for zebrafish. The same experiments should be repeated using larval zebrafish or a zebrafish cell line. *

      Response: We chose to use HeLa cells for the mechanistic studies due to practical reasons. Cell lines offer a controlled and well-established system for investigating cellular processes and molecular mechanisms. Measuring these parameters in tissues is significantly more challenging and requires different reagents (e.g., antibodies) and methodology (electron microscopy) that are not feasible in the current study.

      On the other hand, zebrafish larvae were employed for the behavior studies, which cannot be conducted using cell lines. By utilizing zebrafish, we were able to examine the effects of beta-hydroxybutyrate (BHB) and nicotinamide mononucleotide (NMN) on locomotor behavior, providing valuable insights into potential therapeutic implications for autism.

      While we acknowledge the limitations of not directly measuring mitochondrial morphology, peroxisome number, and mitochondrial protein levels in zebrafish, we believe that our study provides significant contributions to understanding the effects of BHB and NMN in zebrafish behavior. Future studies could certainly consider incorporating zebrafish-specific experiments to complement the findings in HeLa cells.

      • How did you choose the concentration of BHB and NMN to use in behavioural experiments? And the timing of application - I don't really understand why you waited 3 days after drug application to measure locomotion. *

      Response: These doses were chosen initially because they were similar to the doses that induced mitochondrial elongation in HeLa cells and were tolerated by the fish larvae. As we saw promising effects at these initial doses, we decided to explore them in more detail. While we agree that it would be worth comparing the effects of additional doses, as well as looking at their effects at other timepoints, such work would be a major endeavour and is beyond the scope of our initial investigations, which we feel are worth reporting in their current state.

      With respect to the treatment paradigm, fish larvae were treated 10-48 hours post fertilization, as this is a critical neurogenic developmental timepoint that is often used for exposure studies. Fish do not fully hatch until 3-4 days post fertilization, and display only minimal movement before 5 days, which is why we waited until 5 days to look at movement.

      • Do the shank3b+/- larvae show any morphological deficits? Their decrease in locomotion is striking. Is the morphology also rescued by drug application? Can you tie this to the mitochondrial changes that you observed in HeLA cells?*

      Response: We do not observe any gross changes in fish morphology that might explain a decrease in locomotion. Unfortunately, it is not feasible to look at mitochondrial morphology in the fish at this time. However, based on previous published work showing that the ketogenic diet promotes mitochondrial elongation in mouse brains (PMID:32380723), we would expect mitochondrial morphology also to be changed in the fish. Nonetheless, as we have not examined this directly in fish, we are not making this specific claim in this manuscript.

      • In figure 6A you use time spent swimming as a readout of distance. This doesn't really make sense, because without also showing speed of swimming it is not possible to know whether time and distance correlate in the same way across genotypes. This figure could be improved by showing more detail - speed of swimming, time spent immobile etc. This can easily be extracted from the films that you have already made using the ViewPoint software. *

      Response: As requested, we have reanalyzed the zebrafish movement data for a more refined analysis. In the revised version (Fig 6), we include analysis of both speed and distance travelled within a defined time. Importantly, these findings still support differences between WT and shank3b+/- fish that are restored by BHB and NMN to varying degrees.
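
      The distinction the reviewer draws between time spent swimming, distance, and speed can be illustrated with a minimal sketch. It assumes a track is a list of (x, y) positions in millimetres sampled at a fixed interval; the function name, the movement threshold, and the two example tracks are hypothetical and are not taken from the ViewPoint output or from the manuscript's data.

```python
import math

def locomotion_metrics(track, dt, move_threshold):
    """Summarise one larva's track.

    track: list of (x, y) positions in mm, sampled every dt seconds.
    move_threshold: speed in mm/s below which a frame counts as immobile
    (an illustrative cut-off, not a value from the study).
    """
    # Distance covered between consecutive frames
    steps = [math.dist(a, b) for a, b in zip(track, track[1:])]
    distance = sum(steps)
    duration = dt * len(steps)
    # Time during which the per-frame speed reaches the threshold
    moving_time = sum(dt for s in steps if s / dt >= move_threshold)
    return {
        "distance_mm": distance,
        "mean_speed_mm_s": distance / duration if duration else 0.0,
        "time_moving_s": moving_time,
        "time_immobile_s": duration - moving_time,
    }

# Two hypothetical larvae that move for the same time but cover different distances
slow = [(0, 0), (1, 0), (2, 0), (2, 0), (2, 0)]
fast = [(0, 0), (3, 0), (6, 0), (6, 0), (6, 0)]
for name, track in (("slow", slow), ("fast", fast)):
    print(name, locomotion_metrics(track, dt=1.0, move_threshold=0.5))
```

      In this toy case both tracks give the same time spent moving, while distance and mean speed differ threefold, which is why reporting time alone cannot stand in for distance.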

      • Showing a change in locomotion is not enough to claim that a model is autism-like. At a minimum I think that you need to show changes in social behaviour - likely using older fish (more than three weeks) that interact with each other. Changes in locomotion can be caused by so many factors, many of which are not indicative of autism. It is important that as a field we do not simply claim that locomotion can be used as a proxy for more complex disease phenotypes. This recent review may help you with this point:* https://www.frontiersin.org/articles/10.3389/fnmol.2020.575575/full.

      Response: The reviewer makes an important point that the movement behaviour phenotypes that we see do not necessarily represent classic ASD phenotypes (i.e., repetitive behaviour, reduced sociability, and reduced communication). To begin to address this issue, we analyzed thigmotaxis, which can be a measure of anxiety. Notably, we also see differences that are reversed by BHB and NMN. However, we cannot model all ASD behaviours in a fish model, and we are not set up to look at social behaviour, especially in the young fish that we were studying. As such, even though Shank3 is a recognized ASD gene, and the shank3b+/- model we are studying is a validated ASD model (PMID: 29619162), we have re-phrased the manuscript in the context of neurodevelopment generally, rather than with respect to ASD specifically. As such, we ascribe the movement and thigmotaxis phenotypes as neurodevelopmental phenotypes that are improved by BHB and NMN.
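
      Thigmotaxis is commonly scored as the share of time an animal spends near the wall of its arena. The sketch below shows one way such an index could be computed from tracking coordinates. The circular well, the inner-zone cut-off at half the well radius, and the example coordinates are all illustrative assumptions rather than the parameters used in this study.

```python
import math

def thigmotaxis_fraction(track, centre, well_radius, inner_fraction=0.5):
    """Fraction of tracked frames spent in the outer zone of a circular well.

    track: list of (x, y) positions; centre: (x, y) of the well centre.
    inner_fraction: radius of the inner zone relative to the well radius
    (hypothetical default; zone definitions vary between laboratories).
    """
    if not track:
        return 0.0
    inner_radius = inner_fraction * well_radius
    outer_frames = sum(1 for p in track if math.dist(p, centre) > inner_radius)
    return outer_frames / len(track)

# Hypothetical track in a well of radius 5 mm centred on the origin
example_track = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (4.5, 0.0)]
print(thigmotaxis_fraction(example_track, centre=(0.0, 0.0), well_radius=5.0))
```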

      *For the statistics, as far as I can tell, all of the data should be analysed by ANOVA or the non-parametric equivalent followed by a post-hoc test. Please check this and add information about normality in. *

      Response: As requested, we have clarified our statistical methodology throughout the manuscript.

      For the mechanistic data, we used t-tests for direct comparisons between two groups (e.g., vehicle vs. treatment). While multiple conditions such as vehicle, NMN, BHB, or etomoxir were tested, statistical comparisons were conducted only between the vehicle and each treatment group individually. As we are not also making comparisons between treatments, this is not a multiple comparison, and ANOVA is not applicable in this context. We have clarified this rationale in the manuscript to avoid any confusion.

      For the zebrafish study, where multiple factors were involved (e.g., treatments across different time points or conditions), we performed a two-way ANOVA followed by Tukey's post-hoc test to identify specific group differences. This approach was appropriate for analyzing these datasets and supports robust conclusions.

      With respect to normality testing, all datasets were assessed for normality using the Shapiro-Wilk test, and no violations of normality were observed. The updated text now includes these details.
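
      As background to the exchange above on per-treatment t-tests versus ANOVA with a post-hoc test, the following sketch shows how the chance of at least one false positive changes with the number of tests run at a fixed threshold, and what a Bonferroni-adjusted threshold would be. It assumes independent tests, which is a simplification when every comparison shares the same vehicle group; the function names are hypothetical and the numbers are not derived from the manuscript's data.

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests

# Compare one, three, and five tests at a nominal alpha of 0.05
for k in (1, 3, 5):
    print(k, round(family_wise_error_rate(0.05, k), 3), bonferroni_threshold(0.05, k))
```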

      *Minor comments

      1. Make sure that you refer to the fish line as shank3b+/- throughout - see abstract.*

      This has been corrected.

      • Please add a space between all numbers and units (e.g. 5 Mm). *

      This has been corrected.

      • There is a spelling error on line 340 page 16: finings instead of findings. *

      This has been corrected.

      • In figure 1, if each dot represents a different sample, then there appear to be many fewer samples analysed in 1D compared to 1B. Can you comment upon this please*

      Response: A total of 80-150 cells were counted per condition, and the analyses were performed on 3 independent replicates with 2 independent technical replicates for each treatment condition. The quantification of mean mitochondrial branch length in Figure 1B was measured using ImageJ and the MiNA plugin. The measurements were taken from three independent replicates using a standard region of interest (ROI) and randomly selected areas from each image.

      In Figure 1D, NAD+ levels were measured 24 hours after treatment of vehicle, βHB, NMN, or Eto+βHB in HeLa cells (n=3-6/group). Each sample lysate represents an independent experimental dish from which coverslips were collected for image analysis.

      The difference in sample numbers between Figure 1B and 1D arises because image analysis involves individual cells fixed and stained on coverslips, whereas the NAD assay requires the whole lysate from the entire cell culture dish. Therefore, the higher cell count in Figure 1B represents the number of cells analyzed on coverslips, while Figure 1D represents NAD levels from the lysate normalized to the protein concentration.

      *Reviewer #3 (Significance (Required)):

      I think that this will be interesting to autism researchers and it could lead to more investigation of the ketogenic diet. Some more work is needed, likely in other model organisms, before this research can be translated to human patients. *

      Response: We agree that the findings of our study could be of interest to autism researchers and have implications for further investigation of the ketogenic diet (KD). It is important to note that further work, including studies in other model organisms, would be beneficial before translating this research to human patients.

      Our study aimed to provide mechanistic insights into the effects of the KD on mitochondrial morphology and behavior. We recognize that the translation of research findings to human patients requires rigorous investigation, including preclinical and clinical studies. Our study contributes to the understanding of the underlying mechanisms involved in the KD's effects, laying the groundwork for future research and potential therapeutic avenues.

      We appreciate your perspective and emphasize that our intention is to provide valuable insights into the mechanisms underlying the KD's effects rather than suggesting immediate translation to human patients. Further investigation and validation in diverse models and clinical settings will be necessary before considering clinical applications.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      In this paper, Uddin and colleagues have investigated components of the ketogenic diet to understand changes in both mitochondrial morphology and protein expression, and zebrafish locomotor behaviour. They investigate whether beta-hydroxybutyrate (BHB) or nicotinamide mononucleotide (NMN) application can alter human mitochondria in HeLa cell lines, and also rescue a locomotion defect in shank3b+/- zebrafish larvae that have previously been proposed as a model for autism. This study is strengthened by showing data from two species; however, the link between the HeLa cell line data and larval zebrafish is not strong. The study would be improved by assessing zebrafish mitochondrial changes after drug application, and testing more than one concentration of BHB and NMN in the behavioural assay.

      This is an interesting study, and it is nicely written and presented. I have made some comments to strengthen the study below.

      Major comments

      My expertise is in modelling some aspects of autism in zebrafish. To this end I have focussed on the zebrafish part of this manuscript more fully. I have several comments related to the zebrafish experiments.

      1. The changes in mitochondrial morphology, peroxisome number and mitochondrial protein levels were measured in HeLA cells and not comparable data is shown for zebrafish. The same experiments should be repeated using larval zebrafish or a zebrafish cell line.
      2. How did you choose the concentration of BHB and NMN to use in behavioural experiments? And the timing of application - I don't really understand why you waited 3 days after drug application to measure locomotion.
      3. Do the shank3b+/- larvae show any morphological deficits? Their decrease in locomotion is striking. Is the morphology also rescued by drug application? Can you tie this to the mitochondrial changes that you observed in HeLA cells?
      4. In figure 6A you use time spent swimming as a readout of distance. This doesn't really make sense, because without also showing speed of swimming it is not possible to know whether time and distance correlate in the same way across genotypes. This figure could be improved by showing more detail - speed of swimming, time spent immobile etc. This can easily be extracted from the films that you have already made using the ViewPoint software.
      5. Showing a change in locomotion is not enough to claim that a model is autism-like. At a minimum I think that you need to show changes in social behaviour - likely using older fish (more than three weeks) that interact with each other. Changes in locomotion can be caused by so many factors, many of which are not indicative of autism. It is important that as a field we do not simply claim that locomotion can be used as a proxy for more complex disease phenotypes. This recent review may help you with this point: https://www.frontiersin.org/articles/10.3389/fnmol.2020.575575/full.
      6. For the statistics, as far as I can tell, all of the data should be analysed by ANOVA or the non-parametric equivalent followed by a post-hoc test. Please check this and add information about normality in.

      Minor comments

      1. Make sure that you refer to the fish line as shank3b+/- throughout - see abstract.
      2. Please add a space between all numbers and units (e.g. 5 Mm).
      3. There is a spelling error on line 340 page 16: finings instead of findings.
      4. In figure 1, if each dot represents a different sample, then there appear to be many fewer samples analysed in 1D compared to 1B. Can you comment upon this please?

      Significance

      I think that this will be interesting to autism researchers and it could lead to more investigation of the ketogenic diet. Some more work is needed, likely in other model organisms, before this research can be translated to human patients.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      The study examined the effect of beta-hydroxybutyrate and nicotinamide nucleotide on mitochondrial morphology and the molecular pathways which mitigate this effect as well as the effect of these treatments on behavior in zebrafish. The study is well done and well written. The only thing I think that could be improved are the bar in the graph some the significant comparisons. It is sometimes difficult to see which groups are being compared.

      Referees cross-commenting

      The other reviewers do have some fair comments. Multiple doses would be helpful and showing bioenergetic data would complement the morphological measurements. Additionally, behavioral assays showing changes in social behavior in the Zebrafish would provide a stronger link to ASD.

      Significance

      As beta-hydroxybutyrate is an important substrate for the ketogenic diet, this study helps explain the potential mechanisms in which the ketogenic diet may enhance mitochondrial function.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Uddin GM and colleagues presented a research article entitled 'The Ketogenic Diet Metabolite β-Hydroxybutyrate Promotes Mitochondrial Elongation via Deacetylation and Improves Autism-like Behaviour in Zebrafish'. Roles of ketogenic diet (KD) and NAD+ precursors in health promotion and longevity, as well as on the alleviation of a broad range of diseases are evident. However, their roles in autism are not well done, which is the novelty of the current study. Addressing below questions will improve the quality of the paper.

      Major concerns

      1. In the introduction section, a broad overview of the roles of ketogenic diet (KD) in neurodegenerative disease (and ageing, if possible) should be provided. E.g., the authors should summarize exciting progress on the use of KD to treat Alzheimer's disease in animal models (PMID: 23276384).
      2. Roles of high fat diet to treat diseases could be extended to rare premature ageing diseases. In such scenario, high fat and NAD+ boosting shared some joint mechanisms (PMID: 25440059 ).
      3. In the introduction, a more detailed introduction of NAD+ and its roles in mitochondrial homeostasis (especially mitophagy and the mitochondrial fusion-fission balance) should be included (PMID: 24813611; PMID: 30742114; PMID: 31577933).
      4. In regarding to the statement of KD increases NAD+, was it due to increased generation (to check protein levels and activities of different NAD+ synthetic enzymes, such as iNAMPT, NMNAT1-3, and NRK) and/or reduced consumption (in addition to reduced glycolysis, does KD inhibit the activities of CD38 and PARPs? In this paper, Sirtuins' activities is (are increased)). Detailed exploration of the activities of these proteins will unveil a clear molecular mechanisms on how KD affects/regulates NAD+.
      5. Fig. 1: in the NAD+ field, NR/NMN concentrations are normally high, typically 500 µM to 2-5 mM (as the NAD+ levels in cells are high). In addition to using 50 µM, the authors are strongly encouraged to perform a dose-dependent study (50 µM, 500 µM, 1, 2, 5 mM) and examine changes in mitochondrial function and parameters. Under these conditions, NAD+ levels should also be checked.
      6. Fig. 2: a comprehensive characterization of mitochondrial fusion-fission should be performed. In addition to the protein evaluated, changes on other key fusion-fission proteins, like Bax, Bak, Mfn-1, Mfn-2, etc should be performed (PMID: 17035996; PMID: 24813611).
      7. Figs. 1-5 focused on mitochondrial morphology; whether KD and NMN changed mitochondrial function should also be explored, for example by using Seahorse to measure ECAR and OCR.
      8. Fig. 6: NR/NMN used in animal studies (via gavage or in drinking water in mice, and on plate for worms and flies) are normally high (e.g., in drinking water for mice could be 4-12 mM; for worms and flies are normally 1-5 mM); for zebra fish, while they are swimming in water, this reviewer concerned whether it was true that 50 µM of NMN was sufficient to show the benefit presented.

      Minor concerns

      1. Line 26: For 'a growing list of neurological disorders, including autism spectrum disorder (ASD)', please add AD in.
      2. Line 57: For 'with side effects such as gastrointestinal disturbances, nausea/vomiting, diarrhea, constipation, and hypertriglyceridemia being reported', rate of frequency shall be provided if any.

      Significance

      The novelty of the current study was to investigate effects of KD and NAD+ on autism. This investigation was not performed before and thus is the novelty.

      Weakness, effects of KD and NAD+/NMN on mitochondrial function were not well-investigated and should be done. Introduction was not well done, many key information in the fields were not provided which may mislead the readers an over-evaluation of the novelty of the current study.

      My expertise lies in NAD+, mitochondria, and brain health.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We thank the reviewers for their thoughtful comments and overall very supportive feedback.

      Reviewer #1 writes: "The study is very thorough and the experiments contain the appropriate controls. (...) The findings of the study can have relevance for human conditions involving disrupted mitochondrial dynamics, caused for example by mutations in mitofusins." Reviewer #2 writes: "The dataset is rich and the time-resolved approach strong." Reviewer #3 writes: "I admire the philosophy of the research, acknowledging an attempt to control for the many possible confounding influences. (...) This is a powerful and thoughtful study that provides a collection of new mechanistic insights into the link between physical and genetic properties of mitochondria in yeast."

      We address all points below. We have not yet updated our text and figures since we expect substantial additions from new experiments. But we have included Figure R1 with some additional analyses of existing data at the bottom of the manuscript.

      Reviewer1

      1.1. Statistical comparisons are missing throughout the manuscript (with the exception of Fig. 2c). Appropriate statistical tests, along with p-values, should be used and reported where different groups are compared, for example (but not limited to) Fig. 3d and most panels of Fig. 4.

      We initially decided not to add too many extra labels to the already very busy plots, given that the magnitude of change mostly speaks for itself. However, we will try to find meaningful statistical tests together with a sensible graphical representation for all of the figures. For one example see Figure R1A.

      1.2. I do not agree with the use of Atp6 protein as a direct read-out of mtDNA content. While Atp6 protein levels will decrease with decreasing mtDNA content, the inverse is not necessarily true: decreased Atp6 protein levels do not necessarily indicate decreased mtDNA levels, because they could alternatively or additionally be caused by decreased transcription and/or translation. Therefore, please do not equate Atp6 protein levels to mtDNA levels, and instead rephrase the text referencing the Atp6 experiments in the Results and Discussion sections to measure "mtDNA expression" or "mt-encoded protein" or similar. For example, on p. 14 line 431 should read "mtDNA expression" rather than "decreased synthesis of mtDNA", and line 440 on the same page "mean mtDNA levels" should be "mtDNA expression" or similar.

      All three reviewers agree that using Atp6-NG as a direct proxy for mtDNA requires more validation, or at least rephrasing of the text. We agree that this is the most important point to address. We had previously tried using the mtDNA LacO array (Osman et al. 2015) to directly assess the number of nucleoids per cell. However, the altered mitochondrial morphology of the Fzo1-depleted cells, combined with the LacI-GFP, which remains in mitochondria even when mtDNA is gone, increases the noise level to a point where we cannot interpret the signal. In the meantime, while this manuscript was in the submission process, the Schmoller lab (co-authors #2 and #7) adapted the HI-NESS system to label mtDNA in live yeast cells (Deng et al. 2025). This system promises a much better signal-to-noise ratio, and we expect that we can address all concerns regarding the actual count of nucleoids per cell. Should this unexpectedly fail for technical reasons, we will try to calibrate the Atp6 levels with DAPI staining at defined time points and will rephrase the text as the reviewer suggests.

      1.3. In Fig. 3, the authors use the fluorescence intensity of a mitochondrially-targeted mCardinal as a read-out of mitochondrial mass. Please provide evidence that this is not affected by MMP, either with relevant references or by control experiments (e.g. comparing it to N-acridine orange or other MMP-independent dyes or methods).

      Whether or not the import of any mitochondrial protein is dependent on the MMP depends largely on the signal sequence. The preSu9 signal sequence was previously characterized as largely independent of the MMP compared to other presequences (Martin, Mahlke, and Pfanner 1991), which is why Vowinckel et al. (2015) and others (Di Bartolomeo et al. 2020; Perić et al. 2016; Ebert et al. 2025) have previously used it as a neutral reference to the strongly MMP-dependent pre-Cox4 signal to estimate MMP. As one control in our own data, we consider that the population-averaged mitochondrial fluorescent signal (Figure S3C) stays constant in the first few hours, in agreement with the total averaged mitochondrial proteome (Fig R1E). As an additional control, we plan to compare the signal to an MMP-independent dye as the reviewer suggests.

      1.4. In Fig. 2e-f, the authors use a promoter reporter with Neongreen to answer whether the reduced levels of the nuclear-encoded mitochondrial proteins Mrps5 and Qcr7 are due to decreased expression or to protein degradation, and find no evidence of degradation of the Neongreen reporter protein. However, subcellular localization might affect the availability of the protein to proteases. Although not absolutely required, it would be relevant to know if the Neongreen fusion protein is found in the same subcellular compartment as Mrps5 and Qcr7 at 0h and 9h after Fzo1 depletion.

      Here, it seems we need to explain the set-up and interpretation of the data better. The key point we are trying to make with the promoter-Neongreen construct is that the regulation is not mainly at the level of transcription. We are showing that the reduction in the levels of the actual protein (orange bars) is not (mainly) explained by a reduction in expression, since the promoter is similarly active at 0 and at 9 hours (grey bars). If expression from the promoter were strongly reduced, the Neongreen would be diluted with growth and would also decrease, but this is not the case. The fluorophore itself is just floating around in the cytosol and is not subject to the same post-translational regulation as Mrps5 and Qcr7, so there is no reason to expect degradation.
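
      The dilution argument above can be made concrete with a minimal sketch. It assumes the limiting case of a perfectly stable reporter whose synthesis stops completely at t = 0 in exponentially growing cells, so that its concentration halves with every doubling. The function name and the doubling time of 1.5 h are hypothetical illustration values, not measurements from the manuscript.

```python
def fraction_remaining(hours, doubling_time_h):
    """Fraction of a stable protein's concentration left after `hours`
    of exponential growth if synthesis stopped completely at t = 0."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    # Each doubling of cell volume halves the concentration.
    return 2.0 ** (-hours / doubling_time_h)


# Hypothetical doubling time, chosen only for illustration.
DOUBLING_TIME_H = 1.5

# Six doublings in 9 h leave 1/64 of the starting concentration.
after_9h = fraction_remaining(9.0, DOUBLING_TIME_H)
print(f"fraction left after 9 h: {after_9h:.4f}")
```

      Under these assumptions, a reporter that stays roughly constant over 9 hours must still be synthesized at close to its original rate.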

      1.5. Fzo1 depletion leads to a very rapid drop in MMP during the first hour of depletion. In the Discussion, can the authors speculate on the possible mechanism of this rapid MMP drop that occurs well before mtDNA or mt-encoded proteins are decreased in level?

      This is indeed an interesting point. We think there are likely three reasons for this initial drop: firstly, due to the fragmentation, the mixing of mitochondrial content is disturbed and smaller fragments may have suboptimal stoichiometry of components (see also Khan et al. 2024, who look at this in detail, including the Fzo1 deletion); secondly, already fairly early, some mitochondrial fragments may not contain any mtDNA and will therefore be unable to synthesize ETC proteins; thirdly, altered morphological features such as changes in the surface-to-volume ratio may play a role. Unfortunately, following up on this mechanistically is not possible with the tools at our disposal and is therefore outside the scope of this manuscript. But we are happy to include these speculations in our discussion.

      1.6. In Fig. 2a, the mtDNA copy number of Fzo1-depleted cells is ca 1.3-fold of the control cells at the 0h timepoint. Why might this be? Is it an impact of one of the inducers? If so, we might be looking at the combination of two different processes when measuring copy number: one that is an induction caused by the inducer(s), and the other a consequence of Fzo1 depletion itself.

      We believe that this 30% increase is within the noise of the experiment rather than an effect of the induction. Since we normalize to t=0 uninduced, the first black data point does not have error bars, emphasizing this difference. None of the protein data suggests that there is an increase in mtDNA encoded proteins (see e.g. 2B, or Atp6 fluorescence data). In the planned HI-NESS experiment, we will see in our single cell data whether there is an actual increase in mtDNA upon TIR induction. Additionally, we will run a qPCR to carefully determine mtDNA levels of untreated wild-type cells, tetracycline treated wild-type cells and tetracycline induced TIR expressing cells to exclude effects of tetracycline as well as the expression of TIR on mtDNA.

      Minor comments:

      1.7. p. 3, line 71: "ten thousands of dividing cells.." should be "tens of thousands of dividing cells".

      Thank you, will correct.

      1.8.-p.4, line 116: please be even more clear with what the "depleted" cells and controls are treated with: are depleted cells treated with both inducers, and controls with neither?

      We will make this clearer. Depleted cells are treated with both inducers; the control cells are not. However, in Figure 1A and in S1 we include controls showing that inducing TIR per se or adding aTC per se does not change growth rate or mitochondrial morphology.

      1.9. -p.5, lines 147-148: the authors write "the rate with which the abundance of Cox2 and Var1 proteins decreases was similar to the rate of mtDNA loss" though the actual rate is not shown. Please calculate and show rates for these processes side by side to make comparison possible, or alternatively rephrase the statement.

      Indeed this was not phrased well. We will call it dynamics rather than rates.

      1.10. -Fig. 2d: changing the y-axis numbering to match those in panels a and b would facilitate comparisons.

      Makes sense, we will change this.

      1.11. Fig. 2e: it is recommended to label the western blot panels to indicate what protein is being imaged in each (Neongreen, Mrps5, Qcr7).

      We will adapt the labelling to make it more clear.

      1.12. -p.9, line 262: I suggest referencing Fig. 4e at the end of the first sentence for clarity.

      We will modify the sentence as suggested.

      1.13. -In the sections related to Fig. 3a and Fig. 5a as well as the connected supplemental data, the authors discuss both the median and the mean of mitochondrial mass and Atp6 protein, respectively. For purposes of clarity, I suggest decreasing the focus on the mean (that is provided only in the supplemental data) and focusing the text mainly on the median. The two show differing trends and it is very good that both are shown, but the clarity of the text can be improved by focusing more on the median where possible.

      We will check the phrasing and simplify.

      1.14. -p. 14, line 435: the statement that mt mass is maintained over the first 9h of depletion is only true for the mean mt mass, not for the median. Please make this clear or rephrase.

      We will check the phrasing, make it clearer, and also point out the extended proteomics data (see Fig R1), which correspond to the mean of the populations.

      1.15.-p.14, line 452: "mitofusions" should be "mitofusins".

      Thanks for catching this.

      Reviewer 2:

      2.1. While inducible TIR is used to reduce background, the manuscript should rigorously exclude auxin/TIR off-targets (growth, mitochondrial phenotypes, gene expression). Please include full matched controls: (plus minus)auxin, (plus minus)TIR, epitope tag alone, and a degron control on an unrelated mitochondrial membrane protein.

      We agree that rigorous controls are crucial for the interpretation of the results. However, we think we have already included most of the controls the reviewer is asking for, but we might have not pointed this out clearly enough. For example, in Fig 1A, we could make it more clear by adding more labels in which samples we added aTC, which is only described in the figure legend.

      Here is a list of all the controls:

      • Each depletion experiment is always matched with an experiment of the same strain without induction. So the genetic background as well as effects such as light exposure, time spent in the microfluidics systems, etc are controlled for.
      • Figure S1D shows that the growth rate is wild-type-like in a strain containing either the AID tag or the TIR protein, even upon addition of both chemicals. It also shows that the final genetic background (AID tag and TIR) grows like wild type if the inducers are not added. This shows that neither the tags/constructs nor the chemicals per se affect growth rate.
      • In Figure S1C we show the mitochondrial morphology of the same controls. We will make sure to label them more consistently to match panel D, and include an actual wildtype and a FLAG-AID-Fzo1 strain without TIR treated with both aTC and 5-Ph-IAA as direct comparison
      • In Figure 1A we compare the Fzo1 protein levels of a strain with and without TIR. We show that in the absence of TIR, adding either aTC or auxin does not change Fzo1 levels, and that the levels are comparable in the strain that is able to deplete Fzo1 directly before addition of 5-Ph-IAA (after 2 h of TIR induction through addition of tetracycline).
      • Additionally, in Figure S2C we show that two hours after adding aTC, the entire proteome does not change significantly apart from a strong induction of TIR. We can also make this more clear in the figure legend.
      • Additionally, we will run a qPCR to carefully determine mtDNA levels of untreated wild-type cells, tetracycline treated wild-type cells and tetracycline induced TIR expressing cells to exclude effects of tetracycline as well as the expression of TIR on mtDNA (also in response to 1.6).

      In summary, we think we have controlled sufficiently for all confounding parameters and, most importantly, showed that addition of either aTC or auxin as well as the FLAG-AID tag per se does not disturb mitochondria or cell growth. We do not see what a degron control on an unrelated protein would tell us. Depending on the nature of the protein, it may or may not have a phenotype that may or may not be related to morphology changes etc.

      2.2. The Mitoloc preSu9 vs Cox4 import ratio is only a proxy of mitochondrial membrane potential (ΔΨm) and itself depends on mitochondrial mass, protein expression, matrix ATP, and import saturation. The authors need to calibrate ΔΨm with orthogonal dyes (TMRE/TMRM) and pharmacologic titrations (FCCP/antimycin/oligomycin) to generate a response curve; show that Mitoloc tracks dye-based ΔΨm across the relevant range and corrects for mass/photobleaching. Report single-cell ΔΨm vs mass residuals.

      We completely agree that the MitoLoc system is only a rough proxy for the actual membrane potential. That is why we make no quantitative claims on the absolute value or absolute difference between groups of cells. We also make very clear in Fig 3B what we are actually measuring and can emphasize again in the text that this is only a proxy. We agree that it is a good idea to compare MitoLoc values to TMRE staining as the reviewer suggests, and we will do these experiments in depleted and control cells at different timepoints. Please note, though, that dye staining also has its caveats, especially in dynamic live cell experiments. TMRM, for example, is not compatible with the acidic pH 5 medium that is typically used for yeast, and subjecting cells to washing steps and higher pH may change both the morphology of mitochondria and the MMP, especially in cells that are already “stressed”. We prefer not to perform elaborate pharmacological titration experiments because firstly, this was extensively done in the original MitoLoc paper by the Ralser lab (Vowinckel et al. 2015; cited 120 times); secondly, the value of the MMP is not the most critical claim of the manuscript. See also 3.12. Please note that in Figure S4D we had already plotted MMP vs mitochondrial concentration.

      2.3. To use Atp6-mNeon as a proxy for mtDNA is an assumption. Interpreting Atp6 intensity as "functional mtDNA" could be confounded by translation, turnover, or assembly. Please (i) report mtDNA copy number time courses (you have qPCR), nucleoid counts (DAPI/PicoGreen or TFAM/Abf2 tagging), and (ii) assess translation (e.g., 35S-labeling or puromycin proxies) and turnover (proteasome/AAA protease inhibition, mitophagy mutants -some data are alluded to- plus mRNA levels for mtDNA-encoded genes). This will support the "reduced synthesis" versus "increased degradation" conclusion.

      We agree with all three reviewers that Atp6 is only a proxy for mtDNA (Jakubke et al. 2021; Roussou et al. 2024) and the correlation should be checked more carefully. We will use the very recently established HI-NESS system to follow nucleoids/mtDNA during depletion experiments. See detailed reply to 1.2.

      (ii) In Figure 2C we inhibit mitochondrial translation and show that in this case control and depleted cells have the same level of Cox2, at least suggesting that degradation is not the key mechanism controlling the levels of mtDNA-encoded proteins. We cannot do proteasome inhibitor assays since the nature of the AID-TIR system requires an active proteasome. In Figure S5C we show that the Atp6 depletion is similar in an atg32 deletion. This does not completely exclude a contribution of mitophagy to the observed phenotype, but does confirm that mitophagy is not the primary reason for cells becoming petite.

      2.4. The promoter-NeonGreen reporters argue against transcriptional down-regulation of nuclear OXPHOS. Please add mRNA (RT-qPCR/RNA-seq) for representative genes and a pulse-chase or degradation-pathway dependency (e.g., proteasome/mitophagy/autophagy mutants) to firmly assign active degradation. The authors need to normalize proteomics to mitochondrial mass (e.g., citrate synthase/porin) to separate organelle abundance from protein turnover.

      While we are happy to perform qPCR experiments for selected genes, a full RNA-seq experiment seems outside the scope of this study. As explained above, a proteasome inhibitor experiment is not possible in this set-up. Bulk mitophagy/autophagy seems unlikely to be the cause of the decrease of the nuclear-encoded OXPHOS proteins, since most other mitochondrial proteins do not decrease on average at the population level in the first hours. These data are now plotted as an additional figure (see below) and will be included in the supplementary material of the revised manuscript (Fig R1E).

      2.5. Using preSu9-mCardinal intensity as "mitochondrial concentration" is sensitive to expression, import competence, and morphology/segmentation. The authors should provide validation that this metric tracks 3D volume across fragmentation states (e.g., correlation with mito-GFP volumetrics; detergent-free CS activity; TOMM20/Por1 immunoblot per cell).

      We agree that this is an important point, and the co-authors discussed it quite intensively. In Figure S3A and B we show (using confocal data) that there is a very strong correlation between the total fluorescence signal and the 3D volume reconstruction. However, the slope of the correlation differs between tubular and fragmented mitochondria (compare panels A and B, and see the figure legend). Since we are dealing with diffraction-limited objects, it is likely that the 3D reconstruction is sensitive to morphology, especially if mitochondria are “clumping”. We therefore think that the total fluorescence signal is actually a better estimate of mitochondrial mass per cell than the 3D volume reconstruction (especially for our data obtained with a conventional epifluorescence microscope). The mean of the total mitochondrial fluorescence also better matches the population average mitochondrial proteome (Fig R1E). To consolidate this assumption, we will additionally compare our data to a strain with Tom70-Neongreen and to MMP-independent dyes.

      Notably, since the morphology is similarly altered in mothers and buds this is of minor impact for our main point – the unequal distribution between mother and buds.

      2.6. The unequal mother-daughter distribution is compelling, but causality remains inferred. Test whether modulating inheritance machinery (actin cables/Myo2, Num1, Mmr1) or altering fission (Dnm1 inhibition) modifies segregation defects and rescues mtDNA/Atp6 decline. Complementation with Fzo1 re-expression at defined times would help order the phenotype cascade.

      We agree that rescue experiments would be very useful. We have some preliminary data for tether experiments, for example with Num1. The general problem is that the fragmented mitochondria clump together. We have not found a method to restore an equal distribution between mother and daughter cells. We will try to optimize the assay, but are not overly confident it will work. Mmr1 deletion aggravates the Fzo1 phenotype, likely also because the distribution becomes even more heterogeneous, but we have not rigorously analyzed this.

      We like the idea of the Fzo1 re-expression and will run such experiments. This will be especially powerful in combination with the new HI-NESS mtDNA reporter. We may be able to track exactly when cells reach the point-of-no return and become petite. This will also help connecting our mathematical model more directly to the data.

      2.7. The model is useful but should include parameter sensitivity (segregation variance, synthesis slopes, initial nucleoid number) and prospective validation (e.g., predict rescue upon partial restoration of synthesis or inheritance, then test experimentally).

      We will refine our model to include the to-be-measured nucleoids/mtDNA values. We will include a parameter sensitivity analysis with the updated model.

      Reviewer 3:

      3.1. About the use of Atp6 as a good proxy for mtDNA content. This is assumed from l285 onwards, based on a previous publication. As the link is fairly central to part of the paper's arguments, and the system in this study is being perturbed in several different ways, a stronger argument or demonstration that this link remains intact (and unchanged, as it is used in comparisons) would seem important.

      We agree, see 1.2.

      3.2. About confounding variables and processes. The study does an admirable job of being transparent and attempting to control for the many different influences involved in the physical-genetic link. But some remain less clearly unpacked, including some I think could be quite important. For example, there is a lot of focus on mito concentration -- but given the phenotypes are changing the sizes of cells, do concentration changes come from volume changes, mito changes, or both? In "ruling out" mitophagy -- a potentially important (and intuitive) influence, the argument is not presented as directly as it could be and it's not completely clear that it can in fact be ruled out in this way. There are a couple of other instances which I've put in the smaller points below.

      Thank you for acknowledging our efforts to show transparent and well-controlled experiments! We address each of the specific points below.

      3.3. full genus name when it first appears

      We will add the full name.

      3.4. I may be wrong here, but I thought the petite phenotype more classically arises from mtDNA deletion mutations, not loss? The way this is phrased implies that mtDNA loss is [always] the cause. Whether I'm wrong on that point or not, the petite phenotype should be described and referenced.

      We can expand the text and cite additional relevant papers. The term “petite” refers to any strain that is respiratory incompetent and leads to small colonies (not necessarily small cells!) (Seel et al. 2023). This can be mutations or gene loss (fragments) on the mtDNA (these are called cytoplasmic petite), or chemically induced loss of mtDNA (e.g. EtBr), or mutations of nuclear genes required for respiration (these are termed nuclear petite; some nuclear petites show loss of mtDNA in addition to the mutation in the nuclear genome) (Contamine and Picard 2000).

      3.5. para starting l59 -- should mention for context that mitochondria in (healthy, wildtype) yeast are generally much more fused than in other organisms

      ok.

      3.6. Fig 1C -- very odd choice of y-axis range! either start at zero or ensure that the data fill as much vertical space of the plot as possible

      True, this was probably some formatting relic. We will adapt the axis to fill the full space. Most of our axes start at 0, but that doesn’t make so much sense here, since we consider the solidity in the control as “baseline”.

      3.7. "wild-type like more tubular mitochondria" reads rather awkwardly. "more tubular mitochondria (as in the wild-type)"?

      Thank you, sounds better.

      3.8. l106 -- imaging artefacts? are mitos fragmenting because of photo stress? -- this is mentioned in l577-8 in the Methods, but the data from the growth rate and MMP comparison isn't given -- an SI figure would be helpful here. It would be reassuring to know that mito morphology wasn't changing in response to phototoxicity too.

      In the methods we just briefly point out that we have done all our “due diligence” controls to check that we do not generate phototoxicity, something that we highlight in the cited review. We do not explicitly have a figure for this, but figure S1A shows that the solidity of the mitochondrial network in control cells stays the same over 9 hours, even though these cells are exposed to the same cultivation and imaging regime as the depleted cells. We will also add a picture of control cells after 9 h. In S1B we show that control cells containing TIR but no AID tag treated with both chemicals imaged over 9 hours also show the same solidity (~mitochondrial morphology) as untreated control. Also, the doubling times of cells grown in our imaging system (Fig R1B) are very similar to the shake flask (Fig R1A). All in all, we are very confident that our imaging settings did not impact our reported phenotypes.

      3.9. para l146 -- so this suggests mtDNA-encoded proteins have a very rapid turnover, O(hours) -- is this known/reasonable?

      Reference (Christiano et al. 2014) suggests that respiratory chain proteins are shorter lived than the average yeast protein. However, based on Figure 2C we think the dynamics mostly speak for a dilution by growth.

      3.10. section l189 -- it's hard to reason fully about these statistics of mitochondrial concentration given that the petite phenotype is fundamentally affecting overall cell volume. can we have details on the cell size distribution in parallel with these results? to put it another way -- how does mitochondrial *amount* per cell change?

      This is a good point. We report mostly on mitochondrial “concentrations” because we think this is what the cell actually cares about (mitochondrial activity in relationship to cytosolic activity). But we will include additional graphs on mitochondrial amount as well as size distributions (Fig R1C, related to Fig 4F). We can already point out that the size distribution of the population does not change much in the first hours. The “petite” phenotype refers to small colonies on growth medium with limited supply of a fermentable carbon source, not to smaller size of single cells.

      3.11. l199 the mean in Fig S3C certainly does change -- it increases, clearly relative both to control and to its initial value. rather than sweeping this under the carpet we should look in more detail to understand it (a consequence of the increased skew of the distribution)?

      This relates somewhat to the previous point. The increase in average concentration is not due to an increased amount in the population, but due to the fact that it is the small buds that get a very high amount of the mitochondria which “exaggerates” the asymmetric/heterogenous distribution. This will be clarified by the figures we mention in the point above.
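
      The distinction between the per-cell mean concentration and the total amount in the population can be illustrated with a minimal sketch. The two-cell "population" and all amounts and volumes below are hypothetical numbers chosen for illustration (arbitrary units), not measured values.

```python
from statistics import mean


def summarize(cells):
    """cells: list of (mitochondrial_amount, cell_volume) tuples.

    Returns (mean of per-cell concentrations,
             pooled concentration = total amount / total volume,
             total amount).
    """
    per_cell = [amount / volume for amount, volume in cells]
    total_amount = sum(amount for amount, _ in cells)
    total_volume = sum(volume for _, volume in cells)
    return mean(per_cell), total_amount / total_volume, total_amount


# Hypothetical mother (volume 4) and bud (volume 1).
even = [(4.0, 4.0), (1.0, 1.0)]    # content split in proportion to volume
skewed = [(2.0, 4.0), (3.0, 1.0)]  # small bud receives most of the content

even_stats = summarize(even)
skewed_stats = summarize(skewed)
print("even:  ", even_stats)
print("skewed:", skewed_stats)
```

      In both cases the total amount and the pooled concentration are identical; only the mean of the per-cell concentrations rises when the small bud receives a disproportionate share.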

      3.12. para line 206 -- this doesn't make it clear whether your MMP signal is integrated over all mitochondria in the cell, or normalised by mitochondrial content? this matters quite a lot for the interpretation if the distributions of mitochondrial content are changing. reading on, this is even more important for para line 222. Reading further on, there is an equation on l612 that gives a definition, but it doesn't really clarify (apologies if I'm misunderstanding).

      For each cell, we basically calculate the relative mitochondrial enrichment of the MMP sensitive vs the MMP insensitive pre-sequence.

      So, MMP= (total intensity of mitochondrial pre-Cox4 Neongreen/ total intensity of mitochondrial pre-Su9 Cardinal) / (total cytosolic pre-Cox4 Neongreen/ total cytosolic pre-Su9 Cardinal).

      We calculate this value for each cell, but we do not have the optical resolution to calculate it for individual mitochondrial fragments.
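
      The per-cell ratio defined above can be written as a minimal sketch. The function and argument names are hypothetical, and the example intensities are made-up, background-corrected values in arbitrary units; this is an illustration of the formula, not the analysis pipeline itself.

```python
def mmp_proxy(mito_cox4, mito_su9, cyto_cox4, cyto_su9):
    """Relative mitochondrial enrichment of the MMP-sensitive reporter
    (pre-Cox4 Neongreen) versus the MMP-insensitive reporter
    (pre-Su9 Cardinal) for a single cell.

    All arguments are total intensities of the respective reporter in
    the mitochondrial or cytosolic compartment of that cell.
    """
    if min(mito_su9, cyto_cox4, cyto_su9) <= 0:
        raise ValueError("intensities used as denominators must be positive")
    mito_ratio = mito_cox4 / mito_su9  # enrichment inside mitochondria
    cyto_ratio = cyto_cox4 / cyto_su9  # what is left in the cytosol
    return mito_ratio / cyto_ratio


# Hypothetical cells: in the second one, less pre-Cox4 reporter is
# imported and more of it stays in the cytosol.
healthy = mmp_proxy(mito_cox4=800.0, mito_su9=400.0,
                    cyto_cox4=100.0, cyto_su9=200.0)
impaired = mmp_proxy(mito_cox4=400.0, mito_su9=400.0,
                     cyto_cox4=200.0, cyto_su9=200.0)
print(healthy, impaired)
```

      Because the value is a ratio of ratios, scaling both reporters by the same factor (for example, by a different mitochondrial content) leaves it unchanged.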

      Both constructs are driven by the same strong promoter, so transcription of the fluorophore should never limit the uptake. Also, in Figure 3D we compare control and depleted cells with similar total mitochondrial concentration, so the difference must be due to a different import of the two fluorophores, see also Fig S4D. The calculated “MMP” value is of course only a crude proxy for the actual membrane potential in millivolts and we do not want to make any claims on absolute values or quantitative differences. But essentially what we are interested in is “mitochondrial health/activity” and we think the system is good at reporting this. See also 2.2.

      3.13. l230 -- a point of personal interest -- low mito concentrations are connected to low "function" (MMP) and give extended division times -- this is interestingly exactly the model needed to reproduce observations in HeLa cells (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002416). That model went on to predict several aspects of downstream cellular behaviour -- it would be very interesting to see how compatible that picture (parameterised using HeLa observations) is with yeast!

      Thank you for pointing out your interesting paper, which we will include in our discussion. Another recent preprint about fission yeast (Chacko et al. 2025) also fits into this picture. Since you were kind enough to disclose your identity, we would be happy to follow up and discuss this further with you in person.

      3.14. l239 "less mitochondria" -- a bit tricky but I'd say "fewer mitochondria" or "less mitochondrial content"

      Thanks, we will think about how to best rephrase this, probably less mitochondrial content.

      3.15. Section l234 So here (and in Fig 4) the focus is on overall distributions of mitochondrial concentration in different cells (mother-to-be, mother, bud; gen 1, gen >1). But we've just seen that one effect of fzo1 is to broaden the distribution of mitochondrial concentration across cells. Can't we look in more depth at the implications of this heterogeneity? For example in Fig 4F (which is cool) we look at the distribution of all fzo1 mothers-to-be, mothers, and buds. But this loses information about the provenance. For example, do mothers-to-be with extremely low mito concentrations just push everything to the bud, while mothers-to-be with high mito concentrations distribute things more evenly? It would seem very easy and very interesting to somehow subset the distribution of mothers-to-be by concentration and see how different subsets behave.

      This is a good point. When analyzing the data, we pretty much plotted everything against everything and then chose the graphs that we think will best guide the reader through the story-line. We can make additional supplementary plots where we show the starting concentrations/amounts of the mother in relationship to the resulting split ratio at the end of the cycle (Fig R1D).

      3.16. l285 -- experimental design -- do we know that Atp6 will continue to be a good proxy for functional mtDNA in the face of the perturbations provided by Fzo1 depletion? Especially if there is impact on the expression of mitoribosomes, the relationship between mtDNA and Atp6 may look rather different in the mutant?

      This is actually our top-priority experiment now. We will use the HI-NESS system and possibly DAPI staining to make a more direct link to mtDNA/ nucleoid numbers, see 1.2.

      3.17. l290 -- ruled out mitophagy. This message could be much clearer. Comparing Fig S5C and Fig 3A side-by-side is a needlessly difficult task -- put Fig 3A into Fig S5. Then we see that when mitophagy is compromised, the distribution of mitochondrial concentration has a lower median and much lower upper quartile than in the mitophagy-equipped Fzo1 mutant? What is going on here? For a paper motivated by disentangling coupled mechanisms, this should be made clearer!

      Thanks for pointing this out. We can of course easily include the control in the corresponding figure. Compromising mitophagy is likely to generally affect mitochondrial health and turnover to some extent, independent of what is going on with Fzo1. The second piece of evidence that speaks against large-scale mitophagy is the proteomics data: at the population level, the dynamics of the respiratory chain proteins are very different from those of other (nuclear-encoded) mitochondrial proteins. We will add additional supplementary figures to make this clearer, see Fig R1E. Most mitochondrial proteins in the proteomics experiment stay constant in the first few hours, consistent with the imaging data showing that the mean mitochondrial content of the population does not change initially. This again highlights that it is the unequal distribution which is the problem and not massive degradation of mitochondria.

      3.18. With the Atp6 signal, how do we know that fluorescence from different cells is comparable? Buds will be smaller than mother cells for example, potentially leading to less occlusion of the fluorescent signal by other content in the cytoplasm

      This is of course a general problem that anyone faces doing quantitative fluorescence microscopy. From the technical side, we have done the best we could by taking a reasonable amount of z-slices and by choosing fluorophores that are in a range with little cellular background fluorescence (e.g. Neongreen is much better than GFP). From a practical standpoint, we are always comparing to the control, which is subject to the same technical limitations as the depleted cells and the cell sizes are very similar. So, even if we are systematically overestimating the Atp6 concentration in the bud by a few %, the difference to the control would still be qualitatively true. We therefore do not think that any of our conclusions are affected by this.

      3.19. l343 -- maintenance of mtDNA -- here the point about l285 (is the Atp6-mtDNA relationship the same in the Fzo1 mutant) is particularly important, as we're directly tying findings about the protein product to implications about the mtDNA

      We will carefully address this, see above.

      3.20. l367 -- on a first read this description of the model feels like lots of choices have been made without being fully justified. Why a log-normal distribution (when the fit to the data looks rather flawed); why the choice of 5 groups for nucleoid number (why not 3? or 8?); the process used for parameter fitting is very unclear (after reading the methods I think some of these values are read directly from the data, but the shapes of the distributions remain unexplained). l705 -- presumably the ratio was drawn from a log-normal distribution and then the corresponding nucleoid numbers were rounded to integers? the ratio itself wasn't rounded? (also l367) How were the log-normal distributions fitted to experiments (Figs. S7A,B)? Just by eye?

      We will update our model based on measured nucleoid counts and then explain more stringently the choices we make/ parameters we select.

      3.21. l711 by random selection -- just at random? ("selection" could be confusing) Overall, it feels like the model may be too complicated for what it needs to show. Either (a) the model should show qualitatively that unequal inheritance and reduced production leads to rapid loss -- which a much simpler model, probably just involving a couple of lines of algebra, could show. Or (b) the model should quantitatively reproduce the particular numerical observations from the experiments -- it's not totally clear that it does this (do the cell-cycle-based decay timescales in Fig 7 correspond to the hour-based decay timescales in other plots, for example). At the moment the model is at a (b) level of detail but it's only clear that it's reporting the (a) level of results.

      If the HI-NESS and Fzo1 re-addition experiments work as explained above, all parameters will have direct experimental data, and we should get much closer to (a).
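
      The qualitative point raised by the reviewer, that unequal inheritance combined with reduced production leads to rapid loss, can be illustrated with a deliberately simple toy simulation. This is a sketch under stated assumptions and is not the model used in the manuscript: the function name, the Beta(0.5, 0.5) segregation rule, the synthesis factors, the starting copy number and the number of generations are all hypothetical choices.

```python
import random


def simulate(n_cells, n0, generations, synthesis, unequal, seed=0):
    """Return the fraction of cells without any mtDNA copy after
    `generations` rounds of synthesis followed by division.

    synthesis: new copies made per existing copy in each cycle
               (1.0 means the copy number doubles before division).
    unequal:   if True, the probability that a copy ends up in the bud
               is drawn per division from a U-shaped Beta(0.5, 0.5)
               distribution; otherwise it is fixed at 0.5.
    """
    rng = random.Random(seed)
    cells = [n0] * n_cells
    for _ in range(generations):
        next_cells = []
        for copies in cells:
            # Synthesis step (rounded to whole copies).
            copies += round(synthesis * copies)
            # Segregation step: each copy goes to the bud with p_bud.
            p_bud = rng.betavariate(0.5, 0.5) if unequal else 0.5
            to_bud = sum(rng.random() < p_bud for _ in range(copies))
            next_cells.append(copies - to_bud)  # mother
            next_cells.append(to_bud)           # bud
        cells = next_cells
    return sum(c == 0 for c in cells) / len(cells)


# Hypothetical parameter values, for illustration only.
control = simulate(200, 20, 6, synthesis=1.0, unequal=False)
perturbed = simulate(200, 20, 6, synthesis=0.5, unequal=True)
print(f"cells without mtDNA: control {control:.3f}, perturbed {perturbed:.3f}")
```

      With these settings the control population essentially never produces cells without mtDNA, whereas the perturbed population accumulates them within a few generations; a quantitative comparison to the data would require measured parameters.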

      3.22. A lot of the discussion repeats the results; depending on editorial preferences some of this text could probably be pared back to focus on the literature connections and context.

      We will think about streamlining the discussion once some of the additional material alluded to above has been added.

      3.23. Data availability -- it looks like much of the data required to reproduce the results is not going to be made available. Images and proteomic data are promised, but the data associated with mitochondrial concentration and other features are not mentioned. For FAIR purposes all the data (including statistics from analysis of the images) should be published.

      We may not have phrased this clearly. All data will be made available. Where technically feasible, this will be directly accessible in a repository, otherwise by request to the corresponding author.

      On our OMERO server, we have deposited many TB of raw images as well as all the intermediate steps such as segmentation masks, and the csv files with all the extracted data for each cell (including background corrections etc). Additionally, we can include csv files with the data grouped in the way that we used to generate all the box plots etc. As of now, the OMERO data is unfortunately only available by requesting a personal guest login from our bioinformatics facility, but we were promised that with the next technical update there will be a public link available. The proteomics data and the model are already fully accessible. The raw western blot images with corresponding ponceau staining will be included with the final publication either as additional supplementary material or in whatever format matches the journal requirements.

      3.24 l660 -- can an overview of the EM protocol be given, to avoid having to buy the Mayer 2024 article?

      The cited paper is open access. But we can also include more details in our method section.

      References:

      Chacko, L. A., H. Nakaoka, R. Morris, W. Marshall, and V. Ananthanarayanan. 2025. 'Mitochondrial function regulates cell growth kinetics to actively maintain mitochondrial homeostasis', bioRxiv.

      Christiano, R., N. Nagaraj, F. Frohlich, and T. C. Walther. 2014. 'Global proteome turnover analyses of the Yeasts S. cerevisiae and S. pombe', Cell Rep, 9: 1959-65.

      Contamine, V., and M. Picard. 2000. 'Maintenance and integrity of the mitochondrial genome: a plethora of nuclear genes in the budding yeast', Microbiol Mol Biol Rev, 64: 281-315.

      Deng, Jingti, Lucy Swift, Mashiat Zaman, Fatemeh Shahhosseini, Abhishek Sharma, Daniela Bureik, Francesco Padovani, Alissa Benedikt, Amit Jaiswal, Craig Brideau, Savraj Grewal, Kurt M. Schmoller, Pina Colarusso, and Timothy E. Shutt. 2025. 'A novel genetic fluorescent reporter to visualize mitochondrial nucleoids', bioRxiv: 2023.10.23.563667.

      Di Bartolomeo, F., C. Malina, K. Campbell, M. Mormino, J. Fuchs, E. Vorontsov, C. M. Gustafsson, and J. Nielsen. 2020. 'Absolute yeast mitochondrial proteome quantification reveals trade-off between biosynthesis and energy generation during diauxic shift', Proc Natl Acad Sci U S A, 117: 7524-35.

      Ebert, A. C., N. L. Hepowit, T. A. Martinez, H. Vollmer, H. L. Singkhek, K. D. Frazier, S. A. Kantejeva, M. R. Patel, and J. A. MacGurn. 2025. 'Sphingolipid metabolism drives mitochondria remodeling during aging and oxidative stress', bioRxiv.

      Jakubke, C., R. Roussou, A. Maiser, C. Schug, F. Thoma, R. Bunk, D. Horl, H. Leonhardt, P. Walter, T. Klecker, and C. Osman. 2021. 'Cristae-dependent quality control of the mitochondrial genome', Sci Adv, 7: eabi8886.

      Khan, Abdul Haseeb, Xuefang Gu, Rutvik J. Patel, Prabha Chuphal, Matheus P. Viana, Aidan I. Brown, Brian M. Zid, and Tatsuhisa Tsuboi. 2024. 'Mitochondrial protein heterogeneity stems from the stochastic nature of co-translational protein targeting in cell senescence', Nature Communications, 15: 8274.

      Martin, J., K. Mahlke, and N. Pfanner. 1991. 'Role of an energized inner membrane in mitochondrial protein import. Delta psi drives the movement of presequences', J Biol Chem, 266: 18051-7.

      Osman, C., T. R. Noriega, V. Okreglak, J. C. Fung, and P. Walter. 2015. 'Integrity of the yeast mitochondrial genome, but not its distribution and inheritance, relies on mitochondrial fission and fusion', Proc Natl Acad Sci U S A, 112: E947-56.

      Perić, Matea, Peter Bou Dib, Sven Dennerlein, Marina Musa, Marina Rudan, Anita Lovrić, Andrea Nikolić, Ana Šarić, Sandra Sobočanec, Željka Mačak, Nuno Raimundo, and Anita Kriško. 2016. 'Crosstalk between cellular compartments protects against proteotoxicity and extends lifespan', Scientific Reports, 6: 28751.

      Roussou, Rodaria, Dirk Metzler, Francesco Padovani, Felix Thoma, Rebecca Schwarz, Boris Shraiman, Kurt M. Schmoller, and Christof Osman. 2024. 'Real-time assessment of mitochondrial DNA heteroplasmy dynamics at the single-cell level', The EMBO Journal, 43: 5340-59.

      Seel, A., F. Padovani, M. Mayer, A. Finster, D. Bureik, F. Thoma, C. Osman, T. Klecker, and K. M. Schmoller. 2023. 'Regulation with cell size ensures mitochondrial DNA homeostasis during cell growth', Nat Struct Mol Biol, 30: 1549-60.

      Vowinckel, J., J. Hartl, R. Butler, and M. Ralser. 2015. 'MitoLoc: A method for the simultaneous quantification of mitochondrial network morphology and membrane potential in single cells', Mitochondrion, 24: 77-86.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This article addresses the connection between perturbed mitochondrial structure and genetics in yeast. When mitochondrial fusion is compromised, what is the chain of causality -- the mechanism -- that leads to mtDNA populations becoming depleted? This is a fascinating question, linking physical cell biology to population genetics. I admire the philosophy of the research, acknowledging and attempting to control for the many possible confounding influences. The manuscript describes the context and the research tightly and digestibly; the figures illustrate the results in a clear and natural way.

      For transparency, I am Iain Johnston and I am happy for this review to be treated as public domain. To my eyes my most important shortcoming as a reviewer is my relative lack of familiarity with the yeast fzo1 mutant; while I am familiar with analysis of yeast mito morphology and mtDNA segregation, a reviewer familiar with the nuances of this strain and its culture would be a useful complement.

      I have a few more general points and a collection of smaller points below that I believe might help make the story more robust.

      General points

      1. About the use of Atp6 as a good proxy for mtDNA content. This is assumed from l285 onwards, based on a previous publication. As the link is fairly central to part of the paper's arguments, and the system in this study is being perturbed in several different ways, a stronger argument or demonstration that this link remains intact (and unchanged, as it is used in comparisons) would seem important.
      2. About confounding variables and processes. The study does an admirable job of being transparent and attempting to control for the many different influences involved in the physical-genetic link. But some remain less clearly unpacked, including some I think could be quite important. For example, there is a lot of focus on mito concentration -- but given the phenotypes are changing the sizes of cells, do concentration changes come from volume changes, mito changes, or both? In "ruling out" mitophagy -- a potentially important (and intuitive) influence, the argument is not presented as directly as it could be and it's not completely clear that it can in fact be ruled out in this way. There are a couple of other instances which I've put in the smaller points below.

      Smaller points

      l47 full genus name when it first appears

      l58 I may be wrong here, but I thought the petite phenotype more classically arises from mtDNA deletion mutations, not loss? The way this is phrased implies that mtDNA loss is [always] the cause. Whether I'm wrong on that point or not, the petite phenotype should be described and referenced.

      para starting l59 -- should mention for context that mitochondria in (healthy, wildtype) yeast are generally much more fused than in other organisms

      Fig 1C -- very odd choice of y-axis range! either start at zero or ensure that the data fill as much vertical space of the plot as possible

      l105 "wild-type like more tubular mitochondria" reads rather awkwardly. "more tubular mitochondria (as in the wild-type)"?

      l106 -- imaging artefacts? are mitos fragmenting because of photo stress? -- this is mentioned in l577-8 in the Methods, but the data from the growth rate and MMP comparison isn't given -- an SI figure would be helpful here. It would be reassuring to know that mito morphology wasn't changing in response to phototoxicity too.

      para l146 -- so this suggests mtDNA-encoded proteins have a very rapid turnover, O(hours) -- is this known/reasonable?

      section l189 -- it's hard to reason fully about these statistics of mitochondrial concentration given that the petite phenotype is fundamentally affecting overall cell volume. can we have details on the cell size distribution in parallel with these results? to put it another way -- how does mitochondrial amount per cell change?

      l199 the mean in Fig S3C certainly does change -- it increases, clearly relative both to control and to its initial value. rather than sweeping this under the carpet we should look in more detail to understand it (a consequence of the increased skew of the distribution)?

      para line 206 -- this doesn't make it clear whether your MMP signal is integrated over all mitochondria in the cell, or normalised by mitochondrial content? this matters quite a lot for the interpretation if the distributions of mitochondrial content are changing. reading on, this is even more important for para line 222. Reading further on, there is an equation on l612 that gives a definition, but it doesn't really clarify (apologies if I'm misunderstanding).

      l230 -- a point of personal interest -- low mito concentrations are connected to low "function" (MMP) and give extended division times -- this is interestingly exactly the model needed to reproduce observations in HeLa cells (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002416). That model went on to predict several aspects of downstream cellular behaviour -- it would be very interesting to see how compatible that picture (parameterised using HeLa observations) is with yeast!

      l239 "less mitochondria" -- a bit tricky but I'd say "fewer mitochondria" or "less mitochondrial content"

      Section l234 So here (and in Fig 4) the focus is on overall distributions of mitochondrial concentration in different cells (mother-to-be, mother, bud; gen 1, gen >1). But we've just seen that one effect of fzo1 is to broaden the distribution of mitochondrial concentration across cells. Can't we look in more depth at the implications of this heterogeneity? For example in Fig 4F (which is cool) we look at the distribution of all fzo1 mothers-to-be, mothers, and buds. But this loses information about the provenance. For example, do mothers-to-be with extremely low mito concentrations just push everything to the bud, while mothers-to-be with high mito concentrations distribute things more evenly? It would seem very easy and very interesting to somehow subset the distribution of mothers-to-be by concentration and see how different subsets behave.

      l285 -- experimental design -- do we know that Atp6 will continue to be a good proxy for functional mtDNA in the face of the perturbations provided by Fzo1 depletion? Especially if there is impact on the expression of mitoribosomes, the relationship between mtDNA and Atp6 may look rather different in the mutant?

      l290 -- ruled out mitophagy. This message could be much clearer. Comparing Fig S5C and Fig 3A side-by-side is a needlessly difficult task -- put Fig 3A into Fig S5. Then we see that when mitophagy is compromised, the distribution of mitochondrial concentration has a lower median and much lower upper quartile than in the mitophagy-equipped Fzo1 mutant? What is going on here? For a paper motivated by disentangling coupled mechanisms, this should be made clearer!

      With the Atp6 signal, how do we know that fluorescence from different cells is comparable? Buds will be smaller than mother cells for example, potentially leading to less occlusion of the fluorescent signal by other content in the cytoplasm

      l336 -- similar to the Jajoo et al. mechanism in fission yeast -- but are you talking about feedback control of the mtDNA or the protein (or mRNA) product?

      l343 -- maintenance of mtDNA -- here the point about l285 (is the Atp6-mtDNA relationship the same in the Fzo1 mutant) is particularly important, as we're directly tying findings about the protein product to implications about the mtDNA

      l367 -- on a first read this description of the model feels like lots of choices have been made without being fully justified. Why a log-normal distribution (when the fit to the data looks rather flawed); why the choice of 5 groups for nucleoid number (why not 3? or 8?); the process used for parameter fitting is very unclear (after reading the methods I think some of these values are read directly from the data, but the shapes of the distributions remain unexplained). l705 -- presumably the ratio was drawn from a log-normal distribution and then the corresponding nucleoid numbers were rounded to integers? the ratio itself wasn't rounded? (also l367) How were the log-normal distributions fitted to experiments (Figs. S7A,B)? Just by eye? l711 by random selection -- just at random? ("selection" could be confusing) Overall, it feels like the model may be too complicated for what it needs to show. Either (a) the model should show qualitatively that unequal inheritance and reduced production leads to rapid loss -- which a much simpler model, probably just involving a couple of lines of algebra, could show. Or (b) the model should quantitatively reproduce the particular numerical observations from the experiments -- it's not totally clear that it does this (do the cell-cycle-based decay timescales in Fig 7 correspond to the hour-based decay timescales in other plots, for example). At the moment the model is at a (b) level of detail but it's only clear that it's reporting the (a) level of results.
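
      To illustrate the kind of "couple of lines of algebra" meant in option (a), here is a minimal Python sketch. It is not the manuscript's model: the function name, the parameter names and the numerical values are hypothetical, and it assumes that mt-encoded content in a mother lineage is multiplied by a fixed synthesis factor each cell cycle and that a fixed fraction is then passed to the bud.

```python
def mother_lineage_content(c0, growth, bud_fraction, cycles):
    """Content retained by the mother lineage after each division.

    Hypothetical toy recursion: c[t+1] = c[t] * growth * (1 - bud_fraction).
    """
    contents = [c0]
    for _ in range(cycles):
        # synthesis multiplies the content, then the bud takes its share
        contents.append(contents[-1] * growth * (1.0 - bud_fraction))
    return contents

# Assumed values: doubling per cycle with an even split keeps the content constant.
balanced = mother_lineage_content(1.0, growth=2.0, bud_fraction=0.5, cycles=5)
# Assumed values: reduced synthesis plus a larger share passed to the bud.
depleted = mother_lineage_content(1.0, growth=1.5, bud_fraction=0.6, cycles=5)
```

      In this toy recursion the per-cycle factor is growth × (1 − bud_fraction): a value of 1 corresponds to homeostasis, and any value below 1 gives geometric decay, here by a factor of about 0.6 per cycle with the assumed numbers.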

      A lot of the discussion repeats the results; depending on editorial preferences some of this text could probably be pared back to focus on the literature connections and context.

      Data availability -- it looks like much of the data required to reproduce the results is not going to be made available. Images and proteomic data are promised, but the data associated with mitochondrial concentration and other features are not mentioned. For FAIR purposes all the data (including statistics from analysis of the images) should be published.

      l660 -- can an overview of the EM protocol be given, to avoid having to buy the Mayer 2024 article?

      Significance

      This is a powerful and thoughtful study that provides a collection of new mechanistic insights into the link between physical and genetic properties of mitochondria in yeast. Cell biologists, geneticists, and the mitochondrial field will find this of potentially deep interest. Because of the mode and dynamics of inheritance in budding yeast, findings here may not be directly transferrable to other eukaryotes, but these insights are still of interest for researchers outside of yeast for their insight into how this well-studied system manages its mitochondrial populations.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Dengler and colleagues use an AID-based acute depletion of Fzo1 in budding yeast, coupling microfluidics live imaging, single-cell quantification (>30k cells), proteomics, an mtDNA-encoded Atp6 reporter, and simple modeling to argue that fusion loss causes (i) rapid fragmentation and ΔΨm decline, (ii) progressive mtDNA/RC depletion, and (iii) unequal mother-daughter mitochondrial inheritance; together with a failure of compensatory synthesis, these changes drive petite formation. The time-resolved design is valuable, but several readouts are indirect, and some claims (particularly those regarding membrane potential, synthesis "failure," and causality) appear over-interpreted without additional controls.

      Major points

      1. While inducible TIR is used to reduce background, the manuscript should rigorously exclude auxin/TIR off-targets (growth, mitochondrial phenotypes, gene expression). Please include full matched controls: ±auxin, ±TIR, epitope tag alone, and a degron control on an unrelated mitochondrial membrane protein.
      2. The Mitoloc preSu9 vs Cox4 import ratio is only a proxy of mitochondrial membrane potential (ΔΨm) and itself depends on mitochondrial mass, protein expression, matrix ATP, and import saturation. The authors need to calibrate ΔΨm with orthogonal dyes (TMRE/TMRM) and pharmacologic titrations (FCCP/antimycin/oligomycin) to generate a response curve; show that Mitoloc tracks dye-based ΔΨm across the relevant range and corrects for mass/photobleaching. Report single-cell ΔΨm vs mass residuals.
      3. To use Atp6-mNeon as a proxy for mtDNA is an assumption. Interpreting Atp6 intensity as "functional mtDNA" could be confounded by translation, turnover, or assembly. Please (i) report mtDNA copy number time courses (you have qPCR), nucleoid counts (DAPI/PicoGreen or TFAM/Abf2 tagging), and (ii) assess translation (e.g., 35S-labeling or puromycin proxies) and turnover (proteasome/AAA protease inhibition, mitophagy mutants -some data are alluded to- plus mRNA levels for mtDNA-encoded genes). This will support the "reduced synthesis" versus "increased degradation" conclusion.
      4. The promoter-NeonGreen reporters argue against transcriptional down-regulation of nuclear OXPHOS. Please add mRNA (RT-qPCR/RNA-seq) for representative genes and a pulse-chase or degradation-pathway dependency (e.g., proteasome/mitophagy/autophagy mutants) to firmly assign active degradation. The authors need to normalize proteomics to mitochondrial mass (e.g., citrate synthase/porin) to separate organelle abundance from protein turnover.
      5. Using preSu9-mCardinal intensity as "mitochondrial concentration" is sensitive to expression, import competence, and morphology/segmentation. The authors should provide validation that this metric tracks 3D volume across fragmentation states (e.g., correlation with mito-GFP volumetrics; detergent-free CS activity; TOMM20/Por1 immunoblot per cell).
      6. The unequal mother-daughter distribution is compelling, but causality remains inferred. Test whether modulating inheritance machinery (actin cables/Myo2, Num1, Mmr1) or altering fission (Dnm1 inhibition) modifies segregation defects and rescues mtDNA/Atp6 decline. Complementation with Fzo1 re-expression at defined times would help order the phenotype cascade.
      7. The model is useful but should include parameter sensitivity (segregation variance, synthesis slopes, initial nucleoid number) and prospective validation (e.g., predict rescue upon partial restoration of synthesis or inheritance, then test experimentally).

      Significance

      The dataset is rich and the time-resolved approach strong, but key conclusions rely on indirect proxies and need orthogonal validation and at least one causal rescue experiment to avoid over-interpretation.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript by Dengler et al examines the mechanisms underlying the mtDNA depletion observed in cells where mitochondrial fusion is disrupted by depletion of the fusion factor Fzo1. In Saccharomyces cerevisiae, the authors deplete Fzo1 and use live-cell imaging of thousands of cells to follow the effects and their dynamics after Fzo1 depletion. They find that Fzo1-depleted cells show very rapid mitochondrial fragmentation (within 1h of Fzo1 depletion), and also an immediate drop in mitochondrial membrane potential (MMP). MtDNA is lost by 15h, and along with it the expression of mitochondrially-encoded proteins. Nuclear-encoded mitochondrial proteins are also decreased though somewhat later, and the authors find that this is largely due to their degradation (probably a consequence of lack of mitochondrial import into low-MMP cells). Most importantly, the study identifies two separate mechanisms that together contribute to the loss of mt-encoded proteins in Fzo1-depleted cells: unequal distribution of mitochondria during cell division and the reduction of a fusion-dependent compensatory synthesis of mt-encoded proteins. Unexpectedly, Fzo1-depleted cells end up passing an increased (rather than decreased) amount of mitochondria and mitochondria-encoded proteins to their daughters. Over several generations, and combined with the loss of the compensatory synthesis of more mt-encoded proteins, this leads to the progressive loss of mtDNA and mtDNA-encoded proteins in the population.

      The study is very thorough and the experiments contain the appropriate controls. The conclusions are convincing and largely supported by the experimental data that has been appropriately replicated. The data presentation is generally clear although the text could benefit from some streamlining.

      However, addressing the following major comments is required:

      1. Statistical comparisons are missing throughout the manuscript (with the exception of Fig. 2c). Appropriate statistical tests, along with p-values, should be used and reported where different groups are compared, for example (but not limited to) Fig. 3d and most panels of Fig. 4.
      2. I do not agree with the use of Atp6 protein as a direct read-out of mtDNA content. While Atp6 protein levels will decrease with decreasing mtDNA content, the inverse is not necessarily true: decreased Atp6 protein levels do not necessarily indicate decreased mtDNA levels, because they could alternatively or additionally be caused by decreased transcription and/or translation. Therefore, please do not equate Atp6 protein levels to mtDNA levels, and instead rephrase the text referencing the Atp6 experiments in the Results and Discussion sections to measure "mtDNA expression" or "mt-encoded protein" or similar. For example, on p. 14 line 431 should read "mtDNA expression" rather than "decreased synthesis of mtDNA", and line 440 on the same page "mean mtDNA levels" should be "mtDNA expression" or similar.
      3. In Fig. 3, the authors use the fluorescence intensity of a mitochondrially-targeted mCardinal as a read-out of mitochondrial mass. Please provide evidence that this is not affected by MMP, either with relevant references or by control experiments (e.g. comparing it to N-acridine orange or other MMP-independent dyes or methods).
      4. In Fig. 2e-f, the authors use a promoter reporter with Neongreen to answer whether the reduced levels of the nuclear-encoded mitochondrial proteins Mrps5 and Qcr7 are due to decreased expression or to protein degradation, and find no evidence of degradation of the Neongreen reporter protein. However, subcellular localization might affect the availability of the protein to proteases. Although not absolutely required, it would be relevant to know if the Neongreen fusion protein is found in the same subcellular compartment as Mrps5 and Qcr7 at 0h and 9h after Fzo1 depletion.
      5. Fzo1 depletion leads to a very rapid drop in MMP during the first hour of depletion. In the Discussion, can the authors speculate on the possible mechanism of this rapid MMP drop that occurs well before mtDNA or mt-encoded proteins are decreased in level?
      6. In Fig. 2a, the mtDNA copy number of Fzo1-depleted cells is ca 1.3-fold of the control cells at the 0h timepoint. Why might this be? Is it an impact of one of the inducers? If so, we might be looking at the combination of two different processes when measuring copy number: one that is an induction caused by the inducer(s), and the other a consequence of Fzo1 depletion itself.

      Minor comments:

      • p. 3, line 71: "ten thousands of dividing cells.." should be "tens of thousands of dividing cells".
      • p.4, line 116: please be even more clear with what the "depleted" cells and controls are treated with: are depleted cells treated with both inducers, and controls with neither?
      • p.5, lines 147-148: the authors write "the rate with which the abundance of Cox2 and Var1 proteins decreases was similar to the rate of mtDNA loss" though the actual rate is not shown. Please calculate and show rates for these processes side by side to make comparison possible, or alternatively rephrase the statement.
      • Fig. 2d: changing the y-axis numbering to match those in panels a and b would facilitate comparisons.
      • Fig. 2e: it is recommended to label the western blot panels to indicate what protein is being imaged in each (Neongreen, Mrps5, Qcr7).
      • p.9, line 262: I suggest referencing Fig. 4e at the end of the first sentence for clarity.
      • In the sections related to Fig. 3a and Fig. 5a as well as the connected supplemental data, the authors discuss both the median and the mean of mitochondrial mass and Atp6 protein, respectively. For purposes of clarity, I suggest decreasing the focus on the mean (that is provided only in the supplemental data) and focusing the text mainly on the median. The two show differing trends and it is very good that both are shown, but the clarity of the text can be improved by focusing more on the median where possible.
      • p. 14, line 435: the statement that mt mass is maintained over the first 9h of depletion is only true for the mean mt mass, not for the median. Please make this clear or rephrase.
      • p.14, line 452: "mitofusions" should be "mitofusins".

      Referees cross-commenting

      I think that the reviews of the other two reviewers are both insightful and constructive. Especially the rescue experiment suggested by Reviewer 2 could provide strong support for the interpretations of the study. Note that all three reviewers ask for validation of the use of Atp6p as a read-out of mtDNA function, and that all agree the data is powerful and the study of value to the field.

      Significance

      The fact that disruption of mt fusion leads to mtDNA loss has been known for some time, but the mechanism behind this phenomenon has remained unknown to date. This thorough and precise study by Dengler et al uses state-of-the-art single-cell analysis to dissect the mechanisms underlying the mtDNA loss following the disruption of mt fusion, and convincingly reveals that it is caused by two different mechanisms: i) the unequal inheritance of mitochondria between mother and bud, and ii) the loss of a compensatory mechanism that normally maintains homeostatic mt protein levels. In the process, the authors shed light on the dynamics of the events following Fzo1 depletion, revealing dramatically fast mt fragmentation and a loss of MMP, which in turn can be expected to act as a stress signal and influence a number of cellular processes.

      The findings of the study can have relevance for human conditions involving disrupted mitochondrial dynamics, caused for example by mutations in mitofusins. The study will be of interest to researchers in mitochondrial biology ranging from dynamics and mtDNA maintenance to mitochondrial medicine.

      The field of expertise of this reviewer: mtDNA maintenance. I am not able to properly evaluate the modelling in Fig. 7.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to the reviewers

      We thank the reviewers for recognizing the importance of our study, and how it “addresses a long-standing question in the heterogeneity of cellular responses to stressors”, “makes a conceptual advance by identifying transcription factors as the limiting determinant of IFN-β induction in KSHV-infected cells”, and “serves as a crucial starting point for understanding cellular heterogeneity”. We agree that our findings appeal to a broad audience interested in virology, immunology, cell biology, and gene transcription.

      We also thank the reviewers for their insightful suggestions that will greatly strengthen our study. Below we detail how we plan to address their comments experimentally and how we have already edited the text to respond to them.

      Referee #1

      One experiment that may provide some insight into the selective RelA activation is to quantify viral genomes within the high and low IFN producing cells. Perhaps, the genome as a PAMP, is more abundant in the inducing cells.

      We have added a note in the Discussion section (line 417) that we have evidence that the cGAS PAMP in our system is mitochondrial DNA, not viral DNA. Moreover, our results suggest that the variation in PAMP levels is not the source of heterogeneity, as this would cause heterogeneous activation of the cGAS-STING-TBK1-IRF3 axis. Instead, we have discovered that TBK1 and IRF3 are activated even in cells without interferon-β induction.

      Referee #2

      1) While the study presents intriguing evidence for AP-1 involvement in regulating IFN-β responses, the reliance on total c-Jun levels as a readout is limiting. Because c-Jun activity depends on phosphorylation and promoter binding, additional experiments (i.e., phospho-c-Jun analysis or ChIP at the IFNB1 promoter) would strengthen the link between AP-1 activity and the observed reporter outcomes.

      We agree that a stronger link between AP-1 activity and IFN induction would improve our study, so we have cloned interferon-β reporter constructs that contain mutations in the AP-1 binding sites. We plan to use this reporter, as well as IFN-β reporter constructs that contain mutations in either the AP-1, IRF3, or NF-κB binding sites, to mechanistically test the connection between AP-1 and activation of the IFN promoter. As a control, we will test that the mutations block reporter induction after stimulation with a well characterized agonist of the IFN induction pathway such as poly(I:C). We have previously investigated c-Jun and ATF2 phosphorylation during KSHV reactivation and caspase inhibition. Surprisingly, in preliminary experiments we did not detect phosphorylation of either AP-1 subunit. We will confirm this result and add these data to the manuscript.

      2) The data presented demonstrating that Serine 386 phosphorylation does not distinguish first responder cells is strong. Including complementary data on Ser396 phosphorylation would strengthen the conclusion, as this well-established activation marker is readily detectable with available reagents and would help confirm that the potentiation of IRF3 activity is not the driver of the observed heterogeneity.

      We will complement the Ser386 results with Ser396 staining.

      3) Consider updating the title to more directly reflect the findings (e.g. "Interferon-β induction heterogeneity during KSHV infection is correlated to expression of ATF2 and RelA")

      We have updated the title to “Interferon-β induction heterogeneity during KSHV infection is correlated to levels and activation of the transcription factors ATF2 and RelA, and not IRF3”

      4) To ease the interpretation of data, indicate what the black and white circles indicate in the figure legends.

      We have updated the figures to be more intuitive, using + and -.

      5) IE ORF50 is used to show no differences between first responders and non-responders, but showing early and late genes across tdTomato positive and negative cells would rule out potential differences in progression through reactivation.

      We added a clarification in the Results section (line 195), explaining that we have examined the progression of viral reactivation through single-cell transcriptomics in our previous publication, and that the results indicate that viral gene expression plays a small role in interferon-β heterogeneity. We favor the scRNA-seq dataset for this conclusion, because the tdTomato negative cell population represents a mix of non-reactivating cells, which would not be expected to make IFN, and reactivating cells that fail to induce IFN expression.

      6) The data in Figure 5D (quantified in E and F) show a compelling trend. This could be further clarified by plotting a trend line that connects the results of independent experiments, rather than only showing individual data points. Such visualization would make the consistency of the observed trend across experiments more apparent.

      We have added lines in the graphs in Figure 5 to ease visualization.

      Referee #3

      A major worry comes from using lentiviral transduction to insert the reporter promoter into cells without selecting for clones. Lentiviral transduction introduces heterogeneity due to random insertion of their vector. This results in different copy numbers for the reporter construct, leading to heterogeneity in the reporter expression. Additionally, the expression of foreign proteins, particularly in immune cells, can be perceived as danger signals (10.1007/s12015-016-9670-8) and occasionally trigger p65 activation. To control for this, the authors could validate their reporter results by including a non-IFNb promoter (e.g., constitutive) expressing tdTomato and verifying that these cell populations do not also express endogenous IFNb mRNAs.

      We did not select clonal cell lines because different cells may have different reactivation propensity. Moreover, we did not want the tdTomato signal to reflect specific regulation of a single genomic region. We have now added an explanation as to why we did not clonally select the cell lines in the Results section (line 157). Our control conditions that do not result in IFN-β induction show that lentiviral insertion is not sufficient to cause IFN induction, as we did not detect IFN-β mRNA in the untreated reporter cells (first bar in Figure 1C). We also clarified in the Results section (line 184) that the selective enrichment of both IFN-β and tdTomato mRNA in the sorted tdTomato+ cells demonstrates that tdTomato is a faithful reporter for rare IFN-β expression, regardless of heterogeneous lentiviral transduction in the population. To further verify that lentiviral transduction does not play a role in introducing heterogeneity in induction of our tdTomato reporter and of IFN-β, we will measure IFN-β levels in BC-3 cells constitutively expressing tdTomato, which we have already created. We may also sort BC-3 cells constitutively expressing tdTomato and check that the tdTomato signal is not predictive of IFN expression in these cells. However, the expectation is that all or most cells will be tdTomato positive, which may make sorting for tdTomato negative cells impossible.

      Regarding AP1 and NF-kB activation, the authors could investigate downstream genes such as GADD45B, HSPA1A, and ATF-3 (for AP1), and IL-6, TNFAIP3 (A20) (for both AP1 and NFkB). It would be interesting to determine if these genes are exclusively expressed in tdTomato-expressing cells.

      We will quantify the mRNA levels of these genes by performing qPCR on our cDNA from sort experiments. So far, we have detected IL-6 induction but no enrichment of this transcript in the sorted tdTomato+ samples.

      While the authors observed no direct correlation between c-Jun alone and IFN-b production, it is conceivable that TPA-induced c-Jun primes the cells that become fully transcriptionally active upon a stimulus like viral reactivation. I propose that the authors attempt to inhibit c-Jun activation during KSHV reactivation (TPA + caspase inhibitor) using inhibitors like SP600125 and subsequently assess whether this blockade reduces the proportion of IFNb+ cells.

      We have tried using the suggested inhibitor (SP600125), but found that it inhibits KSHV reactivation, making any result on IFN levels difficult to interpret. Currently, we are testing a dual AP-1 and NF-κB inhibitor (SP100030) and may add these data to the results if we do not encounter similar issues.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      Kaku and Gaglia's study provides one step further in the ongoing debate surrounding viral versus innate immune heterogeneity. They addressed this question by creating a reporter B cell line (BC-3) that expresses tdTomato under the IFNb promoter. This particular cell line is known to be latently infected with KSHV, and lytic infection can be induced by treatment with the protein kinase C activator TPA.

      Through FACS sorting upon KSHV lytic infection, the authors observed a correlation between the promoter reporter activation and endogenous IFNb and IFNl mRNA levels. However, this correlation did not extend to viral replication, indicating that only a fraction of virus-infected cells produce IFN. They identified these cells as "first responders" upon viral replication by treating them with anti-IFNAR, confirming that IFN production is triggered by the cGAS-STING pathway sensing lytic virus infection.

      Surprisingly, p-IRF3 activation was not limited to IFN-producing cells, suggesting the involvement of other key transcription factors. Indeed, they found a correlation between NF-kB and AP1 activation and IFN production. The study concludes that the combined action of NF-kB, AP1, and IRF3 is crucial for robust IFN production.

      Major comment

      The author effectively dissects the necessary components for IFNb activation, despite acknowledging the limitations of their findings. All my potential anecdotal queries, such as the role of other viruses or agonists and the treatment of cells with NF-kB inhibitors, are thoroughly addressed in their discussion.

      However, a major worry comes from using lentiviral transduction to insert the reporter promoter into cells without selecting for clones. Lentiviral transduction introduces heterogeneity due to random insertion of their vector. This results in different copy numbers for the reporter construct, leading to heterogeneity in the reporter expression. Additionally, the expression of foreign proteins, particularly in immune cells, can be perceived as danger signals (10.1007/s12015-016-9670-8) and occasionally trigger p65 activation. To control for this, the authors could validate their reporter results by including a non-IFNb promoter (e.g., constitutive) expressing tdTomato and verifying that these cell populations do not also express endogenous IFNb mRNAs.

      Minor comments

      Regarding AP1 and NF-kB activation, the authors could investigate downstream genes such as GADD45B, HSPA1A, and ATF-3 (for AP1), and IL-6, TNFAIP3 (A20) (for both AP1 and NFkB). It would be interesting to determine if these genes are exclusively expressed in tdTomato-expressing cells.

      While the authors observed no direct correlation between c-Jun alone and IFN-b production, it is conceivable that TPA-induced c-Jun primes the cells that become fully transcriptionally active upon a stimulus like viral reactivation. I propose that the authors attempt to inhibit c-Jun activation during KSHV reactivation (TPA + caspase inhibitor) using inhibitors like SP600125 and subsequently assess whether this blockade reduces the proportion of IFNb+ cells.

      Significance

      The study presents a valuable dataset and serves as a crucial starting point for understanding cellular heterogeneity, particularly regarding the known concept of IRF+NFkB in IFNb production. While this mechanism isn't novel (10.1074/jbc.273.5.2714), the authors demonstrated the difference in activation at the cellular level. This finding can be the basis of future research utilizing more physiologically relevant models, such as primary cells or tissues, to identify factors contributing to varying cellular responses.

      However, the authors acknowledge that these findings should be interpreted with caution and require further validation through additional studies across different models and viral infections. This research will be particularly relevant to those in basic research seeking to deepen their understanding of the dynamic differences in innate immune responses and viral infections.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the manuscript "Interferon-β induction heterogeneity during KSHV infection is correlated to expression and activation of enhanceosome transcription factors other than IRF3", Kaku and Gaglia address a long-standing question in the initiation of host antiviral responses: what drives the heterogeneity in the initiation of IFN responses within a presumably homogenous population of cells. In this study, the authors focus on host factors that contribute to the heterogeneity in IFN induction. They use a KSHV lytic reactivation model (TPA + Caspase inhibitor treated BC-3 cells) and FACS-based reporter assays to enhance the resolution in the detection of molecular drivers of "first responder" cells that make IFN. They find that IRF3 activation alone does not predict IFN expression; rather, the expression of ATF2 and RelA is predictive of IFN-β induction. The authors carefully control for off-target effects of TPA treatment in BJAB cells and paracrine signaling through the inclusion of IFN-neutralizing antibodies. Overall, the manuscript is well-written and easy to follow, and the data compellingly support their conclusion that cell-specific transcription factor activity limits IFN production to cell subsets. Demonstrating coordinated occupancy or functional interplay of these factors would increase confidence in the proposed model and broaden the impact for readers interested in virology, immunology, and transcriptional regulation.

      Comments:

      1. While the study presents intriguing evidence for AP-1 involvement in regulating IFN-β responses, the reliance on total c-Jun levels as a readout is limiting. Because c-Jun activity depends on phosphorylation and promoter binding, additional experiments (i.e., phospho-c-Jun analysis or ChIP at the IFNB1 promoter) would strengthen the link between AP-1 activity and the observed reporter outcomes.
      2. The data presented demonstrating that Serine 386 phosphorylation does not distinguish first responder cells is strong. Including complementary data on Ser396 phosphorylation would strengthen the conclusion, as this well-established activation marker is readily detectable with available reagents and would help confirm that the potentiation of IRF3 activity is not the driver of the observed heterogeneity.

      Minor Comments:

      1. Consider updating the title to more directly reflect the findings (e.g. "Interferon-β induction heterogeneity during KSHV infection is correlated to expression of ATF2 and RelA").
      2. To ease the interpretation of data, indicate what the black and white circles indicate in the figure legends.
      3. The authors predominantly rely on the IE gene ORF50 as a marker of KSHV reactivation and show no differences in expression between first responder cells and those that don't. Measurement of early and late genes across TdTomato-positive and negative cells would rule out potential differences in progression through reactivation that might influence IFN production.
      4. The data in Figure 5D (quantified in E and F) show a compelling trend. This could be further clarified by plotting a trend line that connects the results of independent experiments, rather than only showing individual data points. Such visualization would make the consistency of the observed trend across experiments more apparent.

      Significance

      This study addresses an important and long-standing question in the heterogeneity of cellular responses to stressors, such as viruses. The study is well designed and presented, making it appealing to a broad audience interested in virology, immunology, cell biology, and gene transcription.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript discusses why only a small proportion of KSHV-infected cells produce high levels of IFN during lytic reactivation in the presence of caspase inhibition, highlighting cellular heterogeneity as a key factor in innate immune regulation in KSHV infection. Using a tdTomato reporter fused to the IFN promoter, the authors generated a stable cell line in BC3 cells (a primary effusion lymphoma line). The authors observed that while IRF3 and TBK1 were activated in nearly all infected cells, only those with both high baseline levels of the AP-1 component ATF2 and activated NF-κB (phosphorylated RelA) produced robust IFN. These findings suggest that AP-1 and NF-κB, rather than IRF3, are the limiting factors for IFN induction in individual cells.

      Overall, the findings are interesting and important. While there remain many unknowns, such as why RelA is activated in only a subset of cells, this manuscript takes us one step closer to determining how IFN is ultimately induced in KSHV infection.

      One experiment that may provide some insight into the selective RelA activation is to quantify viral genomes within the high and low IFN-producing cells. Perhaps the genome, as a PAMP, is more abundant in the inducing cells.

      Significance

      The study makes a conceptual advance by identifying transcription factors as the limiting determinant of IFN-β induction in KSHV-infected cells, highlighting how innate immune responses are regulated during herpesvirus infection and how the regulation influences viral persistence and immune evasion.

      The above findings will be of importance to researchers studying herpesvirus biology, innate immunity (IFN signaling), and host-pathogen interaction.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the editor and the reviewers for their positive and constructive comments. Below are our point-by-point responses.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Metabolic dysfunction-associated steatotic liver disease (MASLD) ranges from simple steatosis, steatohepatitis, fibrosis/cirrhosis, and hepatocellular carcinoma. In the current study, the authors aimed to determine the early molecular signatures differentiating patients with MASLD associated fibrosis from those patients with early MASLD but no symptoms. The authors recruited 109 obese individuals before bariatric surgery. They separated the cohorts as no MASLD (without histological abnormalities) and MASLD. The liver samples were then subjected to transcriptomic and metabolomic analysis. The serum samples were subjected to metabolomic analysis. The authors identified dysregulated lipid metabolism, including glyceride lipids, in the liver samples of MASLD patients compared to the no MASLD ones. Circulating metabolomic changes in lipid profiles slightly correlated with MASLD, possibly due to the no MASLD samples derived from obese patients. Several genes involved in lipid droplet formation were also found elevated in MASLD patients. Besides, elevated levels of amino acids, which are possibly related to collagen synthesis, were observed in MASLD patients. Several antioxidant metabolites were increased in MASLD patients. Furthermore, dysregulated genes involved in mitochondrial function and autophagy were identified in MASLD patients, likely linking oxidative stress to MASLD progression. The authors then determined the representative gene signatures in the development of fibrosis by comparing this cohort with the other two published cohorts. Top enriched pathways in fibrotic patients included GTPase signaling and innate immune responses, suggesting the involvement of GTPase in MASLD progression to fibrosis. The authors then challenged human patient derived 3D spheroid system with a dual PPARa/d agonist and found that this treatment restored the expression levels of GTPase-related genes in MASLD 3D spheroids. 
      In conclusion, the authors suggested the involvement of upregulated GTPase-related genes during fibrosis initiation. Overall, the current study might provide some resources regarding transcriptomic and metabolomic data derived from obese patients with and without MASLD. However, several concerns should be carefully addressed.

      1. A recent study, via proteomic and transcriptomic analysis, revealed that four proteins (ADAMTSL2, AKR1B10, CFHR4 and TREM2) could be used to identify MASLD patients at risk of steatohepatitis (PMID: 37037945). It is not clear why the authors did not include this study in their comparison.

      Thank you for the suggestion. The RNA sequencing dataset (GSE135251) from study PMID 37037945 is the same dataset we used as an external benchmark in our study, referred to as the EU cohort on page 4 in the manuscript. In addition to PMID 37037945, we have cited the original transcriptomic study (PMID 33268509) for the EU cohort. In the revised manuscript, we discussed this proteome-transcriptome paper in the Discussion section and highlighted the potential of AKR1B10 as a biomarker in early MASLD.

      The authors recruited 109 patients but only performed transcriptomic and metabolomic analysis in 94 liver samples. Why did the authors exclude other samples?

      We thank the reviewer for their question and we understand the confusion. The discrepancy in sample size between liver and plasma cohorts is due to the fact that, for certain cases, we were unable to get sufficient liver tissue slices (“Exclusion criteria included: age

      The authors mentioned clinical data in Table 1 but did not present the table in this manuscript.

      Table 1 (key patient characteristics) was included in the main document after the Methods section, and Table S1 (additional patient characteristics) was provided as a supplemental file in our original submission.

      The generated metabolomic data could be a very useful resource to the MASLD community. However, it is very confusing how the data was generated in those supplemental tables. There is no clear labeling of human clinical information in those tables. Also, what do those values mean in columns 47-154? This reviewer assumed that they are the raw data of metabolomic analysis in plasma samples. However, without clear clinical information in these patients, it is impossible that any scientist can use the data to reproduce the authors' findings.

      We appreciate this suggestion. To ensure accessibility of the data resources, we created a GitHub repository for both data and code, available at https://github.com/SLINGhub/MASLD_dual_omics____.

      The GitHub repository includes clinical data for all 109 participants with patient characteristics and histological gradings, as well as processed omics data (log₂-transformed). We have generated artificial IDs for each patient so that we can include all the requested data in an organized manner. A code template is also provided to replicate the main statistical results from this study. In addition, for readers interested in conducting analyses from the raw data, we have deposited the raw sequencing files and mass spectrometry data in GEO and Zenodo, as detailed in the ‘Data Availability’ section.

      In Fig. 5B, the authors excluded the steatosis and fibrosis overlapped genes. Steatosis and fibrosis specific genes could simply reflect the outcomes rather than causes. In this case, the obtained results might not identify the gene signatures related to fibrosis initiation.

      We appreciate this comment, but we do not fully understand the reviewer’s point since we did not exclude overlapped genes in our analysis, and it was unclear to us whether excluding overlapping genes has anything to do with causality of both processes.

      In Figure 5B, we identified the gene signatures associated with steatosis and fibrosis after adjusting for potential confounders such as age, sex, BMI and diabetes status. Our results showed that these signatures were relatively independent, sharing a limited number of genes. We then examined genes uniquely associated with each process by additional adjustment (e.g., adjusting steatosis models for fibrosis grades). To us this was not an unreasonable approach, given that steatosis precedes fibrosis in most cases, especially in morbid obesity.

      We nevertheless agree with the reviewer’s point that the gene expression changes we identified represent statistical associations without warranting causality. To specifically address fibrosis initiation mechanisms within the limitation of the current study design, we performed a separate comparative analysis between patients with fibrosis+steatosis versus those with steatosis alone (Table S11), which still identified GTPase regulation as a potential key mechanism in fibrosis initiation (Figure 6B).

      In Fig. 6D, the authors used 3D liver spheroid to validate their findings. However, there is no images showing the 3D liver spheroid formation before and after PPARa/d agonist treatment. It is not clear whether the 3D liver spheroid was successfully established.

      There is extensive literature (>40 papers) from the Lauschke lab on 3D liver spheroid culture, including but not limited to PMIDs 27143246, 28264975, 32775153, 37870288 and 39605182. Images of the spheroids can be seen in Figure 1c of Adv. Sci. 2024, 2407572, and elafibranor treatment did not affect the morphology of the spheroids.

      The authors suggested that targeting LX-2 cells with Rac1 and Cdc42 inhibitors could reduce collagen production. Did the authors observe these two genes upregulated in mRNA and protein expression levels in their cohort when compared MASLD patients with and without fibrosis? Did the authors observe that the expression levels of Rac1 and Cdc42 are correlated with fibrosis progression in MASLD patients?

      Regarding comments 7 and 8, we targeted Rac1 and Cdc42 in the LX-2 cell experiment as they are common and major GTPases. Protein-level data are not available in our dataset, but we examined their transcript-level expression. RAC1 and CDC42 expression levels were positively associated with fibrosis progression, with coefficients of 0.362 (q = 0.027) and 0.342 (q = 0.031), respectively. These results are presented in Table S5, and the corresponding boxplots are shown here.

      Figure R1. RAC1 and CDC42 expression levels in individuals with different fibrosis levels.

      Other studies have revealed several metabolite changes related to MASLD progression (PMID: 35434590, PMID: 22364559). However, the authors did not discuss the discrepancies between their findings with the previous studies.

      Thank you for the suggestion. We have incorporated a discussion of the two studies into the Discussion section, highlighting the consistencies and discrepancies between our plasma metabolomic results and previous findings. The main differences may stem from variations in MASLD spectrum and the degree of obesity in the cohorts.

      Reviewer #1 (Significance (Required)):

      Overall, the current study might provide some new resources regarding transcriptomic and metabolomic data derived from obese patients with and without MASLD. The MASLD research community will be interested in the resource data.

      We thank this reviewer for the positive and constructive evaluation of our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this paper, Kaldis and collaborators investigate the molecular heterogeneity of a cohort of 109 morbidly obese patients, focusing on liver transcriptomics and metabolomics analysis from liver and serum. The main finding (i.e. upregulation of GTPase-coding genes) was validated in spheroids and a human HSC cell line. As these proteins are involved in critical cellular functions related to metabolism and cytoskeleton dynamics, these findings shed light on their involvement in human liver pathology, which so far has been poorly (or even not) documented. This is an interesting addition to the current knowledge about chronic liver pathology. However, the manuscript suffers from the lack of a clear-cut definition of patient subgroups and the seemingly indistinct use of generic (MASLD, NAS score) and more granular terms (MASH, fibrosis) across the various analyses they performed.

      We thank this reviewer for highlighting the novelty of our manuscript. We agree that mixing generic and granular terms can be confusing and we tried to use terms consistently throughout, which has been further improved in the revised version.

      Figure 1 and Table 1 provide comprehensive information regarding histological phenotypes, NAS scores, and patient characteristics. From Figure 2 onward, we specifically focused on steatosis and fibrosis as distinct histological features, identifying molecular signatures associated with each process.

      The term ‘MASH’ was used only when referring to the ex vivo 3D spheroids derived from histologically confirmed MASH patients for validation purposes. As our primary cohort represents early disease stages, we did not characterize molecular features of MASH in that data set.

      In this cohort, the term 'NAS' was mentioned only in Section 1 to characterize the disease spectrum. Additionally, in Figures 3A and 6A, we illustrated the association between gene expression levels and NAS in two external cohorts. This was due to the absence of steatosis grades in the two datasets. NAS is an additive measure of multiple scores (steatosis, inflammation and ballooning), but does not account for fibrosis grades.

      Our study focuses on the molecular features of steatosis grades and fibrosis grades as the main histological processes, with all terminology aligned with this stated objective. This allows us to map the transcriptome and metabolome to pathologist-defined steatosis/fibrosis severity (i.e., 0,1,2,3) and identify genes/metabolites that are correlated with increasing steatosis/fibrosis score.

      Major comments:

      • Are the key conclusions convincing?

      The conclusions are generally consistent with findings from numerous previous studies, as many of the genes identified and their associations with disease states have been previously reported. However, I found it difficult to discern which specific disease stages the authors are referring to throughout the manuscript. Terms such as MASLD (Fig. 1F), steatosis (Fig. 4A), MASH, fibrosis (Fig. 6), and the composite NAS score (Fig. 1G) are used interchangeably, without clearly explaining whether or how the patient cohort was stratified to distinguish between isolated steatosis, MASH, and MASH with or without fibrosis. It is also unclear whether subgroups were propensity score-matched.

      As explained in our previous point, we believe that we did not carelessly use the terms interchangeably, but rather used them as they were available or pertinent to the comparisons in discussion. We have provided a comprehensive cohort description in the first section (Table 1, including all histological features and NAS scores), then focused specifically on steatosis and fibrosis in subsequent analyses. We identified distinct molecular processes underlying these two histological features and validated key fibrosis-related pathways.

      Regarding the comment on ‘propensity score-matched subgroups’, we would like to clarify that the only “sub”-group analysis performed in this paper is the transition from steatosis to steatosis with fibrosis. We have consistently used linear regression as the association analysis framework, without binarization of outcomes. We recall that this is a cross-sectional study with a challenging recruitment situation from a bariatric surgery clinic that naturally represents the spectrum of MASLD in obesity. We acknowledge that the sampling can always be biased in such a study. However, given the invasiveness of liver resection, the study is also limited by the reality that not all patients would agree to the study, nor is it feasible to form a perfect subgroup meeting a 1:1 ratio as in large-scale epidemiology studies based on plasma samples.

      In a related point, the authors mention that 76% of patients are non-fibrotic, introducing a marked imbalance between fibrotic (n=26) and non-fibrotic (n=83) samples. Given this disparity and potential inter-individual variability, it would be helpful to include observed fold changes or effect sizes to give readers a sense of the magnitude of the biological dysregulations being reported.

      As explained in our previous response, our study design examines associations between histological and molecular features rather than using a case-control approach. For effect size quantification, we report standardized linear regression coefficients, i.e. the change in gene expression Z-score per one-point increase in steatosis or fibrosis grade. We also provided fold changes in our comparative analysis of steatosis+fibrosis versus fibrosis-free steatosis. These effect sizes were fully documented in the Supplemental Tables.
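      The effect size described above can be illustrated with a minimal sketch, assuming ordinary least squares on a Z-scored expression vector with the histological grade coded 0 to 3 and optional covariates added as extra columns. The function names, the toy grades, ages and expression values are hypothetical and are not taken from the authors' repository or data.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def grade_coefficient(expression, grade, covariates=()):
    """Change in expression Z-score per one-point increase in grade.

    `covariates` is a sequence of covariate columns (hypothetical example: age).
    """
    y = zscore(expression)
    # Design matrix: intercept, grade, then one column per covariate.
    design = [
        [1.0, float(g)] + [float(cov[i]) for cov in covariates]
        for i, g in enumerate(grade)
    ]
    p = len(design[0])
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    return solve(xtx, xty)[1]


# Toy data (made up): expression rises by 2 units per grade and 0.1 units per year of age.
grade = [0, 0, 1, 1, 2, 2, 3, 3]
age = [30, 50, 40, 60, 35, 55, 45, 65]
expression = [1 + 2 * g + 0.1 * a for g, a in zip(grade, age)]
coef = grade_coefficient(expression, grade, covariates=[age])
```

      Because the outcome is Z-scored while the grade is left on its original 0 to 3 scale, the coefficient reads directly as standard deviations of expression per one-point increase in grade, which is what makes coefficients comparable across genes.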

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      • The authors seem pretty enthusiastic about elafibranor, despite a failed phase 3 clinical trial. I would qualify elafibranor as a useful tool in preclinical models.

      We agree with the reviewer and indeed used elafibranor as a research tool for PPARa/d modulation rather than a clinically promising prospect. Discussion regarding elafibranor has been updated.

      • The authors should make clear the pronounced sex bias in their study, which includes mostly women (and btw refer to sex and not gender in the manuscript).

      Thank you for this important point. We added "Notably, the cohort was predominantly female (76.1%)" to the 'Overview of the study' section in the manuscript. We also replaced all 'gender' with 'sex' throughout the manuscript. In this cohort, individuals with previous gender reassignment were excluded (see Materials and Methods).

      • The "MASH" status of the spheroid model is overstated. As described in the text it is much closer to a lipotoxicity model (and even glucotoxicity as Glc concentration is 2g/L).

      The 3D cultures were established from cells isolated from patients with histologically confirmed MASH. Besides steatosis, we observe increased secretion of pro-inflammatory cytokines, activation of hepatic stellate cells and increased deposition of collagen, thus phenocopying the critical disease hallmarks. Additionally, unbiased omics profiling (transcriptomics, proteomics and lipidomics) reveals significant increases in collagen biosynthesis, inflammatory signaling and cholesterol biosynthesis in MASH patient-derived cultures compared to controls. These differences largely overlapped with the results from analyses of six MASH case-control cohort studies. All of these results have been published previously (PMID 39605182).

      This is confusing with panel D in which the authors establish a relationship between fibrotic patients (F2/F3 vs F0/S0, so I guess "no MASLD" liver?) and this model. Is the relationship maintained for steatotic-only patients?

      In Figure 6D, we compared GTPase-related gene expression between patients with fibrosis grade 2/3 (n = 26) and those without fibrosis and steatosis (n = 24). Principal component regression resulted in a positive correlation (β = 9.97) between log2 fold changes in 3D spheroids and human fibrosis samples, indicating consistent directional changes in both systems.

      To answer the reviewer's question, we compared the expression levels of GTPase-related genes in patients with steatosis but no fibrosis (n = 18) to those in patients without fibrosis and steatosis (n = 24) and observed a negative correlation (β = -10.91). This indicates that GTPase-related gene changes in our 3D spheroids do not align with steatosis-related changes in humans.

      Therefore, under the assumption that fibrosis follows steatosis in the majority of cases of MASLD progression, the result indicates that the alterations in GTPase-related gene expression in the 3D spheroid model specifically reflect fibrosis rather than steatosis.

      Figure R2. Comparison of expression level changes in GTPase-related genes between this human cohort and an independent 3D spheroid system: (A) positive correlation with fibrosis grade 2/3 patients versus controls (left), and (B) negative correlation with steatosis-only patients versus controls (right).
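      The directional comparison summarized in Figure R2 can be illustrated with a minimal sketch. Note the assumptions: the response reports principal component regression, whereas this sketch substitutes a plain Pearson correlation as a simpler stand-in for checking whether log2 fold changes move in the same direction in two systems. The function name `pearson_r` and all numeric values are hypothetical and do not come from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical log2 fold changes for five genes (illustrative values only).
spheroid_lfc = [0.8, 0.5, 1.2, -0.3, 0.6]
fibrosis_lfc = [0.6, 0.4, 0.9, -0.1, 0.5]       # moves with the spheroid changes
steatosis_lfc = [-0.5, -0.2, -0.7, 0.3, -0.4]   # moves against them

r_fibrosis = pearson_r(spheroid_lfc, fibrosis_lfc)
r_steatosis = pearson_r(spheroid_lfc, steatosis_lfc)
print(f"fibrosis vs spheroid:  r = {r_fibrosis:.2f}")
print(f"steatosis vs spheroid: r = {r_steatosis:.2f}")
```

      A positive coefficient indicates that genes tend to change in the same direction in both systems, and a negative one indicates opposing directions. The sign is the quantity of interest here; the magnitude is not comparable to the β values reported above.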

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I am not convinced that HSC and LX2 cells express significant levels of PPARα. However, did the authors check for this parameter in their LX2 cell line and assess whether PPARα/b activation by elafibranor (and/or pemafibrate, as it is PPARα selective) alters GTPase expression? Whether negative or positive, this could give a clue about possible intercellular crosstalk in the spheroid model.

      We thank the reviewer for pointing this out. In response, we analysed the mRNA expression of all PPARs in LX-2 cells with and without Elafibranor treatment (see Figure R3, same as Figure S8G in the Supplemental Material). We confirmed that PPARs are expressed in LX-2 cells at the mRNA level (Figure R3A). Elafibranor does not affect their mRNA levels, which is consistent with previous reports that its primary mechanism is through binding and altering the activity of PPAR proteins, not gene expression (PMID 33326461 and PMID 37627519).

      *Figure R3. Gene signatures in LX-2 cells with and without Elafibranor treatment (n = 3).*

      In addition, we assessed mRNA levels of selected GTPase-related genes in LX-2 cells with and without Elafibranor treatment (Figure R3B). Although statistical power was limited, we observed a consistent trend toward reduced RHOU, DOCK2, and RAC1 expression with Elafibranor. This preliminary signal suggests that Elafibranor may counter the elevated GTPase levels seen in MASH patient spheroids, potentially via crosstalk among hepatic cell types, including HSCs.

      To further investigate intercellular crosstalk in GTPase regulation among hepatic cell types, we evaluated signature GTPase-related genes in LX-2 cells, spheroid co-cultures (hepatocytes, HSCs, Kupffer cells), and hepatocyte monocultures. As shown in Figure R4 (same as Figure S10 in the supplemental material), TGFB1 served as a positive control, exhibiting the most pronounced induction upon TGF-β1 treatment in hepatocytes. Despite varied alterations across the selected GTPase-related genes, TGF-β1 treatment produced a trend toward increased VAV1 and DOCK2 expression in co-culture, hepatocytes, and LX-2 cells, and this was reversed by the TGF-β inhibitor in co-culture and hepatocytes. Other GTPase genes, including RAC1, RAB32, and RHOU, displayed cell type–specific responses to TGF-β1. These observations suggest that the regulation of GTPases is mediated by multiple hepatic cell types, supporting the importance of intercellular crosstalk.

      Figure R4. Expression of GTPase-related genes in spheroid co-culture, hepatocyte monoculture, and LX-2 cells (n = 3). Controls for each gene and experiment were normalized to 1 to enable comparison across treatment groups.
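      The normalization mentioned in the legend of Figure R4 can be sketched as follows. This is a minimal illustration under the assumption that "normalized to 1" means dividing every replicate by the mean of the control replicates for that gene and experiment; the function name `normalize_to_control` and the numbers are hypothetical.

```python
def normalize_to_control(control, treated):
    """Scale replicate values so that the control group has a mean of 1."""
    if not control:
        raise ValueError("control group must not be empty")
    ctrl_mean = sum(control) / len(control)
    if ctrl_mean == 0:
        raise ValueError("control mean is zero; cannot normalize")
    return ([v / ctrl_mean for v in control],
            [v / ctrl_mean for v in treated])

# Hypothetical relative expression values for one gene, n = 3 per group.
control = [2.0, 2.5, 1.5]
treated = [4.0, 5.0, 3.0]

norm_control, norm_treated = normalize_to_control(control, treated)
print(norm_control)   # control replicates, now centred on 1
print(norm_treated)   # treated replicates expressed relative to control
```

      Expressing each group relative to its own control puts genes with very different baseline expression on a shared scale, which is what allows the treatment groups to be compared within one panel.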

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The experiment mentioned above is cheap (cell culture, RT-qPCR) and can be performed within a couple of weeks.

      • Are the data and the methods presented in such a way that they can be reproduced? Yes

      • Are the experiments adequately replicated and statistical analysis adequate? There is no indication of group size, number of replicates for in vitro experiments

      Thank you for this suggestion. We have added the sample sizes to all relevant sections: ‘n = 4’ in the figure legends for 3D spheroid experiments and ‘n = 8–10’ for the LX-2 experiments. This information has also been incorporated into the corresponding experimental descriptions in the Methods section.

      **Referees cross-commenting**

      I believe there is a general consensus on this potentially interesting contribution to the field, with three main points: (1) the need for a careful group-by-group comparison that accounts for potential confounders, (2) a more rigorous exploitation/characterization of the spheroid system, and (3) the need to benchmark the authors' findings against the available literature.

      Thank you for summarizing the main points. Our responses are as follows:

      • We adjusted for key confounders (sex, age, BMI, diabetes) in all statistical analyses to minimize potential bias, mostly using linear regression (rather than group-to-group comparison). In response to Reviewer 3, comment 1, we also conducted additional statistical analyses exploring molecular changes in diabetic vs. non-diabetic individuals.
      • We provided detailed characterization of the spheroid model (response to Reviewer 3, comment 3) and we have done additional experiments in LX-2 cells.
      • We benchmarked our findings using external human cohorts, mouse models, and single-cell spheroid systems:
          • Compared our liver transcriptomics data with two published liver RNA-seq datasets (VA cohort, PMID 31467298; EU cohort, PMID 33268509) as shown in Figure 1G. In Figures 3A and 6A, we also included sidebars indicating gene alterations in these cohorts, showing consistent trends. Moreover, we examined the expression alterations of GTPase-related genes in these datasets in response to Reviewer 3’s comment 2.
          • Assessed genes linked to fibrosis progression in hepatic stellate cells from a murine liver fibrosis model (PMID 34839349), confirming differential expression of GTPases and their regulators during fibrosis initiation (Figure S9A).
          • Examined GTPase-related genes in an independent single-cell human spheroid system (PMID 37962490). This provided cell-type-specific information on GTPase regulation in response to TGF-β (Figure S9C).

      We also expanded the discussion section on both the consistencies and discrepancies between our findings and previously published studies.

      Reviewer #2 (Significance (Required)):

      The authors identified GTPases as players in the progression of MASLD. This is an interesting preliminary report warranting further molecular investigations (in which liver cell types, which GTPase pathway(s) are involved, which functions are controlled through this pathway...)

      • State what audience might be interested in and influenced by the reported findings.

      This paper will have an impact in the hepatology field

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I have expertise in the analysis of "MASLD" human cohorts and in the molecular biology of chronic liver diseases.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Metabolic dysfunction-associated steatotic liver disease (MASLD) describes a spectrum of progressive liver pathologies linked to lifestyle-associated metabolic alterations (such as increased body weight and elevated blood sugar levels), ranging from steatosis through steatohepatitis to fibrosis and finally end-stage complications, such as liver failure and hepatocellular carcinoma. Treatment options for MASLD include diet adjustments, weight loss, and the thyroid hormone receptor-β (THR-β) agonist resmetirom, but remain limited at this stage, motivating further studies to elucidate molecular disease mechanisms to identify novel therapeutic targets. In their present study, the authors aim to identify early molecular changes in MASLD linked to obesity. To this end, they study a cohort of 109 obese individuals with no or early-stage MASLD combining measurements from two anatomic sites: 1. bulk RNA-sequencing and metabolomics of liver biopsies, and 2. metabolomics from patient blood. Their major finding is that GTPase-related genes are transcriptionally altered in livers of individuals with steatosis with fibrosis compared to steatosis without fibrosis.

      Major comments:

      1. Confounders (such as (pre-)diabetes) The patient table shows significant differences in non-MASLD vs. MASLD individuals, with the latter suffering more often from diabetes or hypertriglyceridemia.

      Rather than just stating corrections, subgroup analyses should be performed (accompanied with designated statistical power analyses) to infer the degree to which these conditions contribute to the observations. I.e., major findings stating MASLD-associated changes should hold true in the subgroup of MASLD patients without diabetes/of female sex and so forth (testing for each of the significant differences between groups).

      Our original statistical analysis employed linear regression to examine associations between molecular variables (genes/metabolites) and histological progression (steatosis and fibrosis), with adjustment for potential confounders including diabetic status, age, sex, and BMI. We specifically focused on these two histological features to elucidate the disturbed molecular processes during their progression. Regression coefficients represent the expected change in abundance levels (in units of standard deviation of the corresponding molecule) per one-unit increase in histological grades.
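      The interpretation of the regression coefficient described above can be illustrated with a minimal sketch. It assumes that the molecular variable is converted to a Z-score before fitting and that the histological grade enters the model as a numeric predictor alongside the covariates. The helper names `zscore` and `ols`, the single covariate, and all values are hypothetical; the authors' actual pipeline is not shown in the response.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.
    X is a list of predictor rows; an intercept column is prepended here."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(A[idx][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]  # [intercept, beta_1, ...]

# Hypothetical data: histological grade (0-3), age as one covariate, and a
# noise-free expression value constructed as 1 + 2*grade + 0.05*age.
grade = [0, 0, 1, 1, 2, 2, 3, 3]
age = [30, 50, 40, 60, 35, 55, 45, 65]
expr = [1 + 2 * g + 0.05 * a for g, a in zip(grade, age)]

beta = ols(list(zip(grade, age)), zscore(expr))
# beta[1]: change in expression, in SD units, per one-grade increase,
# holding age fixed.
print(f"standardized coefficient for grade: {beta[1]:.3f}")
```

      Because the outcome is expressed in standard deviations, coefficients for different genes or metabolites are on a common scale and can be compared directly, whatever the raw abundance units are.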

      To address the reviewer's question, we conducted additional subgroup analyses to determine whether our major findings remain consistent in individuals with and without diabetes. We assessed linear associations between gene signatures and histological features separately in non-diabetic (n = 71) and diabetic individuals (n = 23). Statistical power was estimated by comparing the variance explained by the full regression model (y ~ x + a + b + c) against the reduced model (y ~ a + b + c), converting the incremental R² for x into Cohen's f², and applying pwr.f2.test with the corresponding degrees of freedom and sample size at α = 0.05.
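      The effect-size step of this power calculation can be sketched as follows. The sketch assumes the standard definition of Cohen's effect size for a nested-model comparison, f² = (R²_full − R²_reduced) / (1 − R²_full), with numerator degrees of freedom u equal to the number of tested predictors and denominator degrees of freedom v = n − p − 1, where p is the number of predictors in the full model. The R² values below are hypothetical, and the final power computation (performed with `pwr.f2.test` in R according to the response) is not reproduced here.

```python
def cohens_f2(r2_full, r2_reduced):
    """Cohen's f^2 for the predictors added in the full model."""
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    return (r2_full - r2_reduced) / (1.0 - r2_full)

def degrees_of_freedom(n, n_predictors_full, n_tested=1):
    """Numerator (u) and denominator (v) degrees of freedom for the F test."""
    u = n_tested
    v = n - n_predictors_full - 1
    if v <= 0:
        raise ValueError("sample size too small for this model")
    return u, v

# Hypothetical R^2 values for y ~ x + a + b + c versus y ~ a + b + c.
f2 = cohens_f2(r2_full=0.30, r2_reduced=0.20)

# Non-diabetic subgroup size from the response (n = 71); four predictors
# in the full model is an assumption read off the formula y ~ x + a + b + c.
u, v = degrees_of_freedom(n=71, n_predictors_full=4)
print(f"f2 = {f2:.3f}, u = {u}, v = {v}")
```

      In R, these three quantities would then be passed as `pwr.f2.test(u = u, v = v, f2 = f2, sig.level = 0.05)` to obtain the power estimate.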

      For both steatosis and fibrosis, the results in the non-diabetic subgroup (n = 71) showed high consistency with findings in our original analysis (n = 94, adjusted for diabetes), indicating that our originally reported gene signatures, after correction for diabetic status, remain valid in non-diabetic individuals.

      In contrast, for diabetic individuals (n = 23), associations between genes and histological features did not closely replicate our original findings. Notably, we observed larger estimated effects for fibrosis-associated genes in diabetic individuals, suggesting a potential interaction between diabetes and fibrosis progression.

      Figure R5. Subgroup analysis of the association between gene expressions and steatosis grades

      Figure R6. Subgroup analysis of the association between gene expressions and fibrosis grades

      On the comment "degree to which these conditions contribute to the observations," our original analysis adjusted for diabetes status to identify molecular signatures independently associated with fibrosis without confounding by diabetes status. Consequently, the reported gene signatures in the original analysis more closely reflect patterns in the non-diabetic group, as demonstrated in our subgroup analysis plots. We also note that, unfortunately, we did not adjust for the interaction of fibrosis and diabetes in the original analysis.

      Furthermore, our additional analyses revealed a close relationship between diabetes and liver fibrosis. Consistent with Figure 1C, hepatic fibrosis is significantly correlated with insulin resistance parameters in clinical assays, including blood insulin levels and HOMA2-IR. To explore this association further, we compared gene expression profiles between diabetic MASLD patients (n = 21) and non-diabetic MASLD patients (n = 43). Although few genes reached significance after multiple testing correction, 166 genes showed differential expression (p 0.32) between these groups.

      We identified 55 genes as potential "diabetic markers" that both showed differential expression between diabetic and non-diabetic MASLD patients and were significantly associated with steatosis or fibrosis progression. These genes are predominantly downregulated metabolic genes (e.g., BAAT, G6PC1, SULT2A1, MAT1A), suggesting that diabetes may exacerbate metabolic suppression as fibrosis advances. Given the high prevalence of diabetes in the MASLD population, our analysis supports the hypothesis that diabetes worsens MASLD outcomes, likely through impaired metabolic capability during fibrosis progression.

      Regarding the comment on the "subgroup of female sex," our original analysis also adjusted for sex as a potential confounder. Since our cohort is predominantly female (>76%), the majority of our findings likely hold true in the female sub-population, similar to what we observed in our diabetes subgroup analysis.

      External validation

      Additionally, to back up the major GTPase signature findings, it would be desirable to analyze an external dataset of (pre)diabetes patients (other biased groups) for alterations in these genes. It would be important to know if this signature also shows in non-MASLD diabetic patients vs. healthy patients or is a feature specific to MASLD. Also, could the matched metabolic data be used to validate metabolite alterations that would be expected under GTPase-associated protein dysregulation?

      We appreciate the comments regarding validation of the GTPase signature as specific to MASLD using external datasets. As shown in our analysis above, after adjusting for diabetes status, the gene signatures remained largely preserved in the non-diabetic subgroup. Before responding further, we note that publicly available liver tissue data with appropriate, full-scale clinical metadata and sufficient sample sizes are extremely rare. To the best of our knowledge, the public datasets we included in our paper are the most prominent data of reliable quality.

      In the paper, we benchmarked our RNAseq dataset against two datasets: the VA cohort and EU cohort (Figure 1). Our cohort focused primarily on early MASLD patients with obesity, which aligns more closely with the disease spectrum represented in the VA cohort (Figure 1G). Notably, in the published paper for the VA cohort, Hoang et al. highlighted Rho GTPase signaling as one of the top pathways in the fibrosis PPI network (Figure 1B from publication PMID 31467298).

      We interrogated GTPase-related genes in both the VA and EU cohorts. As shown in Figure R7 (below), GTPase-related genes demonstrated a strong association with fibrosis grades in the VA cohort, as expected. The EU cohort comprises more advanced MASLD cases with higher fibrosis grades, and our re-analysis in this cohort specifically focused on MASH patients (as designated by the authors). In those MASH patients, GTPase-related genes did not show significant positive associations with fibrosis progression. This finding is consistent with our hypothesis that GTPase regulation is triggered more prominently during the early progression of fibrosis than at later stages.

      Unfortunately, diabetes status was not available in the GEO repository for the VA cohort. Available liver tissue sequencing datasets with balanced representation of diabetic and nondiabetic patients are rare, especially those derived from obese individuals and reflecting the early-to-middle stages of MASLD. In our own cohort, for instance, only two diabetic patients without MASLD were recruited (Table 1). While we cannot rule out a role for insulin resistance in GTPase regulation, we will plan future experiments using mouse models to examine GTPase-mediated fibrosis under diabetic and nondiabetic conditions.

      Regarding the comment ‘validate metabolite alterations that would be expected under GTPase-associated protein dysregulation,’ we note that GTPases are primarily involved in cytoskeletal organization, vesicle trafficking, and other cellular processes, with few well-established links to specific metabolite signatures. Nevertheless, in our partial correlation network integrating hepatic genes and metabolites, we observed co-regulated metabolites associated with GTPase-related genes (Figure R8). These included palmitoleoyl ethanolamide (N-acylethanolamine, an anti-inflammatory metabolite and PPARα ligand), phenylacetic acid (a phenylalanine metabolite), biotin (a coenzyme), arginine, lysine, melatonin (a tryptophan metabolite), and several lipid species such as PC 32:0 and CAR 20:1. While causal relationships cannot be inferred from this dataset, our integrative network highlights potential connections related to the trafficking of these metabolites that warrant further investigation.

      Figure R7. Associations between GTPase-related genes and fibrosis in this study and two external cohorts. Asterisks denote significant associations by q value.

      Figure R8. Integrative subnetwork of GTPase-related genes. Blue squares represent GTPase-related genes, red circles indicate metabolites connected to these genes, and the purple diamond denotes fibrosis, which is connected to RHOU.

      3D liver spheroid MASH model, Fig. 6D/E

      This 3D experiment is technically not an external validation of GTPase-related genes being involved in MASLD, since patient-derived cells may only retain changes that have happened in vivo. To demonstrate that the GTPase expression signature is specifically invoked by fibrosis the LX-2 set up is more convincing, however, the up-regulation of the GTPase-related genes upon fibrosis induction with TGF-beta, in concordance with the patient data, needs to be shown first (qPCR or RNA-seq).

      We agree with the reviewer that experiments in LX-2 (HSC) cells are important and as we have described under ‘Reviewer #2’ we have done this (Figure R3 and Figure R4). Because HSCs only comprise a minor cell population of liver cells, the signals observed in patient bulk RNA data are likely driven primarily by hepatocytes. Nevertheless, we have highlighted the importance of hepatic cell crosstalk in Figure R4 and in our response to Reviewer #2. Additionally, in Supplementary Figure S9B, we identify the potential cell types of origin for the GTPase signals (predominantly hepatocytes and HSCs) using a single-cell dataset from an independent study (PMID: 37962490).

      Additionally, the description of the 3D model is too uncritical. The maintenance of functional human PHHs in 3D has only become available this year (PMID: 40240606) marking a break-through in the field. Since the authors did not use this system, I would strongly assume their findings are largely attributable to the mesenchymal cells in the 3D culture, and these limitations need to be stated.

      We humbly disagree with the reviewer on the 3D liver spheroids. The paper that the reviewer is referencing is related to the proliferation of hepatocytes in organoids, not – at least not directly – their functional maintenance. Here, we use a spheroid model of mature fully differentiated cells, which is conceptually different from the organoid approach. Maintenance of such functional human hepatocytes for multiple weeks in culture has been possible for close to a decade (PMID 27143246). Moreover, particularly for the modeling of chronic liver disease, such as MASH, it is important to use directly patient-derived cells as short induction cycles (typically 1-2 weeks) of disease phenotypes in organoid models do not faithfully reproduce the molecular signatures that stem from chronic exposures in vivo.

      The 3D liver spheroid model we used here is derived from livers from patients with a histologically confirmed diagnosis of MASH. The isolated cells are fully mature and thus do not require in vitro differentiation. There are no MSCs in the 3D cultures; rather the spheroids contain hepatocytes, stellate cells, Kupffer cells as well as various other immune cell types present in the liver at the time of isolation (T cells, B cells, NK cells). Furthermore, the model is extensively characterized at the transcriptomic, proteomic and lipidomic level (PMID 39605182).

      Novelty / references

      Similar studies that also combined liver and blood lipidomics/metabolomics in obese individuals with and without MASLD (e.g. PMID 39731853, 39653777) should be cited. Additionally, it would benefit the quality of the discussion to state how findings in this study add new insights over previous studies, if their findings/insights differ, and if so, why.

      Thank you for the suggestion. We added the two papers to the discussion section. Specifically, we discussed the consistent findings (such as AKR1B10 in PMID 37037945 and mitochondrial dysfunction in PMID 39731853) and discrepancies (such as limited plasma metabolomic changes and circulating sphingolipid alterations in multiple human and mouse models) in comparison with previously published omics studies in MASLD patients. We also thoroughly discussed our findings (e.g., lipid dysregulation, dysregulated tryptophan metabolism, GTPase regulation) and potential mechanisms, with extensive literature support from human, animal, and cell studies.

      Minor comments:

      1. The quality of Supplementary Figures (e.g. S7) makes it impossible to read the labels

      Thank you for this feedback. The resolution of the figures was impaired in the initial upload. We will provide all supplementary figures with high resolution in our revised submission and ensure all labels are clearly readable.

      For Figure S7C, we presented the correlation matrix of more than 200 GTPase-related genes along with the TGF-β genes TGFB1 and TGFB3. This illustrates the overall co-expression patterns of GTPase-related genes rather than displaying individual gene labels, with arrows now included to highlight TGFB1 and TGFB3.

      Reviewer #3 (Significance (Required)):

      The authors provide an overall sound study on the hepatic transcriptomic and metabolomic signatures in an Australian cohort of 109 obese non-to-early stage MASLD patients. They perform thorough analyses of metabolome and transcriptome in liver biopsies and metabolome in blood, using standard technologies such as RNA sequencing and mass spectrometry. Their key finding is a GTPase-associated gene signature related to fibrosis onset. Limitations of the study include potential cohort confounders (raising the need for expanded control experiments), limited discussion of similar studies, and limits in cell-type resolution, the latter of which is related to the molecular read out, and has in parts been started to be addressed by in vitro experiments in an immortalized HSC line. Taken together, given additional control analyses will be performed, the results could be of interest to an expert community in the field of molecular hepatology and, while still descriptive, hold the potential to prompt mechanistic follow-up studies.

      We thank this reviewer for a balanced, positive, and constructive evaluation of our manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Metabolic dysfunction-associated steatotic liver disease (MASLD) describes a spectrum of progressive liver pathologies linked to lifestyle-associated metabolic alterations (such as increased body weight and elevated blood sugar levels), ranging from steatosis through steatohepatitis to fibrosis and finally end-stage complications, such as liver failure and hepatocellular carcinoma. Treatment options for MASLD include diet adjustments, weight loss, and the thyroid hormone receptor-β (THR-β) agonist resmetirom, but remain limited at this stage, motivating further studies to elucidate molecular disease mechanisms to identify novel therapeutic targets.

      In their present study, the authors aim to identify early molecular changes in MASLD linked to obesity. To this end, they study a cohort of 109 obese individuals with no or early-stage MASLD combining measurements from two anatomic sites: 1. bulk RNA-sequencing and metabolomics of liver biopsies, and 2. metabolomics from patient blood. Their major finding is that GTPase-related genes are transcriptionally altered in livers of individuals with steatosis with fibrosis compared to steatosis without fibrosis.

      Major comments:

      1. Confounders (such as (pre-)diabetes) The patient table shows significant differences in non-MASLD vs. MASLD individuals, with the latter suffering more often from diabetes or hypertriglyceridemia. Rather than just stating corrections, subgroup analyses should be performed (accompanied with designated statistical power analyses) to infer the degree to which these conditions contribute to the observations. I.e., major findings stating MASLD-associated changes should hold true in the subgroup of MASLD patients without diabetes/of female sex and so forth (testing for each of the significant differences between groups).
      2. External validation Additionally, to back up the major GTPase signature findings, it would be desirable to analyze an external dataset of (pre)diabetes patients (other biased groups) for alterations in these genes. It would be important to know if this signature also shows in non-MASLD diabetic patients vs. healthy patients or is a feature specific to MASLD. Also, could the matched metabolic data be used to validate metabolite alterations that would be expected under GTPase-associated protein dysregulation?
      3. 3D liver spheroid MASH model, Fig. 6D/E This 3D experiment is technically not an external validation of GTPase-related genes being involved in MASLD, since patient-derived cells may only retain changes that have happened in vivo. To demonstrate that the GTPase expression signature is specifically invoked by fibrosis the LX-2 set up is more convincing, however, the up-regulation of the GTPase-related genes upon fibrosis induction with TGF-beta, in concordance with the patient data, needs to be shown first (qPCR or RNA-seq). Additionally, the description of the 3D model is too uncritical. The maintenance of functional human PHHs in 3D has only become available this year (PMID: 40240606) marking a break-through in the field. Since the authors did not use this system, I would strongly assume their findings are largely attributable to the mesenchymal cells in the 3D culture, and these limitations need to be stated.
      4. Novelty / references Similar studies that also combined liver and blood lipidomics/metabolomics in obese individuals with and without MASLD (e.g. PMID 39731853, 39653777) should be cited. Additionally, it would benefit the quality of the discussion to state how findings in this study add new insights over previous studies, if their findings/insights differ, and if so, why.

      Minor comments:

      1. The quality of Supplementary Figures (e.g. S7) makes it impossible to read the labels

      Significance

      The authors provide an overall sound study on the hepatic transcriptomic and metabolomic signatures in an Australian cohort of 109 obese non-to-early stage MASLD patients. They perform thorough analyses of metabolome and transcriptome in liver biopsies and metabolome in blood, using standard technologies such as RNA-sequencing and mass spectrometry. Their key finding is a GTPase-associated gene signature related to fibrosis onset. Limitations of the study include potential cohort confounders (raising the need for expanded control experiments), limited discussion of similar studies, and limits in cell-type resolution, the latter of which is related to the molecular read out, and has in parts been started to be addressed by in vitro experiments in an immortalized HSC line. Taken together, given additional control analyses will be performed, the results could be of interest to an expert community in the field of molecular hepatology and, while still descriptive, hold the potential to prompt mechanistic follow-up studies.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      In this paper, Kaldis and collaborators investigate the molecular heterogeneity of a cohort of 109 morbidly obese patients, focusing on liver transcriptomics and metabolomics analysis from liver and serum. The main finding (i.e. upregulation of GTPase-coding genes) was validated in spheroids and a human HSC cell line. As these proteins are involved in critical cellular functions related to metabolism and cytoskeleton dynamics, these findings shed light on their involvement in human liver pathology, which has so far been poorly (or not at all) documented. This is an interesting addition to the current knowledge about chronic liver pathology. However, the manuscript suffers from the lack of a clear-cut definition of patient subgroups and the seemingly indistinct use of generic (MASLD, NAS score) and more granular terms (MASH, fibrosis) across the various analyses they performed.

      Major comments:

      • Are the key conclusions convincing?

      The conclusions are generally consistent with findings from numerous previous studies, as many of the genes identified and their associations with disease states have been previously reported. However, I found it difficult to discern which specific disease stages the authors are referring to throughout the manuscript. Terms such as MASLD (Fig. 1F), steatosis (Fig. 4A), MASH, fibrosis (Fig. 6), and the composite NAS score (Fig. 1G) are used interchangeably, without clearly explaining whether or how the patient cohort was stratified to distinguish between isolated steatosis, MASH, and MASH with or without fibrosis. It is also unclear whether subgroups were propensity score-matched.

      In a related point, the authors mention that 76% of patients are non-fibrotic, introducing a marked imbalance between fibrotic (n=26) and non-fibrotic (n=83) samples. Given this disparity and potential inter-individual variability, it would be helpful to include observed fold changes or effect sizes to give readers a sense of the magnitude of the biological dysregulations being reported.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      • The authors seem pretty enthusiastic about elafibranor, despite a failed phase 3 clinical trial. I would qualify elafibranor as a useful tool in preclinical models.

      • The authors should make clear the pronounced sex bias in their study, which includes mostly women (and btw refer to sex and not gender in the manuscript).

      • The "MASH" status of the spheroid model is overstated. As described in the text it is much closer to a lipotoxicity model (and even glucotoxicity as Glc concentration is 2g/L). This is confusing with panel D in which the authors establish a relationship between fibrotic patients (F2/F3 vs F0/S0, so I guess "no MASLD" liver?) and this model. Is the relationship maintained for steatotic-only patients?

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I am not convinced that HSC and LX2 cells express significant levels of PPARα. However, did the authors check for this parameter in their LX2 cell line and assess whether PPARα/b activation by elafibranor (and/or pemafibrate, as it is PPARα selective) alters GTPase expression? Whether negative or positive, this could give a clue about possible intercellular crosstalk in the spheroid model.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The experiment mentioned above is cheap (cell culture, RT-qPCR) and can be performed within a couple of weeks.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      • Are the experiments adequately replicated and statistical analysis adequate?

      There is no indication of group size, number of replicates for in vitro experiments

      Referees cross-commenting

      I believe there is a general consensus on this potentially interesting contribution to the field, with three main points: (1) the need for a careful group-by-group comparison that accounts for potential confounders, (2) a more rigorous exploitation/characterization of the spheroid system, and (3) the need to benchmark the authors' findings against the available literature.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The authors identified GTPases as players in the progression of MASLD. This is an interesting preliminary report warranting further molecular investigations (which liver cell types and which GTPase pathway(s) are involved, which functions are controlled through these pathways, etc.).
      • State what audience might be interested in and influenced by the reported findings. This paper will have an impact in the hepatology field.
      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. I have expertise in the analysis of "MASLD" human cohorts and in the molecular biology of chronic liver diseases.
    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Metabolic dysfunction-associated steatotic liver disease (MASLD) ranges from simple steatosis through steatohepatitis and fibrosis/cirrhosis to hepatocellular carcinoma. In the current study, the authors aimed to determine the early molecular signatures differentiating patients with MASLD-associated fibrosis from patients with early MASLD but no symptoms. The authors recruited 109 obese individuals before bariatric surgery. They separated the cohort into no MASLD (without histological abnormalities) and MASLD. The liver samples were then subjected to transcriptomic and metabolomic analysis. The serum samples were subjected to metabolomic analysis. The authors identified dysregulated lipid metabolism, including glyceride lipids, in the liver samples of MASLD patients compared to the no MASLD ones. Circulating metabolomic changes in lipid profiles correlated only slightly with MASLD, possibly because the no MASLD samples were also derived from obese patients. Several genes involved in lipid droplet formation were also found to be elevated in MASLD patients. In addition, elevated levels of amino acids, which are possibly related to collagen synthesis, were observed in MASLD patients. Several antioxidant metabolites were increased in MASLD patients. Furthermore, dysregulated genes involved in mitochondrial function and autophagy were identified in MASLD patients, likely linking oxidative stress to MASLD progression. The authors then determined the representative gene signatures in the development of fibrosis by comparing this cohort with two other published cohorts. Top enriched pathways in fibrotic patients included GTPase signaling and innate immune responses, suggesting the involvement of GTPases in MASLD progression to fibrosis. The authors then challenged a human patient-derived 3D spheroid system with a dual PPARα/δ agonist and found that this treatment restored the expression levels of GTPase-related genes in MASLD 3D spheroids.
In conclusion, the authors suggested the involvement of upregulated GTPase-related genes during fibrosis initiation. Overall, the current study might provide some resources regarding transcriptomic and metabolomic data derived from obese patients with and without MASLD. However, several concerns should be carefully addressed.

      1. A recent study, via proteomic and transcriptomic analysis, revealed that four proteins (ADAMTSL2, AKR1B10, CFHR4 and TREM2) could be used to identify MASLD patients at risk of steatohepatitis (PMID: 37037945). It is not clear why the authors did not include this study in their comparison.
      2. The authors recruited 109 patients but only performed transcriptomic and metabolomic analysis in 94 liver samples. Why did the authors exclude other samples?
      3. The authors mentioned clinical data in Table 1 but did not present the table in this manuscript.
      4. The generated metabolomic data could be a very useful resource to the MASLD community. However, it is very confusing how the data in the supplemental tables were generated. There is no clear labeling of human clinical information in those tables. Also, what do the values in columns 47-154 mean? This reviewer assumed that they are the raw data of the metabolomic analysis of plasma samples. However, without clear clinical information for these patients, no scientist could use the data to reproduce the authors' findings.
      5. In Fig. 5B, the authors excluded the steatosis and fibrosis overlapped genes. Steatosis and fibrosis specific genes could simply reflect the outcomes rather than causes. In this case, the obtained results might not identify the gene signatures related to fibrosis initiation.
      6. In Fig. 6D, the authors used 3D liver spheroids to validate their findings. However, there are no images showing 3D liver spheroid formation before and after PPARα/δ agonist treatment. It is not clear whether the 3D liver spheroids were successfully established.
      7. The authors suggested that targeting LX-2 cells with Rac1 and Cdc42 inhibitors could reduce collagen production. Did the authors observe upregulation of these two genes at the mRNA and protein levels in their cohort when comparing MASLD patients with and without fibrosis?
      8. Did the authors observe that the expression levels of Rac1 and Cdc42 are correlated with fibrosis progression in MASLD patients?
      9. Other studies have revealed several metabolite changes related to MASLD progression (PMID: 35434590, PMID: 22364559). However, the authors did not discuss the discrepancies between their findings and those of the previous studies.

      Significance

      Overall, the current study might provide some new resources regarding transcriptomic and metabolomic data derived from obese patients with and without MASLD. The MASLD research community will be interested in the resource data.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Xu et al. investigated split gene drive systems by targeting multiple female essential genes involved in fertility and viability in Drosophila. The authors evaluate the suppression efficiency through individual crosses and cage trials. Resistance allele formation and fitness costs are explored by examining the sterility and fertility of each line. Overall, the experimental design is sound and methods are feasible. The work is comprehensive, and conclusions are well supported by the data. This work offers informative insights that could guide the design of suppression gene drive systems in other invasive disease vectors or agricultural pests.

      However, several points require clarification or improvement:

      1. Methodological clarity: Some experimental details are insufficiently described, for example, regarding the setup of genetic crosses involving different Cas9 derivatives. In lines 197-198, "the mated females, together with females that were mated with Cas9 only males", it is unclear whether the latter group refers to gRNA-females.

      -We thank the reviewer for pointing out this ambiguity. The latter group refers to Cas9 females crossed to Cas9 males. We have clarified this both in the methods (line 207) and results (line 505-509).

      2. Regarding the inheritance rates, you included the reverse orientation of CG4415-Cas9; as I understand it, this means the component is in reverse orientation relative to the fluorescent marker. Since it is standard to design adjacent components in opposite directions to avoid transcriptional interference, the rationale for including this comparison should be better justified.

      -In our construct, ‘CG4415 (reverse orientation)’ indicates that Cas9 was oriented in the same direction as the fluorescent marker, while the other Cas9 constructs (nanos-Cas9 and CG4415-Cas9) place them in opposite directions. “Reverse” simply indicates a change from a “standard” in another study. Our previous publication showed that Cas9 orientation relative to the marker had little apparent effect on drive performance at the yellow-G locus. In this study, we compared both orientations in a fertility gene and again observed similar results, suggesting that orientation relative to the marker does not substantially affect drive efficiency in our system. We have clarified this in the figure legend text.

      Embryo resistance is inferred from the percentage of sterile drive females derived from drive mothers. How many female individuals were analysed per line, and why was deep sequencing not employed to directly detect resistance alleles?

      -Embryo resistance can mean slightly different things for different applications. The most important is probably the fraction of females that have little to no fertility due to embryo resistance. Some of these may not have complete embryo resistance alleles, but instead have mosaicism, with a sufficient level of resistance to still cause sterility. It is unclear exactly what proportion of resistance to wild-type may cause this, and thus, proportions from pooled sequencing, which could include both complete resistance and all levels of mosaicism, may not be sufficient to measure this parameter. Another relevant parameter that we did not measure is the fraction of males rendered unable to do drive conversion (this value should be closer to the complete resistance rate, but probably still lower because of the multiple gRNAs). Even in this case, deep sequencing would not allow us to determine exactly what is happening in males, making individual sequencing a preferred approach. Deep sequencing is, of course, very useful for characterizing which resistance alleles are present overall, but in this study, we wanted to put a bit more emphasis on the effect of resistance, rather than its sequence characterization.

      We analyzed 30 females per line for lines targeting nox, oct, dec and stl, 9 females for ndl and 276 individuals for line tra-v2 (Data Set S4). We believe such individual analyses sufficiently detected embryo resistance causing sterility within reasonable error. Note that we did also randomly genotype several sterile females and found mutations at target sites that disrupted gene functions.

      In response to this comment, we have added some text to justify our measurement of resistance alleles and include some of this discussion:

      “Note also that this defines embryo resistance as sufficient to induce sterility, but these may be mosaic rather than complete resistance. Further, note that the multiplex gRNA design in males may allow for continued drive conversion with a complete (as opposed to mosaic) embryo resistance allele, if some sites remain wild-type.”
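
      To give a rough sense of what "within reasonable error" means for the sample sizes quoted in this reply (9, 30, and 276 individuals), the sketch below computes a Wilson score interval for a binomial proportion. It is an illustration only: the function name `wilson_interval` and the sterile-female counts are hypothetical and are not taken from Data Set S4, and the choice of the Wilson interval is an assumption, not the authors' stated method.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: one third of scored females sterile at each sample size.
for n in (9, 30, 276):
    k = n // 3
    low, high = wilson_interval(k, n)
    print(f"n={n:3d}, sterile={k:3d}: {low:.3f} to {high:.3f}")
```

      Under these assumed counts, the interval narrows as the number of scored females grows, which is the sense in which the 9-female ndl estimate is less precise than the 276-individual tra-v2 estimate.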

      Masculinisation phenotypes were observed upon disruption of the tra gene. How were strong intersexes distinguished from males? What molecular markers were used to determine genetic sex? This information should be clearly provided.

      -We observed two types of strong masculinisation phenotypes (Figure S2), one with a bigger body size than wild-type males, and the other identical to wild-type males. The homozygosity of the drive allele could be assessed by the brightness of red fluorescence in the eyes. However, we also randomly genotyped these masculinized females (as part of a batch that included males) to confirm their sex using primers for the Y-linked gene PP1Y2. A specific band was detected in wild-type males but not in masculinized females, confirming their genetic sex. This information has been added to the manuscript (lines 477-480).

      It would be more appropriate to use "hatchability" rather than "fertility" when referring to egg-to-larva viability.

      -Thank you for the suggestion. We used egg-to-adult survival rates as a proxy for the fertility of their parents because they usually laid a similar number of eggs. However, this is still technically incorrect language. We have fixed this in line 582 and elsewhere in the section.

      In cage trials, a complete gene drive is mimicked by introducing Cas9 to the background population, but this differs from an actual complete gene drive, due to potential effects from separate insertion sites (different chromosomes or loci). These differences could impact the system's performance and should be discussed.

      -We appreciate this point and have added discussion on the limitations of mimicking a complete gene drive using split components (line 766-779).

      7. Given the large amount of data presented, it would improve readability and interpretation if each result section concluded with a concise summary highlighting the key findings and implications.

      -Thank you for the suggestion. We have added brief summaries at the end of each results section to highlight the key findings and their significance.

      Reviewer #1 (Significance (Required)):

      The authors evaluate suppression efficiency through individual crosses and cage trials. Resistance allele formation and fitness costs are explored by examining the sterility and fertility of each line. Overall, the experimental design is sound and methods are feasible. The work is comprehensive, and conclusions are well supported by the data. This work offers informative insights that could guide the design of suppression gene drive systems in other invasive disease vectors or agricultural pests.

      -We appreciate the reviewer’s positive assessment of our work.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Paper summary

      The manuscript by Xu et al. presents an insightful and valuable contribution to the field of gene drive research. The strategy of targeting and disrupting female fertility genes using selfish homing genetic elements was first proposed by Burt in 2003. However, for this approach to be effective, the phenotypic constraints associated with gene disruption have meant that the pool of suitable target genes remains relatively small - notwithstanding the significant expansion in accessible targets enabled by CRISPR-based genome editing nucleases. Population suppression gene drives are well developed as proof-of-principle systems, with some now in the late stages of development as genetic control strains. However, advancing the pipeline will require a broader set of validated target genes - both to ensure effectiveness across diverse species and to build redundancy into control strategies, reducing reliance on any single genetic target.

      In their paper, the authors conduct a systematic review of nine female fertility genes in Drosophila melanogaster to assess their potential as targets for homing-based suppression gene drives. The authors first conduct a thorough bioinformatic review to select candidate target genes before empirically testing candidates through microinjection and subsequent in vivo analyses of drive efficiency, population dynamics, and fitness costs relating to fecundity and fertility. After finalising their results, the authors identify two promising candidate target genes - oct and stl - which both demonstrate high gene conversion rates and, regarding the latter, can successfully suppress a cage population at a high release frequency. However, the manuscript suffers from a lack of in-depth discussion of a key limitation in its experimental design - namely, that the authors utilise a split-drive design to assess population dynamics and fitness effects when such a drive will not reflect release scenarios in the field. The review below highlights some major strengths and weaknesses of the paper, with suggestions for improvement.

      Key strengths

      The study's most significant strength is in its systematic selection and empirical testing of nine distinct genes as targets for homing-based gene drive, hence providing a valuable resource that substantially expands the pool of potential targets beyond the more commonly studied target genes (e.g. nudel, doublesex, among others). The identification of suitable target genes presents a significant bottleneck in the development of gene drives and the work presented here provides a foundational dataset for future research. The authors bolster the utility of their results by assessing the conservation of candidate genes across a range of pest species, suggesting the potential for broader application.

      A key finding in the paper is the successful suppression of a cage population using a stl-targeting gene drive (albeit at a high release frequency). This provides a critical proof-of-principle result demonstrating that stl is a viable target for a suppression drive. While in the paper suppression was not possible at lower release frequencies, together, the results provide evidence for complex population dynamics and threshold effects that may govern the success or failure of a gene drive release strategy - hence moving the conversation from a technical perspective ("can it work") to how a gene drive may be implemented. Moreover, the authors also employ a multiplexed gRNA strategy for all their gene drive designs and in particular their population suppressive gene drive targeting stl. This provides further proof-of-principle evidence for multiplexed gRNAs in order to combat the evolution of functional resistance following gene drive deployment.

      Finally, a further strength of this paper is in the clever dissection of fitness effects resulting from maternal Cas9 deposition. The authors design and perform a robust set of crosses to elucidate the parental source of fitness effects (i.e. maternally, paternally, or biparentally derived Cas9), finding (as they and others have before) that embryonic fitness was significantly reduced when Cas9 was inherited from a maternal source. As discussed, the authors conclude that maternal deposition is particularly pronounced in the context of split drives as opposed to complete drives, with the implication being that a complete drive might succeed where a split-drive has failed; thus providing a key directive for future study.

      Concerns

      The manuscript's central weakness lies in its interpretation of the results from the cage experiments - namely that a split-drive system was used to "mimic the release of a complete drive". In the study, flies carrying the drive element (i.e. the gRNA) were introduced into a population homozygous for the Cas9 element over several generations. This design is likely not representative of a real-world scenario and, as the authors state, likely exaggerates fitness costs. This is because females carrying Cas9 will maternally deposit Cas9 protein into their eggs, with activity spanning several generations. When such a female is mated with a drive-carrying male, the gRNA will immediately co-exist with maternally deposited Cas9, leading to early somatic cleavage and significant fitness costs (reflected in the authors' own fertility crosses). This is fundamentally different to how a complete drive would function in a real-world release, where complete-drive males would mate with wild-type females not carrying Cas9. Their offspring would carry the drive element but would not be exposed to maternally deposited Cas9; thus, deleterious maternal effects would only begin to appear in the subsequent generation from females carrying the drive. Fitness costs measured from split-drive designs are therefore likely substantially overestimated compared to what would occur during the initial but critical release phase of a complete drive. This flaw weakens the paper's ability to predict the failure or success of the screened targets in a complete drive design, thus weakening the interpretation of the results from the cage trials. As a suggestion for improvement, the authors should explicitly and more prominently discuss the limitations of their split-drive model compared to complete drive models, both in the Results and Discussion. It is also recommended to include a schematic for both strategies that contrasts the experimental setup design (i.e. release of the drive into a Cas9 homozygous background) with a complete-drive release, clearly illustrating differences in maternal deposition pathways. This will not only contextualise the results and support the authors' conclusion that observed fitness costs are likely an overestimate but will further strengthen the arguments that the candidate target genes found in this study may still be viable in a complete-drive system.

      -We sincerely appreciate the thoughtful review and the valuable comments and suggestions provided, which have helped improve both the clarity and readability of this study. We have revised several parts in the discussion of the manuscript and hope that these changes adequately address the concerns raised. We have also made Figure S5 to illustrate the differences between two release strategies (biparental-Cas9 split drive in our study and complete drive in real release).

      Please note that this type of fitness cost may have partially undermined our cage study (the fitness effect is notable, but still small compared to total fitness costs), but this is also among the first studies to propose and investigate this phenomenon in the first place (it is also noted in another preprint from our lab but to our knowledge not proposed elsewhere). Thus, part of the impact of our manuscript is showing that this is important, which may inform future cage studies in our lab and elsewhere.

      A second weakness in the manuscript relates to its limited explanation and discussion of key concepts. For example, the manuscript reports a stark difference in outcome of the two stl-targeting drives, where a high initial release in cage 1 led to population elimination versus a failure of the drive to spread in cage 2. The authors attribute this to vague "allele effects" and stochastic factors such as larval competition; however, the results appear reminiscent of the Allee effect, which is a well-characterised phenomenon describing the correlation of population size (or density) and individual fitness (or per capita population growth rate). Using their results as an example, it is plausible that the high-frequency initial release in cage 1 imposed enough genetic load to quickly drive the population density below the Allee threshold, thus quickly leading to population eradication. In cage 2, the low-frequency initial release was insufficient to cross the Allee threshold. Omitting mention of this ecological principle greatly weakens the Discussion, and further presents a missed opportunity to discuss one of the more crucial strengths of the paper - that is, in providing a deeper insight into the practical requirements for successful field implementation.

      -While we do indeed mention this Allee effect (the “allele effect” noted above is a misspelling that we have corrected), we were hesitant to give it much discussion, considering that the specific Allee effect in our cages is likely of a very different nature than one would find in nature (we explain that it is likely due to bacterial growth that occurs when fewer larvae are present). However, it is perhaps still a good excuse to cover it in the discussion, while still noting that the specific Allee effect in our cage may not be representative. We have added the following text: “Nonetheless, the successful result in the cage with high release study may point to a potential field strategy for a drive that is less efficient (perhaps even one found to be less efficient in initial field tests compared to laboratory tests). If the initial release frequency of the drive is sufficiently high and widespread, then short-term high genetic load may substantially reduce the population, perhaps enough for Allee effects to become important. At this point, even if average genetic load is slowly declining without additional drive releases, persistent moderate genetic load coupled with the Allee effect may be sufficient to ensure population elimination.”
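
      The reasoning in this exchange (a short period of high genetic load pushing the population below an Allee threshold, after which a moderate load is enough for elimination) can be illustrated with a toy deterministic model. This is a minimal sketch under stated assumptions, not the authors' model: the function names, the Beverton-Holt style crowding term, the generic Allee factor `n / (n + allee)`, and every parameter value and load schedule are hypothetical, and genetic load is imposed as an external input rather than derived from drive allele dynamics.

```python
def next_generation(n, load, r=6.0, capacity=1000.0, allee=50.0):
    """Adults in the next generation given current adults n and genetic load (0..1).

    All parameters are hypothetical: r is the low-density growth rate,
    capacity sets the strength of crowding, allee sets the Allee effect.
    """
    if n < 1.0:
        return 0.0  # fewer than one individual: treat as eliminated
    growth = r * (1.0 - load)                          # load reduces reproduction
    crowding = 1.0 / (1.0 + (r - 1.0) * n / capacity)  # Beverton-Holt style term
    allee_factor = n / (n + allee)                     # per-capita growth drops when rare
    return n * growth * crowding * allee_factor


def simulate(n0, loads):
    """Population trajectory for a sequence of per-generation genetic loads."""
    trajectory = [float(n0)]
    for load in loads:
        trajectory.append(next_generation(trajectory[-1], load))
    return trajectory


generations = 30
# Assumed schedules: a brief high load after a large release, versus a
# constant moderate load after a small release.
high_release = simulate(900, [0.9, 0.9] + [0.6] * (generations - 2))
low_release = simulate(900, [0.6] * generations)

print("high release, final population:", round(high_release[-1], 1))
print("low release, final population:", round(low_release[-1], 1))
```

      With these assumed values, the same moderate long-term load has opposite outcomes depending on whether the population was first pushed below the threshold. As the reply notes, the Allee effect in the actual cages likely has a different cause, so the sketch is qualitative only.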

      In a similar vein, the authors provide only a superficial mechanistic discussion into the fitness costs associated with drives targeting key candidate genes. The paper would benefit from a deeper discussion regarding the specific molecular functions of top-performing genes (stl, oct, nox) and how unintended Cas9 activity could disrupt their activity, integrating known molecular functions with observed fitness costs. For instance, oct encodes a G-protein coupled receptor essential for ovulation and oviduct muscle relaxation, thus disruption to the oct gene would directly impair egg-laying which would account for the observed phenotypic effects. A deeper discussion linking unintended Cas9 activity to the specific, sensitive functions of target genes would elevate the paper from a descriptive screen to a more insightful mechanistic study.

      -We appreciate the reviewer’s comment. We have added a discussion to further explain fitness costs caused by unintended Cas9 activity disrupting target gene functions. However, keep in mind that the exact timing of Cas9 cleavage and the exact timing of these genes’ essential functions are still somewhat uncertain, which may limit insights from this line of analysis compared to a situation where ideal, high-quality data are available for both. Here is the new material in the discussion:

      “The functions of the top-performing genes suggest a mechanistic basis for the observed fitness costs. Aside from germline cells, nanos has expression in other ovary cells as well. CG4415 lacks this expression, but our Cas9 construct with this promoter may have a different expression pattern than the native gene, as evidenced by its support for good drive conversion in females. stl is essential for ovarian follicle development, and its disruption, likely in non-germline ovary cells, could compromise egg chamber development and fertility. oct encodes the octopamine β2 receptor, a G-protein coupled receptor critical for ovulation and fertilization, so if it were similarly lost, egg-laying would be directly impaired. nox, which encodes NADPH oxidase, contributes to calcium flux and smooth muscle contraction during ovulation, so its disruption may prevent egg laying. tra is needed in the whole body for sexual development, but may also play an important role in ovary function. Thus, unintended Cas9 activity in these non-germline ovary cells can directly interfere with sensitive reproductive functions, potentially explaining the fertility costs observed in drive carriers. This issue could potentially be overcome if promoters were available that were truly restricted to germline cells rather than other reproductive cells, though it remains unclear if such promoters both exist and would retain their expression pattern at a non-native locus.”

      It is curious that the authors chose two genes on the X chromosome as targets. In insects (such as Drosophila here) that have heterogametic sex chromosomes, homing is not possible in the heterogametic sex as there is no chromosome to home to - so there will be no homing in males. On top of that, there is usually some fitness effect in carrier (heterozygous) females, so in a population these are nearly always bad targets for drives - unless there is some other compelling reason to choose that target?

      -Our rationale for testing X-linked targets is twofold. First, these genes are likely to play important roles in sex-specific functions and may have a different expression pattern (which is why dec specifically was included), potentially reducing fitness costs. Although homing cannot occur in males, if drive conversion at these sites in females is very high and fitness costs are minimal, the resulting genetic load could still be sufficient to suppress populations (thus, such candidates could be superior even in diploids if they happen to have lower fitness costs). Second, X-linked targets may have broader relevance for suppression drives in haplodiploid pests (e.g., fire ants), which have the same population dynamics as an X-linked target in a diploid population. Our results therefore could have provided useful insights for such scenarios (such as for fire ants: Liu et al., bioRxiv 2025) if drive performance was sufficient for follow-up testing.

      Minor comments

      • Enhanced clarity in the Figures and data presentation would greatly improve readability. For example, Figure 5 is critical yet difficult to interpret; consider changing x-axis labels from icons to explicit text (e.g. "biparental Cas9", "maternal Cas9", "paternal Cas9"). Similarly, Figure 4 is difficult to read and the y-axis label "population size" is ambiguous; consider adding shapes or dashes (rather than relying solely on colour) and clarifying the y-axis (e.g. no. adults collected) in the legend.

      -We appreciate the reviewer’s comment and have revised Figure 4 as suggested. Regarding Figure 5, we attempted to replace the icons with text labels; however, this was not possible because there is very little horizontal space and two generations to specify. Instead, we have revised the figure legend to provide a clearer explanation, which will hopefully improve clarity.

      • Expand on or include a schematic to show the differences in construction between the tra-v1 and tra-v2 constructs to better contextualise the discrepancies in results (e.g. inheritance rates of 61%-66% for tra-v1 and 81%-83% for tra-v2).

      -We have expanded Figure 2 to compare the constructs of tra-v1 and tra-v2. Further explanation of these two constructs was added to the results section: ‘When targeting tra, we originally tested the 4-gRNA construct tra-v1. However, the drive inheritance rate was relatively low (61%-66%), and sequencing revealed that only the middle two gRNAs were active (Table S3). Lack of cleavage at the outermost sites is particularly detrimental to achieving high drive conversion. Therefore, a second construct tra-v2 was tested that retained the two active gRNAs and included two new gRNAs. It showed substantially improved drive inheritance (81%-83%).’
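
      To relate the inheritance rates quoted here to drive conversion, a common back-of-the-envelope conversion treats inheritance above the 50% Mendelian baseline as converted wild-type alleles. The sketch below assumes drive heterozygotes crossed to non-drive mates and no viability differences among offspring genotypes; the function name is hypothetical, and the manuscript's own estimates may be calculated differently.

```python
def drive_conversion_rate(inheritance):
    """Approximate fraction of wild-type alleles converted to drive alleles
    in the germline of a drive heterozygote, from offspring drive inheritance.

    Assumes inheritance = 0.5 + 0.5 * conversion. Values below 0.5 yield a
    negative estimate, which can happen through sampling noise or fitness costs.
    """
    if not 0.0 <= inheritance <= 1.0:
        raise ValueError("inheritance must be a fraction between 0 and 1")
    return 2.0 * inheritance - 1.0


# Inheritance ranges quoted in the reply for the two tra constructs.
for name, (low, high) in {"tra-v1": (0.61, 0.66), "tra-v2": (0.81, 0.83)}.items():
    print(f"{name}: conversion about "
          f"{drive_conversion_rate(low):.2f} to {drive_conversion_rate(high):.2f}")
```

      Under this assumption, the gain from tra-v1 to tra-v2 looks larger on the conversion scale than on the inheritance scale, because the 50% baseline is subtracted out.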

      • Minor typos e.g.:

      o Line 87: "form" to "from"

      o Line 484: "expended" to "expanded"

      o Line 560: "foor" to "for"

      o Line 732: "conversed" to "conserved"

      -We have revised these typos.

      • Clarify the split drive system: the authors introduce split drive for the first time in Line 118. They should at least give a clear definition and explanation of split drive and complete drive in the introduction.

      -We have included an introduction of split drive and complete drive in the introduction (line 47-53).

      • Lines 237-238: The fitness evaluation lacks a clear description of controls. How were non-drive flies generated and validated as controls?

      -Drive heterozygotes were crossed with Cas9 homozygotes to generate the flies used for fitness evaluation. From the same cross, non-drive progeny were obtained and used as controls, ensuring they shared a comparable genetic background and rearing conditions with the drive-carrying individuals. We have now clarified in the manuscript results that “these served as the controls because they had the same environment and parents as the drive flies”.

      • Lines 409-412, line 423: The high inheritance rates of stl and oct drives are impressive; however, variation in results across Cas9 promoters should be explained further in the discussion.

      -In the discussion section (lines 751-765), we included a dedicated paragraph addressing the variation observed between the nanos and CG4415 promoters. We have now expanded it to briefly note some differences:

      “Our previous works showed that both nanos and CG4415 have high drive conversion rates8, but nanos failed to suppress target populations in a homing drive targeting the female fertility gene yellow-G due to its fitness cost in drive females27. CG4415 had much lower maternal deposition, which allowed the elimination of cage populations by targeting yellow-G8. Here, we tested both promoters with drives targeting oct and stl, with both showing slightly higher drive efficiency than the drive targeting yellow-G in small-scale crosses. CG4415 has slightly worse though still good performance in females, likely due to male-biased expression compared to nanos.”

      • Line 414: The CG4415 promoter yielded reduced drive conversion rates in females, yet is still referred to as a promising promoter. This conclusion seems optimistic and should be clarified/more justified.

      -Based on our previous study cited in this context, CG4415 shows relatively lower germline conversion rates compared to nanos, although still remaining at a high level. Importantly, CG4415 also exhibits reduced maternal deposition relative to nanos, which could help mitigate fitness costs associated with maternal deposition—an important consideration for suppression systems. Taken together, while its conversion efficiency is lower (but only slightly), the potential benefits of reduced maternal deposition and perhaps even fitness costs provide a rationale for regarding CG4415 as a promising promoter. We state this when first introducing the promoter in the “Drive efficiency assessment” results subsection.

      • Specify the number of flies released, sex ratio, and cage size per generation (Line 466). This is essential for reproducibility.

      -We appreciate the reviewer’s comment and have revised the text to clarify our release approach, which differed from that used in other studies (which tend to have substantial fitness differences between lines in the first generation that can complicate analysis and change results). Rather than directly releasing drive males or females into cages, we first crossed drive males with non-drive females and then mixed them with non-drive females mated to non-drive males. The offspring (including males and females) from these crosses were recorded as the G0 generation, and their ratios were recorded as the release frequency. We have specified the release ratios and adult numbers in the following paragraph and supplementary file.

      Reviewer #2 (Significance (Required)):

      Overall the manuscript presents a valuable and timely resource for gene drive research, in particular for its systematic appraisal of potential target genes for population suppression drives and its rigorous assessment of the impact of maternal Cas9 deposition. The value in the generation and empirical testing of a novel multiplexed stl-targeting gene drive that led to population eradication in a cage trial should not be understated. While several key aspects of the discussion of the manuscript should be strengthened, the study presents a meaningful contribution to the field, extending previous work and outlining important considerations for the design and implementation of effective gene drive systems.

      -We thank the reviewer for their encouraging and constructive comments. We are pleased that the systematic evaluation of target genes, the analysis of maternal Cas9 deposition, and the multiplexed stl-targeting drive were recognized as valuable contributions. We have strengthened the discussion as suggested, and we believe these revisions further enhance the manuscript as an aid for the design and implementation of future gene drive systems.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this study, Xu and colleagues explored how CRISPR-based homing gene drives could be used to suppress insect populations by targeting female fertility genes in Drosophila melanogaster. They engineered split gene drives with multiplexed guide RNAs to target nine candidate genes, seeking to prevent functional resistance and achieve high drive conversion with minimal fitness costs.

      Here my comments about this work:

      Abstract: While the stated aim of the study on line 16 is to "maintain high drive conversion efficiency with low fitness costs in female drive carriers," the conclusion in lines 29-31 shifts focus toward the broader challenges and future optimization of gene drive systems. This conclusion does not clearly highlight the specific results of the study or how they relate directly to the original objective. It would be more effective to emphasize the actual findings, such as which target genes performed best and under what conditions, and how these findings support or contradict the stated goals. The study primarily aimed to assess the efficiency of specific female fertility genes and to evaluate strategies for minimizing the formation of functional resistance alleles, rather than proposing a protocol for optimization. Therefore, better alignment is needed between the study's aim, experimental design, and concluding statements. Clarifying this alignment would also help refine the paper's focus and more accurately communicate its contribution, including whether it is exploratory, comparative, or methodologically driven.

      -We have revised the abstract to clarify the alignment as suggested by the reviewer. We note that this discrepancy is due to the initial aim of our study being different than some of the important lessons learned along the way regarding fitness effects from Cas9 deposition in split drives. Still, we agree that it would be better to be more consistent in our wording and conclusions.

      Introduction: One of the key design elements in this study is the use of multiplexed gRNAs. It is reasonable to assume that this strategy may influence fitness costs, potentially in more than one way. Given that assessing fitness cost is a major focus of the study, it would be helpful to include a brief discussion of previous research examining how multiplexed gRNAs may impact fitness in gene drive systems. A short review of relevant studies, if available, would provide important context for interpreting the results and could help clarify whether any observed fitness costs might be attributed, at least in part, to the multiplexing strategy itself. This addition could be appropriately placed around line 102, where gRNA design is discussed.

      -We have added an explanation in the Discussion to mention this. However, it has not been conclusively shown that multiplexed gRNAs have any effect on fitness. Indeed, there have been some multiplexed constructs that seem to have no fitness effect, and some that have high fitness costs. This doesn’t rule out the potential for multiplexed gRNAs to influence fitness itself, but it means that the mechanism may be complex. The new text reads:

      “Another potential though unconfirmed source of fitness cost arises from increased cleavage events associated with multiplexed gRNAs, where the greater number of gRNAs can enhance the overall cut rate compared to single-gRNA designs.”

      Line 42: Cas12a also showed efficacy using gene drives in yeast and Drosophila.

      -We now mention Cas12a at the beginning of the introduction.

      Line 133: The paragraph begins by stating that homologs of the target genes were identified and aligned. To improve clarity, especially for readers who are new to gene drive research, it would be helpful to begin the paragraph with a brief introductory sentence explaining the purpose of this step. For example, you could state the importance of identifying and aligning homologs to assess the conservation of target sites across species, which is critical for evaluating the broader applicability of gene drive strategies. This context would guide the reader and clarify the relevance of the analysis.

      -We have added the explanation as suggested.

      Lines 144-145: You mention that "the exception was tra, for which two constructs containing different gRNA sets were generated." For clarity, it would be helpful to provide a brief explanation of why two different gRNA sets were used for tra, and whether this differs from the approach taken with the other target genes. It's currently unclear whether all other genes were targeted using a single, standardized set of gRNAs, and this should be explicitly stated here for consistency, even though it is mentioned later in the plasmid construction section. Additionally, I suggest combining the sections on gRNA target design and plasmid construction. Since these components are closely related and sequential in the experimental workflow, presenting them together would improve the logical flow and help readers follow the methodology more smoothly.

      -We have combined both the gRNA target design and plasmid construction sections. We also discuss the two tra constructs early in the results section (see response to reviewer 2).

      Line 210: The analysis of the cage experiments was based on models from previous studies that used a simplified assumption of a single gRNA at the target site. While I understand this approach has precedent, it raises important questions about potential limitations. Specifically, could simplifying the analysis to one gRNA affect the conclusions of this study, given that the experimental design involves multiplexed gRNAs with four distinct target sites? The implications of using this simplified model should be clearly addressed, as the dynamics of drive efficiency, resistance formation, and fitness effects may differ when multiple gRNAs are employed. Additionally, while I am not a statistician, it is worth asking whether more sophisticated modeling approaches could be applied to account for all four gRNAs, rather than reducing the system to a single-gRNA framework. A discussion of the modeling choices and their potential consequences would strengthen the interpretation of the results.

      -We have clarified this. While we have modeled multiple gRNAs with high fidelity in SLiM, the maximum likelihood method is not very amenable to such treatment. The single-gRNA simplification may cause our fitness estimate to be a small overestimate, but given the low fitness inferences, it would certainly not have a large enough effect to fundamentally change any conclusion (and should be of a consistent level across all cages). We now discuss this in the methods section.

      Lines 297-300: Your results show that the expression of all target genes was higher in females, except for oct, which had higher expression in males. Additionally, oct expression decreased in adults. Given that oct is functionally important for ovulation and fertilization, processes that are primarily required in adult females, this pattern is somewhat unexpected. Could there be a possible explanation for the lower expression of oct, particularly in females and especially in adults, where its function would presumably be most critical? A brief discussion or hypothesis addressing this discrepancy would help clarify the biological relevance and interpretation of the expression data.

      -Based on transcriptome data from FlyBase, derived from Graveley et al. (2011), Oct is indeed expressed slightly higher in adult males than in adult females. This difference may be attributed to the fact that the female flies used in the study were virgins; Oct expression could be upregulated post-mating to mediate ovulation. Additionally, Oct is expressed not only in reproductive tissues but also in other organs such as the nervous system, where sex-specific differences in cell type composition or neural activity may contribute to the observed expression bias. However, high expression does not necessarily correlate with essential expression. Though Oct could have multiple functions, it’s still possible that the only apparent phenotype upon knockout is female sterility. We have added the following text: “This male-biased expression may result from the use of virgin females in the dataset, as oct is likely upregulated after mating. Moreover, oct is also expressed in non-reproductive tissues such as the nervous system, which may contribute to sex-specific differences in expression38. While oct may have multiple functions, it is possible that it is only essential for female fertility.”

      Lines 346-347: What is the distance between the gRNA target sites within each gene? Are all of the gRNAs confirmed to be active? It would be valuable to include a table summarizing the distance between target sites for each gene, the activity levels of the individual gRNAs, and the corresponding homing rates. This would help determine whether there is a correlation between gRNA spacing and drive efficiency. For example, Lopez del Amo et al. (Nature Communications, 2020) demonstrated that even a 20-nucleotide mismatch at each homology arm can significantly reduce drive conversion. Including such a comparative analysis in your study could provide important insights into how gRNA arrangement influences overall drive performance and would be incredibly helpful for future multiplexing designs.

      -We have shown previously that close spacing of gRNAs should help maintain high drive conversion efficiency, and this is alluded to indirectly in the introduction (we now mention it more directly). In our study, gRNAs were positioned in close proximity without overlap, with the general distance between the outermost cut sites within each gene being We have added a summary table (Table S3) presenting the sequencing results, which also show gRNA activity levels. Notably, most but not all gRNAs were active, at least for embryo resistance (low to moderate activity may still be present in the germline). Coupled with varying activity levels for those that were active, this likely contributed to reduced drive conversion due to mismatches at the homology arms. This observation supports the notion that drive performance could be optimized by selecting and arranging more active gRNAs. Consistent with this, our second construct targeting tra (tra-v2) exhibited a higher inheritance rate than the original construct, suggesting that gRNA arrangement and activity critically influence drive efficiency. Testing the activity of every single gRNA requires the construction of multiple gRNA lines, since in vitro or ex vivo tests will not be as accurate as in vivo transformation tests. However, in our study, as long as drive conversion rates were reasonably high, further optimization was not needed. Therefore, the multiplexed gRNA design can not only maximize drive conversion, but also reduce the labor of screening a larger number of 1-gRNA designs with lower performance.

      Line 434: I was not able to find any sequencing data. This is important to evaluate gRNA activities and establish correlations with drive efficiency.

      -We have added a summary of the sequencing results in Table S3, though these are for embryo resistance alleles. Note that while high gRNA activity is correlated with high drive inheritance, these are not directly related. For suppression drives, germline resistance rates are usually of low importance compared to drive inheritance, so we did not assess these in detail (and pessimistically assumed complete germline resistance in our cage models).

      Line 482: Did the authors test Cas9-only individuals (without the drive) against a wild-type population? This would help determine whether Cas9 alone has any unintended fitness effects. Additionally, is Cas9 expression stable over time and across generations? It would be helpful to include any observations or thoughts on the long-term stability and potential fitness impact of Cas9 in the absence of the drive element.

      -We did not perform a direct comparison of Cas9-only individuals and wild-type flies in this study. However, previous studies (Champer et al., Nature Communications, 2020; Langmuller et al., eLife, 2022), which we now cite in the discussion, found no significant fitness difference between very similar Cas9-expressing lines and wild type in the absence of a drive element, indicating no significant fitness impact from Cas9 alone (though we cannot exclude a small effect, it certainly could not come close to explaining our results). In our experiments, Cas9 expression was generally stable across generations, as indicated by consistent drive inheritance and fertility test results obtained from independent batches. Separate from this study, we did observe rare instability in one nanos-Cas9 line, which had remained stable for over five years but recently became inactive (low population maintenance size may have caused stochastic removal of the functional allele). It is something to watch out for, but probably not on the timescale of a single study.

      Discussion: I would appreciate a more direct and clearly stated conclusion that summarizes the key findings of the study. While the discussion addresses the main outcomes in depth, presenting a concise concluding paragraph, either at the end of the discussion or as a standalone conclusion section, would provide a stronger and more definitive closing statement. This would help reinforce what the study ultimately achieved and ensure the main takeaways are clearly communicated to the reader.

      -We have revised and expanded the last paragraph of the discussion section to make our findings more direct and clear.

      Overall, I believe this is an important study that offers valuable insights for advancing the design of CRISPR-based gene drives. The findings contribute to the development of more efficient and practical gene drive prototypes, bringing the field closer to real-world applications.

      Reviewer #3 (Significance (Required)):

      In this study, Xu and colleagues explored how CRISPR-based homing gene drives could be used to suppress insect populations by targeting female fertility genes in Drosophila melanogaster. They engineered split gene drives with multiplexed guide RNAs to target nine candidate genes, seeking to prevent functional resistance and achieve high drive conversion with minimal fitness costs. Among the targets, the stall (stl) and octopamine β2 receptor (oct) genes performed better, showing the highest inheritance rates in lab crosses. When tested in population cages, the stl drive was able to completely eliminate a fly population, but only when released at a high enough frequency, while other cages failed. These failures were traced and explained by fitness cost in drive-carrying females, caused largely by maternally deposited Cas9, which led to embryo resistance and reduced fertility. Through additional fertility assays and modeling, the team confirmed that the origin and timing of Cas9 expression, particularly from mothers, significantly impacted drive success. Surprisingly, even when Cas9 was driven by promoters with supposedly low somatic activity, such as nanos, fitness costs still persisted. The study revealed that while gene drives can be powerful, their effectiveness relies on finely balanced factors like promoter choice, drive architecture, and gene function. Overall, the research offers valuable lessons for designing robust, next-generation gene drives aimed at ecological pest control.

      -We sincerely appreciate the reviewer’s positive and thoughtful comments. We agree that the points raised highlight the importance of our findings and hope that our revisions have further improved both the clarity and overall content of the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this study, Xu and colleagues explored how CRISPR-based homing gene drives could be used to suppress insect populations by targeting female fertility genes in Drosophila melanogaster. They engineered split gene drives with multiplexed guide RNAs to target nine candidate genes, seeking to prevent functional resistance and achieve high drive conversion with minimal fitness costs.

      Here my comments about this work:

      Abstract: While the stated aim of the study on line 16 is to "maintain high drive conversion efficiency with low fitness costs in female drive carriers," the conclusion in lines 29-31 shifts focus toward the broader challenges and future optimization of gene drive systems. This conclusion does not clearly highlight the specific results of the study or how they relate directly to the original objective. It would be more effective to emphasize the actual findings, such as which target genes performed best and under what conditions, and how these findings support or contradict the stated goals. The study primarily aimed to assess the efficiency of specific female fertility genes and to evaluate strategies for minimizing the formation of functional resistance alleles, rather than proposing a protocol for optimization. Therefore, better alignment is needed between the study's aim, experimental design, and concluding statements. Clarifying this alignment would also help refine the paper's focus and more accurately communicate its contribution, including whether it is exploratory, comparative, or methodologically driven.

      Introduction: One of the key design elements in this study is the use of multiplexed gRNAs. It is reasonable to assume that this strategy may influence fitness costs, potentially in more than one way. Given that assessing fitness cost is a major focus of the study, it would be helpful to include a brief discussion of previous research examining how multiplexed gRNAs may impact fitness in gene drive systems. A short review of relevant studies, if available, would provide important context for interpreting the results and could help clarify whether any observed fitness costs might be attributed, at least in part, to the multiplexing strategy itself. This addition could be appropriately placed around line 102, where gRNA design is discussed.

      Line 42: Cas12a also showed efficacy using gene drives in yeast and Drosophila.

      Line 133: The paragraph begins by stating that homologs of the target genes were identified and aligned. To improve clarity, especially for readers who are new to gene drive research, it would be helpful to begin the paragraph with a brief introductory sentence explaining the purpose of this step. For example, you could state the importance of identifying and aligning homologs to assess the conservation of target sites across species, which is critical for evaluating the broader applicability of gene drive strategies. This context would guide the reader and clarify the relevance of the analysis.

      Lines 144-145: You mention that "the exception was tra, for which two constructs containing different gRNA sets were generated." For clarity, it would be helpful to provide a brief explanation of why two different gRNA sets were used for tra, and whether this differs from the approach taken with the other target genes. It's currently unclear whether all other genes were targeted using a single, standardized set of gRNAs, and this should be explicitly stated here for consistency, even though it is mentioned later in the plasmid construction section. Additionally, I suggest combining the sections on gRNA target design and plasmid construction. Since these components are closely related and sequential in the experimental workflow, presenting them together would improve the logical flow and help readers follow the methodology more smoothly.

      Line 210: The analysis of the cage experiments was based on models from previous studies that used a simplified assumption of a single gRNA at the target site. While I understand this approach has precedent, it raises important questions about potential limitations. Specifically, could simplifying the analysis to one gRNA affect the conclusions of this study, given that the experimental design involves multiplexed gRNAs with four distinct target sites? The implications of using this simplified model should be clearly addressed, as the dynamics of drive efficiency, resistance formation, and fitness effects may differ when multiple gRNAs are employed. Additionally, while I am not a statistician, it is worth asking whether more sophisticated modeling approaches could be applied to account for all four gRNAs, rather than reducing the system to a single-gRNA framework. A discussion of the modeling choices and their potential consequences would strengthen the interpretation of the results.

      Lines 297-300: Your results show that the expression of all target genes was higher in females, except for oct, which had higher expression in males. Additionally, oct expression decreased in adults. Given that oct is functionally important for ovulation and fertilization, processes that are primarily required in adult females, this pattern is somewhat unexpected. Could there be a possible explanation for the lower expression of oct, particularly in females and especially in adults, where its function would presumably be most critical? A brief discussion or hypothesis addressing this discrepancy would help clarify the biological relevance and interpretation of the expression data.

      Lines 346-347: What is the distance between the gRNA target sites within each gene? Are all of the gRNAs confirmed to be active? It would be valuable to include a table summarizing the distance between target sites for each gene, the activity levels of the individual gRNAs, and the corresponding homing rates. This would help determine whether there is a correlation between gRNA spacing and drive efficiency. For example, Lopez del Amo et al. (Nature Communications, 2020) demonstrated that even a 20-nucleotide mismatch at each homology arm can significantly reduce drive conversion. Including such a comparative analysis in your study could provide important insights into how gRNA arrangement influences overall drive performance and would be incredibly helpful for future multiplexing designs.

      Line 434: I was not able to find any sequencing data. This is important to evaluate gRNA activities and establish correlations with drive efficiency.

      Line 482: Did the authors test Cas9-only individuals (without the drive) against a wild-type population? This would help determine whether Cas9 alone has any unintended fitness effects. Additionally, is Cas9 expression stable over time and across generations? It would be helpful to include any observations or thoughts on the long-term stability and potential fitness impact of Cas9 in the absence of the drive element.

      Discussion: I would appreciate a more direct and clearly stated conclusion that summarizes the key findings of the study. While the discussion addresses the main outcomes in depth, presenting a concise concluding paragraph, either at the end of the discussion or as a standalone conclusion section, would provide a stronger and more definitive closing statement. This would help reinforce what the study ultimately achieved and ensure the main takeaways are clearly communicated to the reader.

      Overall, I believe this is an important study that offers valuable insights for advancing the design of CRISPR-based gene drives. The findings contribute to the development of more efficient and practical gene drive prototypes, bringing the field closer to real-world applications.

      Significance

      In this study, Xu and colleagues explored how CRISPR-based homing gene drives could be used to suppress insect populations by targeting female fertility genes in Drosophila melanogaster. They engineered split gene drives with multiplexed guide RNAs to target nine candidate genes, seeking to prevent functional resistance and achieve high drive conversion with minimal fitness costs. Among the targets, the stall (stl) and octopamine β2 receptor (oct) genes performed better, showing the highest inheritance rates in lab crosses. When tested in population cages, the stl drive was able to completely eliminate a fly population, but only when released at a high enough frequency, while other cages failed. These failures were traced and explained by fitness cost in drive-carrying females, caused largely by maternally deposited Cas9, which led to embryo resistance and reduced fertility. Through additional fertility assays and modeling, the team confirmed that the origin and timing of Cas9 expression, particularly from mothers, significantly impacted drive success. Surprisingly, even when Cas9 was driven by promoters with supposedly low somatic activity, such as nanos, fitness costs still persisted. The study revealed that while gene drives can be powerful, their effectiveness relies on finely balanced factors like promoter choice, drive architecture, and gene function. Overall, the research offers valuable lessons for designing robust, next-generation gene drives aimed at ecological pest control.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Paper summary

      The manuscript by Xu et al. presents an insightful and valuable contribution to the field of gene drive research. The strategy of targeting and disrupting female fertility genes using selfish homing genetic elements was first proposed by Burt in 2003. However, for this approach to be effective, the phenotypic constraints associated with gene disruption have meant that the pool of suitable target genes remains relatively small - notwithstanding the significant expansion in accessible targets enabled by CRISPR-based genome editing nucleases. Population suppression gene drives are well developed as proof-of-principle systems, with some now in the late stages of development as genetic control strains. However, advancing the pipeline will require a broader set of validated target genes - both to ensure effectiveness across diverse species and to build redundancy into control strategies, reducing reliance on any single genetic target. In their paper, the authors conduct a systematic review of nine female fertility genes in Drosophila melanogaster to assess their potential as targets for homing-based suppression gene drives. The authors first conduct a thorough bioinformatic review to select candidate target genes before empirically testing candidates through microinjection and subsequent in vivo analyses of drive efficiency, population dynamics, and fitness costs relating to fecundity and fertility. After finalising their results, the authors identify two promising candidate target genes - oct and stl - which both demonstrate high gene conversion rates and, regarding the latter, can successfully suppress a cage population at a high release frequency.
However, the manuscript suffers from a lack of in-depth discussion of a key limitation in its experimental design - namely, that the authors utilise a split-drive design to assess population dynamics and fitness effects when such a drive will not reflect release scenarios in the field. The review below highlights some major strengths and weaknesses of the paper, with suggestions for improvement.

      Key strengths

      The study's most significant strength is in its systematic selection and empirical testing of nine distinct genes as targets for homing-based gene drive, hence providing a valuable resource that substantially expands the pool of potential targets beyond the more commonly studied target genes (e.g. nudel, doublesex, among others). The identification of suitable target genes presents a significant bottleneck in the development of gene drives and the work presented here provides a foundational dataset for future research. The authors bolster the utility of their results by assessing the conservation of candidate genes across a range of pest species, suggesting the potential for broader application. A key finding in the paper is the successful suppression of a cage population using a stl-targeting gene drive (albeit at a high release frequency). This provides a critical proof-of-principle result demonstrating that stl is a viable target for a suppression drive. While in the paper suppression was not possible at lower release frequencies, together, the results provide evidence for complex population dynamics and threshold effects that may govern the success or failure of a gene drive release strategy - hence moving the conversation from a technical perspective ("can it work") to how a gene drive may be implemented. Moreover, the authors also employ a multiplexed gRNA strategy for all their gene drive designs and in particular their population suppressive gene drive targeting stl. This provides further proof-of-principle evidence for multiplexed gRNAs in order to combat the evolution of functional resistance following gene drive deployment. Finally, a further strength of this paper is in the clever dissection of fitness effects resulting from maternal Cas9 deposition. The authors design and perform a robust set of crosses to elucidate the parental source of fitness effects (i.e.
maternally, paternally, or biparentally derived Cas9), finding (as they and others have before) that embryonic fitness was significantly reduced when Cas9 was inherited from a maternal source. As discussed, the authors conclude that maternal deposition is particularly pronounced in the context of split drives as opposed to complete drives, with the implication being that a complete drive might succeed where a split-drive has failed; thus providing a key directive for future study.

      Concerns

      The manuscript's central weakness lies in its interpretation of the results from the cage experiments - namely that a split-drive system was used to "mimic the release of a complete drive". In the study, flies carrying the drive element (i.e. the gRNA) were introduced into a population homozygous for the Cas9 element over several generations. This design is likely not representative of a real-world scenario and, as the authors state, likely exaggerates fitness costs. This is because females carrying Cas9 will maternally deposit Cas9 protein into their eggs, with activity spanning several generations. When mated with a drive-carrying male the gRNA will immediately co-exist with maternally deposited Cas9, leading to early somatic cleavage and significant fitness costs (reflected in the authors' own fertility crosses). This is fundamentally different to how a complete drive would function in a real-world release, where complete-drive males would mate with wild-type females not carrying Cas9. Their offspring would carry the drive element but would not be exposed to maternally deposited Cas9, thus deleterious maternal effects would only begin to appear in the subsequent generation from females carrying the drive. Fitness costs measured from split-drive designs are therefore likely substantially overestimated compared to what would occur during the initial but critical release phase of a complete drive. This flaw weakens the paper's ability to predict the failure or success of the screened targets in a complete drive design, thus weakening the interpretation of the results from the cage trials. As a suggestion for improvement, the authors should explicitly and more prominently discuss the limitations of their split-drive model compared to complete drive models, both in the Results and Discussion. It is also recommended to include a schematic for both strategies that contrasts the experimental setup design (i.e. release of the drive into a Cas9 homozygous background) with a complete-drive release, clearly illustrating differences in maternal deposition pathways. This will not only contextualise the results and support the authors' conclusion that observed fitness costs are likely an overestimate but will further strengthen the arguments that the candidate target genes found in this study may still be viable in a complete-drive system.

      A second weakness in the manuscript relates to its limited explanation and discussion of key concepts. For example, the manuscript reports a stark difference in outcome of the two stl-targeting drives, where a high initial release in cage 1 led to population elimination versus a failure of the drive to spread in cage 2. The authors attribute this to vague "allele effects" and stochastic factors such as larval competition; however the results appear reminiscent of the Allee effect, which is a well-characterised phenomenon describing the correlation of population size (or density) and individual fitness (or per capita population growth rate). Using their results as an example, it is plausible that the high-frequency initial release in cage 1 imposed enough genetic load to quickly drive the population density below the Allee threshold, thus quickly leading to population eradication. In cage 2, the low-frequency initial release was insufficient to cross the Allee threshold. Omitting mention of this ecological principle greatly weakens the Discussion, and further presents a missed opportunity to discuss one of the more crucial strengths of the paper - that is, in providing a deeper insight into the practical requirements for successful field implementation. In a similar vein, the authors provide only a superficial mechanistic discussion into the fitness costs associated with drives targeting key candidate genes. The paper would benefit from a deeper discussion regarding the specific molecular functions of top-performing genes (stl, oct, nox) and how unintended Cas9 activity could disrupt their activity, integrating known molecular functions with observed fitness costs. For instance, oct encodes a G-protein coupled receptor essential for ovulation and oviduct muscle relaxation, thus disruption to the oct gene would directly impair egg-laying which would account for the observed phenotypic effects.
A deeper discussion linking unintended Cas9 activity to the specific, sensitive functions of target genes would elevate the paper from a descriptive screen to a more insightful mechanistic study.
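To make the threshold behaviour described in the Allee effect argument above concrete, the following is a minimal sketch of a generic, textbook-style discrete-time model with a strong Allee effect. It is not the authors' model and does not include any gene drive dynamics; the function names and all parameter values (growth rate, Allee threshold, carrying capacity) are hypothetical and chosen only for illustration.

```python
def step(n, r=0.02, allee_threshold=100.0, capacity=1000.0):
    """Advance a population by one generation under a strong Allee effect.

    Per-generation growth is negative below `allee_threshold`, positive
    between the threshold and `capacity`, and zero at either value.
    All parameter values are hypothetical.
    """
    growth = r * n * (n / allee_threshold - 1.0) * (1.0 - n / capacity)
    return max(n + growth, 0.0)


def simulate(n0, generations=2000, **params):
    """Iterate the model from an initial population size n0."""
    n = n0
    for _ in range(generations):
        n = step(n, **params)
    return n


# A population pushed below the threshold declines towards extinction,
# whereas one left above it recovers towards carrying capacity.
below = simulate(80.0)
above = simulate(150.0)
print(f"start at 80:  final size {below:.3f}")
print(f"start at 150: final size {above:.3f}")
```

In this sketch the only difference between the two runs is the starting size relative to the threshold, which is the kind of qualitative split in outcomes the reviewer suggests may underlie the contrast between cage 1 and cage 2.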

      It is curious that the authors chose two genes on the X chromosome as targets. In insects (such as Drosophila here) that have heterogametic sex chromosomes, homing is not possible in the heterogametic sex as there is no chromosome to home to - so there will be no homing in males. On top of that, there is usually some fitness effect in carrier (heterozygous) females, so in a population these are nearly always bad targets for drives - unless there is some other compelling reason to choose that target?

      Minor comments

      • Enhanced clarity in the Figures and data presentation would greatly improve readability. For example, Figure 5 is critical yet difficult to interpret; consider changing x-axis labels from icons to explicit text (e.g. "biparental Cas9", "maternal Cas9", "paternal Cas9"). Similarly, Figure 4 is difficult to read and the y-axis label "population size" is ambiguous; consider adding shapes or dashes (rather than relying solely on colour) and clarifying the y-axis (e.g. no. adults collected) in the legend.
      • Expand on or include a schematic to show the differences in construction between the tra-v1 and tra-v2 constructs to better contextualise the discrepancies in results (e.g. inheritance rates of 61%-66% for tra-v1 and 81%-83% for tra-v2).
      • Minor typos e.g.:
        • Line 87: "form" to "from"
        • Line 484: "expended" to "expanded"
        • Line 560: "foor" to "for"
        • Line 732: "conversed" to "conserved"
      • Clarify the split drive system: the authors introduce split drive for the first time in Line 118. They should at least give a clear definition and explanation of split drive and complete drive in the introduction.
      • Lines 237-238: The fitness evaluation lacks a clear description of controls. How were non-drive flies generated and validated as controls?
      • Lines 409-412 and line 423: The high inheritance rates of stl and oct drives are impressive; however, variation in results across Cas9 promoters should be explained further in the discussion.
      • Line 414: The CG4415 promoter yielded reduced drive conversion rates in females, yet is still referred to as a promising promoter. This conclusion seems optimistic and should be clarified or better justified.
      • Specify the number of flies released, sex ratio, and cage size per generation (Line 466). This is essential for reproducibility.

      Significance

      Overall the manuscript presents a valuable and timely resource for gene drive research, in particular for its systematic appraisal of potential target genes for population suppression drives and its rigorous assessment of the impact of maternal Cas9 deposition. The value in the generation and empirical testing of a novel multiplexed stl-targeting gene drive that led to population eradication in a cage trial should not be understated. While several key aspects of the discussion of the manuscript should be strengthened, the study presents a meaningful contribution to the field, extending previous work and outlining important considerations for the design and implementation of effective gene drive systems.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Xu et al. investigated split gene drive systems by targeting multiple female essential genes involved in fertility and viability in Drosophila. The authors evaluate the suppression efficiency through individual crosses and cage trials. Resistance allele formation and fitness costs are explored by examining the sterility and fertility of each line. Overall, the experimental design is sound and methods are feasible. The work is comprehensive, and conclusions are well supported by the data. This work offers informative insights that could guide the design of suppression gene drive systems in other invasive disease vectors or agricultural pests.

      However, several points require clarification or improvement:

      1. Methodological clarity: Some experimental details are insufficiently described, for example, regarding the setup of genetic crosses involving different Cas9 derivatives. In lines 197-198, "the mated females, together with females that were mated with Cas9 only males", it is unclear whether the latter group refers to gRNA-females.
      2. Regarding the inheritance rates, the authors included the reverse orientation of CG4415-Cas9, which, as I understand it, means that this component is in the reverse orientation relative to the fluorescent marker. Since it is standard to design adjacent components in opposite directions to avoid transcriptional interference, the rationale for including this comparison should be better justified.
      3. Embryo resistance is inferred from the percentage of sterile drive females derived from drive mothers. How many female individuals were analysed per line, and why was deep sequencing not employed to directly detect resistance alleles?
      4. Masculinisation phenotypes were observed upon disruption of the tra gene. How were strong intersexes distinguished from males? What molecular markers were used to determine genetic sex? This information should be clearly provided.
      5. It would be more appropriate to use "hatchability" rather than "fertility" when referring to egg-to-larva viability.
      6. In cage trials, a complete gene drive is mimicked by introducing Cas9 to the background population, but this differs from an actual complete gene drive, due to potential effects from separate insertion sites (different chromosomes or loci). These differences could impact the system's performance and should be discussed.
      7. Given the large amount of data presented, it would improve readability and interpretation if each result section concluded with a concise summary highlighting the key findings and implications.

      Significance

      The authors evaluate suppression efficiency through individual crosses and cage trials. Resistance allele formation and fitness costs are explored by examining the sterility and fertility of each line. Overall, the experimental design is sound and methods are feasible. The work is comprehensive, and conclusions are well supported by the data. This work offers informative insights that could guide the design of suppression gene drive systems in other invasive disease vectors or agricultural pests.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      We appreciate the reviewers' assessment of the significance of our work and would like to highlight where we believe the novelty of this study lies. Our findings identify E4BP4 as a key transcription factor that maintains mitochondrial homeostasis by restraining the overactivation of biological pathways - such as de novo ceramide synthesis - that are known to drive mitochondrial oxidative dysfunction in the context of obesity. We fully acknowledge that the link between C16:0 ceramide and mitochondrial fragmentation has been previously established. However, to our knowledge, our study is the first to connect this phenomenon to a transcriptional safeguard mechanism, thereby providing a new layer of understanding of how transcription factors preserve mitochondrial integrity and function in brown adipocytes. We believe this conceptual advance adds significant value to the field by framing E4BP4 as a transcriptional "guardian" of mitochondrial homeostasis.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      Reviewer #1 comment:

      Figures B: Sample size of EE experiments is too low to draw any meaningful conclusions or to know for certain if the data are reproducible. Small sample sizes, likely coming from one litter and one batch of AAV are prone to type I error.

      Response: We agree with the reviewer's observation that increasing sample size is essential to confirm reproducibility and robustness. We have therefore planned to repeat the EE experiments with a larger number of mice per group, derived from independent litters and AAV preparations, in order to strengthen the statistical power and validate the phenotype observed in the current study.

      Reviewer #1 comment:

      Figure 3I: Why do cells (none of the groups) show no response to NE stimulation? Please clarify or provide potential mechanistic insight. Perhaps the cells were not differentiated well.

      Response: We agree that the absence of a robust NE response in Figure 3I requires further clarification. To address this, we have planned to repeat the in vitro oxygen consumption assay to confirm the phenotype presented in the study.

      Reviewer #1 comment: Figures 3I vs 5N. There is a striking discrepancy between these panels. In both, cells were treated with palmitate for 6 h, yet the NE and CCCP responses differ significantly. Are these the same cell types and conditions? Please reconcile the differences.

      Response: We would like to clarify that Figures 3I and 5N represent different experimental systems: Figure 3I shows data from primary brown adipocytes with E4bp4 transgene overexpression, whereas Figure 5N shows data from immortalized brown adipocytes with Cas9-mediated mutation of a 65 kb Cers6 enhancer site. Given the distinct cell types and genetic manipulations, a direct comparison between these two panels is not appropriate. Nevertheless, we agree that confirming the consistency of the phenotype across systems is important. To address this, we have planned to repeat oxygen consumption assays in both models to further validate the reproducibility of the observed effects.

      Reviewer #2 comment: A key experiment is missing: does adding C16:0 block the mitochondrial benefits of E4BP4-OE?

      Response: We thank the reviewer for this excellent suggestion. We agree that a rescue experiment is important to directly test whether C16:0 affects the mitochondrial benefits of E4BP4. To address this, we have planned to perform a co-overexpression of E4bp4 and Cers6 in brown adipocytes. The readouts will include mitochondrial morphology and oxygen consumption, enabling us to determine whether restoration of C16:0 production mitigates the protective mitochondrial effects of E4BP4 overexpression. This experiment will provide direct mechanistic confirmation of the proposed model.

      Reviewer #2 comment: Whether PRDM16-OE mimics the effects of E4BP4 to induce p-Drp1 is not shown.

      Response: We thank the reviewer for this valuable suggestion. We agree that testing whether PRDM16 overexpression mimics the effects of E4BP4 on p-Drp1 is important to strengthen the mechanistic link between these transcription factors in terms of regulation of mitochondrial fragmentation. To address this, we have planned to include a Western blot analysis of p-Drp1 in PRDM16-OE brown adipocytes.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      Reviewer #1 comment:

      Figure 1F: There is an unexpected dip in gene expression at cold exposure days 3 and 7, followed by a rebound at day 14. Is this fluctuation biologically meaningful or technical?

      Response: We thank the reviewer for this thoughtful observation. A previous study demonstrated that E4bp4 (Nfil3) expression displays an early increase (at 2 hours), followed by a decrease in magnitude - while still remaining significantly higher than control - during beige adipocyte differentiation in response to forskolin treatment (DOI: 10.1016/j.molmet.2022.101619). The authors of that study suggested that E4bp4 may contribute to a second wave of cAMP-driven beige adipocyte differentiation. However, in the context of our work, further discussion on whether the fluctuations in BAT E4bp4 expression observed during cold exposure reflect biological regulation would be speculative. Importantly, despite these oscillations across time points, E4bp4 expression remained statistically significant compared with control, supporting the robustness of our findings. We have now introduced this observation in the Results section of the revised manuscript.

      Reviewer #1 comment: Figures 2H and 2I (GTT): How was the AUC calculated? The GTT and ITT curves appear largely parallel aside from fasting differences. If total AUC was used instead of incremental AUC, it may overstate group differences. The recommended method is outlined in [DOI: 10.1038/s42255-021-00414-7]. Also, since insulin's half-life is ~10 minutes, later differences in the ITT curve likely reflect counterregulatory responses driven by hepatic gluconeogenesis.

      Response: We would like to clarify that in our original manuscript we had already calculated the area of the curve (AOC) rather than the area under the curve (AUC), following the recommended approach (DOI: 10.1038/s42255-021-00414-7). Specifically, the AOC was derived by subtracting the baseline glucose value from each subsequent time point, ensuring that the analysis reflects incremental changes rather than absolute glucose levels. We have now made this description more explicit in the revised version to avoid any ambiguity.
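The distinction drawn in this exchange between total AUC and a baseline-subtracted area can be illustrated with a short calculation. The sketch below is an assumption-laden illustration, not the authors' analysis code: the function names and glucose values are hypothetical, and the trapezoidal rule is assumed as the integration method. It follows the description in the response, where the baseline value is subtracted from each subsequent time point before the area is computed.

```python
def trapezoid_area(times, values):
    """Area under a sampled curve using the trapezoidal rule."""
    return sum(
        (t1 - t0) * (v0 + v1) / 2.0
        for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:])
    )


def total_auc(times, glucose):
    """Total area under the glucose curve, including the fasting baseline."""
    return trapezoid_area(times, glucose)


def baseline_subtracted_area(times, glucose):
    """Area after subtracting the time-zero (baseline) glucose value."""
    baseline = glucose[0]
    return trapezoid_area(times, [g - baseline for g in glucose])


# Hypothetical GTT data (minutes, mg/dL): two groups with parallel curves
# that differ only in fasting glucose.
times = [0, 15, 30, 60, 90, 120]
group_a = [100, 250, 300, 220, 160, 120]
group_b = [g + 40 for g in group_a]

print(total_auc(times, group_a), total_auc(times, group_b))
print(baseline_subtracted_area(times, group_a),
      baseline_subtracted_area(times, group_b))
```

With these made-up numbers the total AUC differs between groups purely because of the fasting offset, while the baseline-subtracted area is identical, which is the concern the reviewer raises about parallel curves.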

      Reviewer #1 comment: Figure 4F: How was mitochondrial fragmentation quantified? Please ensure that the ROI boxes shown in zoomed panels match the same region in size and shape - this applies throughout the manuscript.

      Response: We thank the reviewer for this valuable comment. To improve the quality and interpretation of the data, we have now included a quantitative analysis of mitochondrial morphology parameters associated with Figure 4F (Figure S4B). Specifically, we analyzed:

      • Mitochondrial volume (µm³): reflecting overall mitochondrial size.
      • Surface area (µm²): reflecting membrane expansion.
      • Sphericity index: indicating morphological rounding, which increases with fragmentation.
      • Number of branches and branch junctions per mitochondrion: reflecting mitochondrial networking and fusion.

      Myriocin treatment preserved mitochondrial volume and surface area, reduced sphericity, and increased both the number of branches and branch junctions, reflecting maintenance of a more interconnected mitochondrial network.

      Additionally, we verified that the ROI boxes shown in the zoomed panels are consistent in both size and shape across groups, as requested. We have now introduced this observation in the Methods section of the revised manuscript.
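For readers unfamiliar with the sphericity index listed among the morphology parameters above, the sketch below shows one common definition (Wadell sphericity), computed from volume and surface area. Whether the authors' imaging software uses exactly this definition is an assumption; the function name and the example dimensions are hypothetical.

```python
import math


def sphericity(volume_um3, surface_area_um2):
    """Wadell sphericity: surface area of a sphere with the same volume,
    divided by the object's actual surface area (1.0 for a perfect sphere)."""
    equivalent_sphere_area = math.pi ** (1.0 / 3.0) * (6.0 * volume_um3) ** (2.0 / 3.0)
    return equivalent_sphere_area / surface_area_um2


# Hypothetical round (fragmented-like) object: sphere of radius 0.5 um.
r = 0.5
sphere_value = sphericity(4.0 / 3.0 * math.pi * r**3, 4.0 * math.pi * r**2)

# Hypothetical elongated (tubular) object: cylinder of radius 0.25 um, length 4 um.
radius, length = 0.25, 4.0
tube_volume = math.pi * radius**2 * length
tube_area = 2.0 * math.pi * radius * length + 2.0 * math.pi * radius**2
tube_value = sphericity(tube_volume, tube_area)

print(f"sphere: {sphere_value:.3f}, tube: {tube_value:.3f}")
```

Under this definition a rounded object scores close to 1 and an elongated tubule scores lower, which is consistent with the response's statement that sphericity increases with fragmentation.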

      Reviewer #1 comment: Figure 3A: The claim that one group contains smaller mitochondria is not convincing. Both small and elongated mitochondria appear in each group. Moreover, it is unclear whether these minor differences are of any physiological relevance or whether they drive phenotypes.

      Response: We respectfully disagree with this observation and would like to clarify a few points.

      1. We have already demonstrated a statistically significant difference in mitochondrial length between E4bp4-OE and control groups (Figures 3B and 3C). This was based on a random, unbiased analysis, which consistently confirmed longer mitochondria in E4bp4-OE compared with control.

      Some degree of variability in mitochondrial length is expected in electron microscopy analyses, particularly because mitochondria from multiple cell types within iBAT are captured. It is important to note that the protective action of E4bp4 against mitochondrial fragmentation occurs specifically in brown adipocytes, where the transgene is expressed under the control of the adiponectin promoter.

      To address the potential confounding heterogeneity of iBAT mitochondria, we performed complementary cell-autonomous analyses in vitro, allowing us to directly compare mitochondrial dynamics in E4bp4-OE versus control brown adipocytes. This analysis further confirmed that E4bp4-OE prevents lipid overload-induced mitochondrial fragmentation in brown adipocytes.

      Finally, we emphasize that several studies have demonstrated that changes in mitochondrial dynamics, particularly under high-fat diet conditions, disrupt systemic energy homeostasis (DOI: 10.1016/j.cmet.2017.05.010; DOI: 10.1016/j.cell.2019.05.008; DOI: 10.1038/s42255-024-00978-0). Therefore, the differences we report are biologically meaningful in the broader context of mitochondrial dynamics and metabolic disease.


      Reviewer #1 comment: Figure 3E: The claim that confocal microscopy reveals palmitate-induced mitochondrial fragmentation is difficult to discern. The images lack clear morphological differences.

      Response: We thank the reviewer for this observation. To improve the interpretation of these results, we have now included a quantitative analysis of mitochondrial morphology parameters associated with Figure 3E. Specifically, we measured:

      • Mitochondrial volume (µm³): reflecting overall mitochondrial size.
      • Surface area (µm²): reflecting membrane expansion.
      • Sphericity index: indicating morphological rounding, which increases with fragmentation.
      • Number of branches and branch junctions per mitochondrion: reflecting mitochondrial networking and fusion.

      As shown in the new analysis (Figure S4A), palmitate treatment reduced mitochondrial volume, surface area, branches, and branch junctions, while increasing sphericity, consistent with a more fragmented phenotype in control cells. In contrast, these effects were significantly attenuated in E4bp4-OE cells, supporting our conclusion that E4BP4 overexpression protects against lipid overload-induced mitochondrial fragmentation. This text was added in the Results section of the revised manuscript.

      We believe this additional analysis strengthens the robustness of our findings and provides clear quantitative evidence for the morphological changes that were less apparent from qualitative image inspection alone.

      Reviewer #1 comment: Figure 3G: Dendra2-labeled mitochondria appear unaffected by palmitate, raising concern about the robustness of the effect across readouts.

      Response: We respectfully disagree with this observation. As shown in Figure 3G (bar graphs), palmitate-treated brown adipocytes exhibited a clear reduction in mitochondrial co-localization, which reflects lower levels of fused mitochondria, in the control group compared with E4bp4-OE. Importantly, no difference in mitochondrial co-localization was observed between the two groups under vehicle-treated conditions. This indicates that E4bp4 overexpression does not promote mitochondrial fusion per se, but rather prevents lipid overload-induced mitochondrial fragmentation. We also note that the representative images presented in Figure 3G are single snapshots taken from a time-lapse assay of mitochondrial dynamics. To further illustrate this effect, we direct the reviewer to the supplementary video accompanying this experiment, which clearly demonstrates the differences in mitochondrial behavior over time.

      Reviewer #1 comment: Figure 5H: Were E4BP4 expression levels equivalent between WT and mutant cells? Quantification should be shown.

      Response: We thank the reviewer for this important point. We have now added the quantification of E4bp4 mRNA levels in cells transduced with either the non-mutated vector (control) or the vector carrying a mutation in the E4bp4 DNA-binding domain (Figure S5). The data show no significant difference in E4bp4 expression between the two groups.

      Reviewer #2 comment: The evidence of mitochondrial fragmentation is not convincing. In the reviewer's opinion, Figures 3E, 3G, 4F, and 5M demonstrated a decrease in mitochondrial quantity, but not fragmentation.

      Response: We thank the reviewer for this observation. We have already addressed the comments from reviewer #1 (above) regarding Figures 3E, 3G and 4F related to measurements of mitochondrial fragmentation. To strengthen the interpretation of these results, we have also performed a quantitative analysis of mitochondrial morphology parameters associated with Figure 5M. Specifically, we measured:

      • Mitochondrial volume (µm³): reflecting overall mitochondrial size.
      • Surface area (µm²): reflecting membrane expansion.
      • Sphericity index: indicating morphological rounding, which increases with fragmentation.
      • Number of branches and branch junctions per mitochondrion: reflecting mitochondrial networking and fusion.

      As shown in the new analysis (Figure S4C), palmitate treatment significantly reduced mitochondrial volume, surface area, and branching, while increasing sphericity, consistent with enhanced mitochondrial fragmentation in control cells. Notably, these changes were significantly blunted in the Cers6 enhancer edited cells (EΔ), supporting our conclusion that disruption of Cers6 protects against lipid overload-induced mitochondrial fragmentation. This text was added in the Results section of the revised manuscript.

      Regarding the reviewer's understanding of a "decrease in mitochondrial quantity, but not fragmentation," we respectfully disagree. The analyses performed for Figures 3E, 3G, 4F, and 5M clearly demonstrate that E4bp4 overexpression (E4bp4-OE) prevents lipid overload-induced mitochondrial fragmentation.

      In relation to mitochondrial quantity, our data do not support differences in mitochondrial biogenesis between groups. Specifically, the expression of thermogenic and mitochondrial biogenesis genes (Figure S2G) as well as the mitochondrial-to-nuclear DNA ratio (Figure S3D) showed no significant changes, indicating that mitochondrial biogenesis is not altered.

      Alternatively, it is possible that E4bp4 prevents mitophagy, as our results (Figure 3H) show that E4bp4-OE protects against lipid overload-induced mitochondrial depolarization. In this regard, previous studies have demonstrated that fragmented and depolarized mitochondria are targeted for degradation through mitophagy (DOI: 10.2337/db07-1781; DOI: 10.1074/jbc.M111.242412). While this explanation is consistent with our findings, we acknowledge that it remains speculative at this stage and, although interesting, is beyond the scope of the current study.

      Reviewer #2 comment: It is confusing whether the association shown in Figure 1C is a positive or an inverse association.

      Response: We thank the reviewer for pointing out this source of confusion. Figure 1C represents common variant associations for E4BP4, where the y-axis indicates the strength of association (-log10 p-value) rather than the direction (positive or inverse) of the effect. We have clarified this in the revised manuscript to avoid misinterpretation. The associations indicate that genetic variants in E4bp4 are positively linked with anthropometric traits such as weight, BMI, and waist-hip ratio.
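The point that a -log10 p-value axis conveys strength but not direction can be shown with a small example. The sketch below uses entirely hypothetical variants and effect sizes (none are taken from the manuscript); it only illustrates that two associations with opposite signs of effect plot at the same height, so direction has to be read from the effect estimate.

```python
import math


def neg_log10_p(p_value):
    """Height on a -log10(p) axis for a given association p-value."""
    return -math.log10(p_value)


# Hypothetical variants: same p-value, opposite effect directions (beta).
variants = [
    {"name": "variant_A", "beta": 0.03, "p": 1e-8},
    {"name": "variant_B", "beta": -0.03, "p": 1e-8},
]

for v in variants:
    direction = "positive" if v["beta"] > 0 else "inverse"
    print(v["name"], round(neg_log10_p(v["p"]), 2), direction)
```

Both hypothetical variants sit at the same y-axis position, so a plot of this kind needs the effect sizes reported alongside it to answer the reviewer's question about direction.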

      5. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision.

      This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      Reviewer #2 comment:

      It would be worthwhile to investigate whether in vivo knockdown of E4BP4 blunts the Cers6-suppressing effects of PRDM16-OE.

      Response: We agree that assessing in vivo loss-of-function of E4bp4 in the context of Prdm16 overexpression would be highly informative. At present, this experiment is technically not feasible, as it would require the generation and characterization of complex in vivo models beyond the scope of the current study. Nevertheless, we are actively considering this as a future direction. In the meantime, we believe that the in vitro experiments in brown adipocytes provided here are sufficient to establish the mechanistic relationship between E4BP4 and PRDM16 in the regulation of Cers6 expression.

      Reviewer #2 comment: Whether E4BP4-OE affects cold tolerance in mice is not shown.

      Response: We thank the reviewer for this thoughtful comment. In our study, we performed an iBAT-specific E4bp4 gain-of-function assay because we observed a downregulation of E4bp4 expression in the context of obesity. The rationale for this approach was to rescue E4bp4 expression in iBAT and thereby evaluate its systemic and mechanistic effects under obesogenic conditions. We recognize that a gain-of-function assay during cold challenge would further enhance E4bp4 expression and, while interesting, this would more directly address the role of E4bp4 in thermogenic regulation rather than in obesity-related metabolic dysfunction. For this reason, we believe that a detailed investigation of E4bp4 in cold-induced thermogenesis is an important but separate question that lies beyond the scope of the current study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary: The manuscript by Valdivieso-Rivera et al. investigates the role of a transcription factor, E4BP4, in brown fat function. Using in vivo AAV gain-of-function studies, in vitro primary cultured brown adipocytes, and transcriptional regulation studies, the authors find that E4BP4 works together with PRDM16 to suppress Cers6 transcription and the production of its derived ceramide, C16:0. The resulting decrease in C16:0 prevents diet-induced mitochondrial fragmentation within brown adipocytes, thereby promoting brown fat function. Overall, this study employed state-of-the-art methodologies, and the collected evidence generally supports the conclusion. However, some issues remain to be addressed.

      Major Comments:

      1. The evidence of mitochondrial fragmentation is not convincing. In the reviewer's opinion, Figures 3E, 3G, 4F, and 5M demonstrated a decrease in mitochondrial quantity, but not fragmentation.
      2. Whether E4BP4-OE affects cold tolerance in mice is not shown.
      3. A key experiment is missing: does adding C16:0 block the mitochondrial benefits of E4BP4-OE?
      4. Whether PRDM16-OE mimics the effects of E4BP4 to induce p-Drp1 is not shown.

      Minor points:

      1. It is confusing whether the association shown in Figure 1C is a positive or an inverse association.
      2. Results from the PRDM16-OE model were mostly obtained in cultured brown adipocytes. It would be worthwhile to investigate whether in vivo knockdown of E4BP4 blunts the Cers6-suppressing effects of PRDM16-OE.

      Cross-commenting

      Reviewer #1's comments are all solid, and I agree with all of them.

      Significance

      Key strengths include state-of-the-art methodologies and detailed mechanistic studies. Key limitations include some unconvincing staining data, lack of key "rescue" experiments, and less novelty in molecular mechanisms (the ceramide-Drp1 pathway).

      Overall, this study uncovers a critical role of E4BP4 in maintaining brown adipocyte mitochondrial integrity and function, advancing our understanding of TFs in brown fat biology. This study fits well with readers' interests in the adipose biology and metabolism fields.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary of the key results:

      Valdivieso-Rivera and colleagues present a novel regulatory mechanism by which E4BP4 modulates C16:0 ceramide production in brown adipocytes. Several points warrant clarification or additional data.

      Suggested improvements:

      1) Figure 1F: There is an unexpected dip in gene expression at cold exposure days 3 and 7, followed by a rebound at day 14. Is this fluctuation biologically meaningful or technical?

      2) Figures B: Sample size of EE experiments is too low to draw any meaningful conclusions or to know for certain if the data are reproducible. Small sample sizes, likely coming from one litter and one batch of AAV are prone to type I error.

      3) Figures 2H and 2I (GTT): How was the AUC calculated? The GTT and ITT curves appear largely parallel aside from fasting differences. If total AUC was used instead of incremental AUC, it may overstate group differences. The recommended method is outlined in [DOI: 10.1038/s42255-021-00414-7]. Also, since insulin's half-life is ~10 minutes, later differences in the ITT curve likely reflect counterregulatory responses driven by hepatic gluconeogenesis.

      4) Figure 3I: Why do the cells in all groups fail to respond to NE stimulation? Please clarify or provide potential mechanistic insight. Perhaps the cells were not differentiated well.

      5) Figure 4F: How was mitochondrial fragmentation quantified? Please ensure that the ROI boxes shown in zoomed panels match the same region in size and shape - this applies throughout the manuscript.

      5) Figures 3I vs 5N: There is a striking discrepancy between these panels. In both, cells were treated with palmitate for 6 h, yet the NE and CCCP responses differ significantly. Are these the same cell types and conditions? Please reconcile the differences.

      6) Figure 3A: The claim that one group contains smaller mitochondria is not convincing. Both small and elongated mitochondria appear in each group. Moreover, it is unclear whether these minor differences are of any physiological relevance or whether they drive phenotypes.

      7) Figure 3E: The claim that confocal microscopy reveals palmitate-induced mitochondrial fragmentation is difficult to discern. The images lack clear morphological differences.

      8) Figure 3G: Dendra2-labeled mitochondria appear unaffected by palmitate, raising concern about the robustness of the effect across readouts.

      9) Figure 5H: Were E4BP4 expression levels equivalent between WT and mutant cells? Quantification should be shown.
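
      The distinction raised in point 3 between total and incremental AUC can be sketched as follows. This is a minimal illustration, not the authors' analysis: the glucose values and time points are hypothetical, the trapezoidal rule is assumed, and "incremental" here means the net area after subtracting each animal's own fasting (t = 0) value.

```python
from typing import Sequence

def trapezoid_auc(times: Sequence[float], values: Sequence[float]) -> float:
    """Area under a curve by the trapezoidal rule."""
    return sum(
        (t1 - t0) * (v0 + v1) / 2.0
        for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:])
    )

def total_auc(times: Sequence[float], glucose: Sequence[float]) -> float:
    """Total AUC: area between the curve and zero."""
    return trapezoid_auc(times, glucose)

def incremental_auc(times: Sequence[float], glucose: Sequence[float]) -> float:
    """Net incremental AUC: area relative to the fasting (t = 0) value."""
    baseline = glucose[0]
    return trapezoid_auc(times, [g - baseline for g in glucose])

# Hypothetical GTT curves (mg/dL): parallel, differing only by a 20 mg/dL fasting offset.
times = [0, 15, 30, 60, 120]            # minutes
group_a = [100, 250, 300, 200, 120]
group_b = [120, 270, 320, 220, 140]

print(total_auc(times, group_a), total_auc(times, group_b))
print(incremental_auc(times, group_a), incremental_auc(times, group_b))
```

      With these made-up curves the total AUCs differ while the incremental AUCs are identical, which is the scenario the reviewer is asking the authors to rule out.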

      Cross-commenting

      I agree with R2's points

      Significance

      This advance is incremental for the basic science community.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03098

      Corresponding author: Pedro Escoll

      1. General Statements

      Our study investigates the interplay between the metabolism of host cells and the intracellular replication of Salmonella enterica serovar Typhimurium (ST). Type III Secretion Systems (T3SSs) are considered essential for ST to replicate within macrophages. However, we found that restricting macrophages to different bioenergetic contexts, such as supplementing them with glycerol, modulates bacterial replication and remarkably, enables a T3SS-deficient ST mutant (ΔprgHssaV) to replicate intracellularly. This T3SS-independent replication occurs within the Salmonella-containing vacuole (SCV) and is driven by the capacity of the host cell to provide these preferred nutrients, rather than by the host glycolytic activity itself.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary:

      In this manuscript, the authors investigate how host cell metabolic heterogeneity influences the intracellular replication of Salmonella enterica serovar Typhimurium. They use live-cell imaging of infected human primary macrophages to reveal that bacterial replication does not occur uniformly across infected cells. They demonstrate that supplementation with specific carbon sources-used by Salmonella during infection-promotes bacterial replication and increases the proportion of macrophages supporting intracellular growth. These effects are seen even in the absence of functional Type III Secretion Systems (T3SS), using a ΔprgHssaV double mutant. The authors further suggest that this replication enhancement is not strictly dependent on host glycolytic activity but rather on the host cell's ability to import nutrients. Their findings imply that intracellular Salmonella can exploit host cell metabolism to grow, even without its canonical virulence secretion systems, under nutrient-favorable conditions.

      Major Concern:

      While the topic is potentially interesting, the novelty is not fully clear. The concept that nutrient availability impacts intracellular Salmonella replication, largely via T3SS2 function, has been addressed previously (e.g., Liss et al., 2017). The finding that added exogenous carbon sources can enhance bacterial growth is thus not unexpected. The key claim-that Salmonella can replicate intracellularly even in the absence of T3SS function-would be significantly strengthened by demonstrating whether this is specific to Salmonella, or whether similar effects are seen with non-intracellular organisms such as E. coli K-12. If the phenomenon is unique to Salmonella, this would suggest a pathogen-specific mechanism beyond general metabolic support.

      As acknowledged by the Reviewer, the novelty and key claim of our work is that Salmonella can replicate intracellularly even in the absence of T3SS. To support that claim experimentally, we showed that providing macrophages with the preferred carbon sources used by Salmonella during infection, such as glycerol, bypasses the requirement for both T3SSs for Salmonella to grow intravacuolarly inside macrophages.

      With respect to the article mentioned by the Reviewer (Liss et al. 2017, ref 36 in the manuscript), there are three important novel insights provided by our work: i) we show that Salmonella can replicate intracellularly in the SCV even in the absence of T3SS if certain carbon sources are provided; ii) we show the preference of Salmonella for certain carbon sources intracellularly such as glycerol and galactose (but not preferentially glucose); and iii) we have extended our observations to primary human macrophages in addition to RAW cells.

      We are not convinced that the experiment suggested by the Reviewer to use E. coli K12 (ECK12) is necessary to support our findings for Salmonella, but we propose to add the requested experiment. Briefly, we will infect hMDMs and RAW macrophages with ST-WT-GFP, ST-ΔprgHΔssaV or ECK12-WT-GFP, while culturing macrophages on different carbon sources (glucose, glycerol, galactose, fructose). Then we will monitor intracellular bacterial growth. By comparing bacterial growth of ST double mutant with ECK12-WT-GFP under favorable carbon sources such as glycerol, the results will be definitive to answer whether this phenomenon is unique to Salmonella or not.

      Specific Comments:

      1. Figure 1H: The effect shown here is not compelling due to inconsistent y-axis scaling. Panels 1B, 1C, and 1D should use a unified axis range with 1H to allow direct visual comparison of growth dynamics.

      Thank you, we will change it as suggested.

      Figures 1B, 1C, 1G, 1H: The current presentation of individual growth traces makes it difficult to appreciate the population-level trend. A smoothed average line overlaid on these plots could better represent the average dynamics of replicative vs. non-replicative infections. Or alternatively the total fraction of cells that proliferate summarized as a segmented bar plot (possibly binned per time point).

      We will plot the results as suggested, the total fraction of infected cells harboring bacteria that proliferate as a segmented bar plot, binned per time point.

      Figure 2G: This panel would benefit from including a comparable condition with the SPI-1/SPI-2 double mutant to aid interpretation. Additionally, the authors should explore whether this nutrient-supported replication is seen in non-phagocytic cells such as HeLa or Caco-2, which would help delineate whether the observed phenomenon is macrophage-specific.

      The graph requested by the Reviewer is Figure S1D. As we are representing ST growth in macrophages supporting Salmonella replication, some of the conditions, such as lactate, cannot be shown in the infection conditions using the double mutant because there are no cells supporting the replication of the double mutant, so there are no cells to plot.

      As suggested, we are also going to perform the same experiments in HeLa cells to investigate whether the observed phenomenon is macrophage specific.

      Line 117: The sentence stating that the double mutant can undergo "exponential intracellular growth even in the absence of T3SS-dependent secretion" is an overstatement. The data suggest only a modest improvement in growth, restricted to a minority of infected cells. This claim should be revised accordingly, as should similar overstatements in the discussion (e.g., lines 203-204).

      We will remove the term 'exponential' and revise the sentence at line 117 and those in the discussion. Line 203-204 will be: 'we demonstrated that providing macrophages with preferred nutrients allows a subpopulation of ST to replicate intracellularly without the need for a functional T3SS'.

      Line 162: The authors should clarify that glycerol had the strongest effect in primary macrophages, while multiple alternative carbon sources had notable effects primarily in RAW cells.

      We will add this clarification in the text.

      Lines 198-201: This relates to the major concern. The authors should assess whether the observed growth enhancement is unique to Salmonella by testing other bacteria not known for intracellular replication. This would clarify whether the effect is due to general nutrient-driven host cell permissivity or a pathogen-specific adaptation.

      As outlined above, we will perform the suggested experiment with E. coli K12 to answer whether this phenomenon is unique to Salmonella or not.

      RAW 264.7 Observations: The modest intracellular growth of SPI-1/SPI-2 double mutants in RAW cells is consistent with prior observations in the field. The idea that nutrient availability explains this is noteworthy. The authors might consider whether differences in standard culture media (e.g., glucose concentration) influence these outcomes. This could have broader implications for reproducibility in infection models.

      Thank you for the suggestion, we will include a paragraph discussing whether differences in standard culture media might influence bacterial replication. Indeed, to answer also a question from Reviewer #2, we will include a new supplementary Figure where we have already compared "no Glucose" (0 mM), "low Glucose" (2 mM) and standard culture media Glucose levels (10 mM). Our results show that differences in Glucose levels in the culture media influence Salmonella intracellular growth in hMDMs and RAW macrophages (see Figure below).

      Reviewer #1 (Significance):

      This manuscript highlights how host cell metabolism and nutrient availability can influence intracellular Salmonella replication. While the findings are intriguing, the current framing overstates their novelty and impact. Key revisions-such as comparative experiments with non-pathogenic bacteria and non-phagocytic cells, consistent figure scaling, and more measured language-would improve the clarity and significance of the work. If the authors can show Salmonella-specific mechanisms at play, the study could offer important insights into host-pathogen metabolic interactions.

      We believe that performing all experiments suggested by the Reviewers, as well as the requested changes in the text to avoid overstatements, will improve the manuscript and will offer readers new insights and details to better understand the metabolic interactions happening between host and pathogens and how they can shape bacterial virulence.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary: In their study titled "Provision of Preferred Nutrients to Macrophages Enables Salmonella to Replicate Intracellularly Without Relying on Type III Secretion Systems", Dr. Garcia-Rodriguez et al. describe the influence of the host cell metabolism on the intracellular proliferation potential of Salmonella during infection. The authors investigate whether the supplementation of the media with different carbon sources has an impact on the intracellular lifestyle of Salmonella. By using single cell tracking in live-cell microscopy, including the use of different reporter strains, they describe that glycerol benefits Salmonella's ability to grow within its vacuolar niche, in part, interestingly, in a Type-3-Secretion System independent manner.

      They furthermore highlight the dependence on host background for this observation by showing that effects differ between cells of varying metabolic activity. Throughout their study, they use cutting-edge methodologies, as well as Salmonella strains that could be of versatile use in other investigations. This work, while limited to in vitro models for now, has implications for the better understanding of how pathogens and their host are intertwined. This, in turn, has significance for the development of new anti-infective strategies further down the line. I therefore believe that it should be disseminated to the research community. The following comments summarize ideas how the quality of the study could be improved:

      Major comments:

      1. Salmonella, especially when cultured to activate the SPI-1 T3SS, induce rapid cell death in their host - most commonly through activation of the NLRC4 inflammasome and downstream pyroptotic signaling. The authors don't describe the effect of the infection in differently supplemented media on host cell death, yet it would be important to elucidate whether this cellular response is also altered.

      We have performed these experiments and tracked host cell death by measuring Annexin-V levels in single cells during infection in the conditions using the different supplements. We will include these results in the revised version of the manuscript and main text. Please see the Figure below showing that the different carbon sources did not significantly affect macrophage cell death (future Figures S1E and S1F).

      The aspect of partially T3SS-independent growth enhancement by glycerol (and depending on the host background glucose) is most curious. The authors quantify this by determining the percentage of cells containing proliferating Salmonella and by tracking individual cells over the time course of the infection. I am missing a general statement on whether the initial infection rate (i.e. timepoint 0) is comparable across conditions and mutants, and whether possible discrepancies in the infection rate could have downstream effects on the statements and claims made in the manuscript. This is, to my mind, also important for the quantification of cytosolic and vacuolar bacteria. There, the authors always speak in "percent of infected cells", so it is relevant whether the number of infected cells varies among conditions (see e.g. Figure 3).

      We thank the reviewer for this comment. The initial infection rate at t=0 significantly differs between WT and mutants in RAW 264.7 macrophages, and carbon source supplementation has no effect. However, as we only analyze infected cells, this does not affect the final results. In any case, we are going to add the graphs of % of infected cells at t=0 as supplementary Figures S1G-K.

      The authors use a concentration of 10mM for all supplemented alternative carbon sources. It would be useful to discuss the rationale behind this approach, including whether all chemicals have the same ability to be taken up by the cell. A concentration series (at least for some of the tested compounds) may be beneficial to bolster the conclusions that the authors make.

      We use 10 mM as this is the concentration of Glucose in standard culture media. By using 10 mM for all the different carbon sources, we can thus compare them keeping concentration constant (10 mM). Indeed, to answer also Reviewer #1, we will include in the manuscript a paragraph discussing whether differences in standard culture media might influence bacterial replication. As this Reviewer suggested, we will include a new supplementary Figure comparing no Glucose (0 mM), low Glucose (2 mM) and standard culture media Glucose levels (10 mM), showing that the concentration of glucose has a gradual effect in supporting the replication of the T3SS-deficient strain in RAW macrophages (see Figure below).

      I think it would strengthen the study, if the authors used host cell mutants in certain metabolite transporters, or alternatively Salmonella mutants that are deficient in uptake or metabolism of some of the compounds used in this study. This point is alluded to in the discussion, and I believe if the authors could show that in certain host mutant backgrounds the impact of supplementation with alternative carbon sources can be reversed, it would immensely bolster the strength of the claims.

      Following the Reviewer's suggestion, we generated ST metabolic mutants unable to metabolize glycerol, galactose or fructose. As seen in the Figures below, during infection, supplementation with glycerol/galactose does not boost Salmonella replication in the metabolic mutants as it does in WT conditions, demonstrating that supplemented carbon sources indeed reach the bacteria within the SCV and are used by intracellular Salmonella to grow. These Figures will become future Figure 4J-N.

      I think it would be useful to include the meaning of this work for other intracellular pathogens in the discussion section: Do the authors believe that this phenotype is Salmonella-specific? If the pathogens are at hand, it might be interesting to infect with other intracellular bacteria, such as Shigella or Francisella to investigate if the boosting of growth by glycerol also holds true for these.

      We have performed experiments with Legionella pneumophila and galactose (see figure below), showing that the effect of this carbon source is specific to Salmonella (as shown in Figure 4F in the manuscript). We could also perform experiments with L. pneumophila and glycerol to answer the Reviewer's question. However, we think that the results with Legionella might be outside the focus of this article and would themselves constitute a new article, as the two pathogens have very different, non-comparable intracellular metabolisms. Thus, the experiment suggested by Reviewer #1 using E. coli K12 (ECK12) while culturing macrophages on different carbon sources (glucose, glycerol, galactose, fructose) is in our opinion a better fit. We will monitor intracellular bacterial growth and, by comparing bacterial growth of the ST-ΔprgHssaV double mutant with ECK12-WT-GFP under favorable carbon sources such as glycerol, the results will be definitive to answer whether this phenomenon is unique to Salmonella or not.

      Minor comments:

      • Line 41: The authors write "are required for", but given their findings, it might be more accurate to phrase this as "have previously been described to be required for" or "have previously been described essential for".

      We will change it.

      • Line 86: Is the referencing of Figure S1C correct or should it be S1A?

      Yes, thank you, it is S1A, we will change it.

      • Lines 119,120: Related to what is displayed in Figure 2G: Are these differences significant?

      Glucose, galactose and lactate curves are significantly different compared to control (p

      • Lines 126,127: What is the change for glycerol, and is the intracellular growth significantly higher compared to the control?

      6.2 ± 1.9% in glycerol vs. 2 ± 1% in control, p

      • Figure 1E&F: Related to one of the major comments: Would it be possible to quantify this at timepoint 0 to ensure that the initial infection rates are the same across conditions?

      As outlined above, we will add the graphs of % of infected cells at t=0 as supplementary Figures S1G-K (Major Comment number 2 from this Reviewer)

      • Figure 3E,F: Why does the sum of the curves not add up to 100% (especially in the beginning)? And related to that, why do both the percentage of cytosolic and vacuolar cells grow over time? Since this infection is performed with gentamycin present, re-infection should not be possible.

      The localization module of the SINA plasmid relies on transcriptional reporters, whose expression requires time for induction and detection. Therefore, at early time points, infected cells are not classified as vacuolar or cytoplasmic because the reporters have not yet been expressed (as described in PLoS Pathog. 2021;17(4):e1009550, PMID: 33930101).

      At later time points, a subset of cells harbors bacteria that do not express any of the reporters. These bacteria are considered dormant, representing about 10% of the population, as detailed in the same article. In addition, a small percentage of infected cells simultaneously contain both STvac and STcyt. Such cells are subclassified as harboring STcyt but also STvac. Consequently, the total proportion of infected cells carrying STvac and STcyt may also exceed 100%.
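
      The arithmetic behind totals that fall below or rise above 100% can be sketched with hypothetical counts. The numbers below are invented for illustration and the function name is not from the manuscript; the sketch assumes, as the response describes, that a cell with both reporters is counted in both categories and that cells without any reporter signal are counted in neither.

```python
from typing import List, Tuple

Cell = Tuple[bool, bool]  # (has vacuolar reporter signal, has cytosolic reporter signal)

def percent_by_reporter(cells: List[Cell]) -> Tuple[float, float]:
    """Return (% of infected cells with STvac, % of infected cells with STcyt)."""
    n = len(cells)
    vac = sum(1 for has_vac, _ in cells if has_vac)
    cyt = sum(1 for _, has_cyt in cells if has_cyt)
    return 100.0 * vac / n, 100.0 * cyt / n

# Hypothetical early time point: most reporters not yet expressed.
early = [(True, False)] * 20 + [(False, True)] * 5 + [(False, False)] * 75

# Hypothetical late time point: some cells carry both populations, a few carry dormant bacteria.
late = [(True, False)] * 60 + [(False, True)] * 25 + [(True, True)] * 10 + [(False, False)] * 5

for label, cells in (("early", early), ("late", late)):
    pct_vac, pct_cyt = percent_by_reporter(cells)
    print(label, pct_vac, pct_cyt, pct_vac + pct_cyt)
```

      In this toy population the early sum is well below 100% and the late sum exceeds it, even though the number of infected cells never changes.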

      • Figure S1A: While significance testing is described in the legend, there are no indications of significance in the figure panels.

      The Reviewer is right, there are no significant changes between conditions; we will change the significance testing to ns=non-significant.

      • Figure S1B: Due to the stark discrepancies between hMDMs and RAW264.7, it might make sense to plot them on two different y-axes. Furthermore, I would clarify the y-axis: In the legend, it seems as CFU counts are shown, while CFU/ml/t2 rather describes a change over time.

      We agree. However, we will maintain the scale of the Y-axis, as Reviewer #1 requested that the Y-axes be consistent. We will change the legend to indicate that we plot CFU/ml/t2.

      • Figure S1C: The prgH-mutant seems to outperform the wildtype in intracellular proliferation, while the double mutant underperforms compared to the ssaV-mutant. Could you please discuss/explain how the prgH-deletion has seemingly opposite effects on intracellular proliferation, depending on whether it is introduced in a wildtype or ssaV-KO background?

      As T3SS-1 plays a role in inducing macrophage cell death via activation of the NLRC4 inflammasome, macrophages infected with bacteria carrying a functional T3SS-1 (such as WT) are more prone to undergo cell death at late time-points, which disrupts bacterial proliferation and reduces the proportion of infected cells. Thus, these dead cells were not considered in the analysis. Even if cell death of ST-WT-infected RAW macrophages remains below 5%, more ΔprgH-infected cells are considered in the analyses at late time-points, and ST-ΔprgH continues replicating (and growing in ST area).

      • Figure S2A: As for the comments related to Figure 3, I am unsure how the sum of STvac and STcyt can deviate from 100. This is especially puzzling for the red curve (glycerol) at e.g. 3hpi, when the sum of the two clearly seems to be larger than 100.

      At early time points, no infected cells are classified as vacuolar or cytoplasmic because the reporters have not yet been expressed. At later time points, a subset of cells harbor bacteria that do not express any of the reporters, which are considered dormant (10% of the population). Finally, a small percentage of infected cells simultaneously contain both STvac and STcyt, therefore the total proportion of infected cells carrying STvac and STcyt may also exceed 100%.

      **Cross-commenting** I agree in principle with the comments raised by Reviewer #1 - especially when it comes to the enhancement in significance if the authors assess the species specificity. Elucidating whether the growth enhancement is Salmonella-specific, occurs for other intracellular pathogens (e.g. Shigella, Francisella) or also for extracellular bacteria (e.g. E. coli, Yersinia), would definitely strengthen the study.

      As said before, for the revision we are going to perform the experiments suggested by Reviewer #1 of using E. coli K12 (ECK12) while culturing macrophages on different carbon sources (glucose, glycerol, galactose, fructose). And to satisfy this Reviewer's curiosity, we are going to perform experiments also with L. pneumophila and glycerol.

      Reviewer #2 (Significance):

      General assessment:

      As the authors write in their discussion, the strength of this study is also its limitation: Using single cell tracking in microscopy is a very elegant and powerful approach, yet conversely, it limits the scope of the study to in vitro approaches. While it enables assessment of bacterial pathogenicity and host-dependence on a single-cell level, it remains to be investigated whether the conclusion that the authors draw from their work will hold in more complex or physiologically relevant models.

      During the preparation of this Revision Plan, we discovered the article published in PLoS Pathogens by Andrew Grant and Pietro Mastroni "Attenuated Salmonella Typhimurium Lacking the Pathogenicity Island-2 Type 3 Secretion System Grow to High Bacterial Numbers inside Phagocytes in Mice" (PLoS Pathog 2012 8(12): e1003070, PMID: 23236281). In this article, authors showed that our main conclusion is also relevant in vivo (Salmonella Typhimurium can replicate within macrophages in the absence of T3SS). This will be addressed in the Discussion of the revised manuscript. Our study provides a metabolic explanation, at the single cell level for those observations.

      A further small shortcoming of the study is the heavy focus on the bacterial aspect in this host-pathogen interaction. While the authors do link the proliferative potential of the intracellular bacteria to the metabolic status of the individual host cell, more could be done with respect to host responses in the varying media compositions, including investigating alterations to the cell cycle, induction of cell death, or the ability to activate inflammatory signaling.

      We agree, and we are actively investigating how restricting macrophages to specific carbon sources impacts other host responses, such as cytokine production. For the revised manuscript, we will add the results on the induction of cell death.

      Nonetheless, this study is of large interest to the field and the systematic approach to addressing their hypotheses speaks to the scientific excellence of the investigators.

      Thank you.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      N/A

      4. Description of analyses that authors prefer not to carry out

      N/A

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In their study titled "Provision of Preferred Nutrients to Macrophages Enables Salmonella to Replicate Intracellularly Without Relying on Type III Secretion Systems", Dr. Garcia-Rodriguez et al. describe the influence of the host cell metabolism on the intracellular proliferation potential of Salmonella during infection. The authors investigate whether the supplementation of the media with different carbon sources has an impact on the intracellular lifestyle of Salmonella. By using single cell tracking in live-cell microscopy, including the use of different reporter strains, they describe that glycerol benefits Salmonella's ability to grow within its vacuolar niche, in part, interestingly, in a Type-3-Secretion System independent manner.

      They furthermore highlight the dependence on host background for this observation by showing that effects differ between cells of varying metabolic activity. Throughout their study, they use cutting-edge methodologies, as well as Salmonella strains that could be of versatile use in other investigations. This work, while limited to in vitro models for now, has implications for the better understanding of how pathogens and their host are intertwined. This, in turn, has significance for the development of new anti-infective strategies further down the line. I therefore believe that it should be disseminated to the research community. The following comments summarize ideas on how the quality of the study could be improved:

      Major comments:

      1. Salmonella, especially when cultured to activate the SPI-1 T3SS, induce rapid cell death in their host, most commonly through activation of the NLRC4 inflammasome and downstream pyroptotic signaling. The authors don't describe the effect of the infection in differently supplemented media on host cell death, yet it would be important to elucidate whether this cellular response is also altered.
      2. The aspect of partially T3SS-independent growth enhancement by glycerol (and depending on the host background glucose) is most curious. The authors quantify this by determining the percentage of cells containing proliferating Salmonella and by tracking individual cells over the time course of the infection. I am missing a general statement on whether the initial infection rate (i.e. timepoint 0) is comparable across conditions and mutants, and whether possible discrepancies in the infection rate could have downstream effects on the statements and claims made in the manuscript. This is, to my mind, also important for the quantification of cytosolic and vacuolar bacteria. There, the authors always speak in "percent of infected cells", so it is relevant whether the number of infected cells varies among conditions (see e.g. Figure 3).
      3. The authors use a concentration of 10mM for all supplemented alternative carbon sources. It would be useful to discuss the rationale behind this approach, including whether all chemicals have the same ability to be taken up by the cell. A concentration series (at least for some of the tested compounds) may be beneficial to bolster the conclusions that the authors make.
      4. I think it would strengthen the study if the authors used host cell mutants in certain metabolite transporters, or alternatively Salmonella mutants that are deficient in uptake or metabolism of some of the compounds used in this study. This point is alluded to in the discussion, and I believe if the authors could show that in certain host mutant backgrounds the impact of supplementation with alternative carbon sources can be reversed, it would immensely bolster the strength of the claims.
      5. I think it would be useful to include the meaning of this work for other intracellular pathogens in the discussion section: Do the authors believe that this phenotype is Salmonella-specific? If the pathogens are at hand, it might be interesting to infect with other intracellular bacteria, such as Shigella or Francisella to investigate if the boosting of growth by glycerol also holds true for these.

      Minor comments:

      • Line 41: The authors write "are required for", but given their findings, it might be more accurate to phrase this as "have previously been described to be required for" or "have previously been described as essential for".
      • Line 86: Is the referencing of Figure S1C correct or should it be S1A?
      • Lines 119,120: Related to what is displayed in Figure 2G: Are these differences significant?
      • Lines 126,127: What is the change for glycerol, and is the intracellular growth significantly higher compared to the control?
      • Figure 1E&F: Related to one of the major comments: Would it be possible to quantify this at timepoint 0 to ensure that the initial infection rates are the same across conditions?
      • Figure 3E,F: Why does the sum of the curves not add up to 100% (especially in the beginning)? And related to that, why do both the percentage of cytosolic and vacuolar cells grow over time? Since this infection is performed with gentamicin present, re-infection should not be possible.
      • Figure S1A: While significance testing is described in the legend, there are no indications of significance in the figure panels.
      • Figure S1B: Due to the stark discrepancies between hMDMs and RAW264.7, it might make sense to plot them on two different y-axes. Furthermore, I would clarify the y-axis: In the legend, it seems as if CFU counts are shown, while CFU/ml/t2 rather describes a change over time.
      • Figure S1C: The prgH-mutant seems to outperform the wildtype in intracellular proliferation, while the double mutant underperforms compared to the ssaV-mutant. Could you please discuss / explain how the prgH-deletion has seemingly opposite effects on intracellular proliferation, depending on whether it is introduced in a wildtype or ssaV-KO background?
      • Figure S2A: As for the comments related to Figure 3, I am unsure how the sum of STvac and STcyt can deviate from 100. This is especially puzzling for the red curve (glycerol) at e.g. 3hpi, when the sum of the two clearly seems to be larger than 100.

      Cross-commenting

      I agree in principle with the comments raised by Reviewer #1 - especially when it comes to the enhancement in significance if the authors assess the species specificity. Elucidating whether the growth enhancement is Salmonella-specific, occurs for other intracellular pathogens (e.g. Shigella, Francisella) or also for extracellular bacteria (e.g. E. coli, Yersinia), would definitely strengthen the study.

      Significance

      General assessment:

      As the authors write in their discussion, the strength of this study is also its limitation: Using single cell tracking in microscopy is a very elegant and powerful approach, yet conversely, it limits the scope of the study to in vitro approaches. While it enables assessment of bacterial pathogenicity and host-dependence on a single-cell level, it remains to be investigated whether the conclusions that the authors draw from their work will hold in more complex or physiologically relevant models.

      A further small shortcoming of the study is the heavy focus on the bacterial aspect in this host-pathogen interaction. While the authors do link the proliferative potential of the intracellular bacteria to the metabolic status of the individual host cell, more could be done with respect to host responses in the varying media compositions, including investigating alterations to the cell cycle, induction of cell death, or the ability to activate inflammatory signaling.

      Nonetheless, this study is of large interest to the field and the systematic approach to addressing their hypotheses speaks to the scientific excellence of the investigators.

      Advance:

      The advance this study makes is rather on the foundational than the applied side - which does not mean that conclusions drawn in this work are not of interest to a wider field. By investigating the intracellular lifestyle on a single-cell level, the authors were able to observe a striking and curious phenotype: that certain alternative carbon sources can enhance intracellular proliferation in a T3SS-independent manner. By further dissecting the reason for this observation, they create a stronger base for their conclusion in what can be described as an overall comprehensive study.

      Audience:

      As outlined in the description of the main advances, this study will be of largest interest to members of the basic research community in host-pathogen interactions. While the study so far focuses on Salmonella, a well-described and genetically accessible intracellular model pathogen, it could also be of interest to a broader community of researchers investigating bacterial pathogenicity, as well as those that are interested in the host metabolism.

      Describe your expertise:

      I have a background in bacterial pathogenicity in Salmonella infection, and have since expanded to other pathogens, as well as co-infections with viruses. In addition to investigating the pathogens, I have expertise in dissecting the host response, with a focus on innate immunity, inflammasome activation and host cell death. Overall, I am accustomed to unbiased screening approaches, which are followed by the formulation and assessment of hypotheses to unravel the molecular mechanisms underlying the host-pathogen interface.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, the authors investigate how host cell metabolic heterogeneity influences the intracellular replication of Salmonella enterica serovar Typhimurium. They use live-cell imaging of infected human primary macrophages to reveal that bacterial replication does not occur uniformly across infected cells. They demonstrate that supplementation with specific carbon sources (used by Salmonella during infection) promotes bacterial replication and increases the proportion of macrophages supporting intracellular growth. These effects are seen even in the absence of functional Type III Secretion Systems (T3SS), using a ΔprgH/ΔssaV double mutant. The authors further suggest that this replication enhancement is not strictly dependent on host glycolytic activity but rather on the host cell's ability to import nutrients. Their findings imply that intracellular Salmonella can exploit host cell metabolism to grow, even without its canonical virulence secretion systems, under nutrient-favorable conditions.

      Major Concern:

      While the topic is potentially interesting, the novelty is not fully clear. The concept that nutrient availability impacts intracellular Salmonella replication, largely via T3SS2 function, has been addressed previously (e.g., Liss et al., 2017). The finding that added exogenous carbon sources can enhance bacterial growth is thus not unexpected. The key claim, that Salmonella can replicate intracellularly even in the absence of T3SS function, would be significantly strengthened by demonstrating whether this is specific to Salmonella, or whether similar effects are seen with non-intracellular organisms such as E. coli K-12. If the phenomenon is unique to Salmonella, this would suggest a pathogen-specific mechanism beyond general metabolic support.

      Specific Comments:

      1. Figure 1H: The effect shown here is not compelling due to inconsistent y-axis scaling. Panels 1B, 1C, and 1D should use a unified axis range with 1H to allow direct visual comparison of growth dynamics.
      2. Figures 1B, 1C, 1G, 1H: The current presentation of individual growth traces makes it difficult to appreciate the population-level trend. A smoothed average line overlaid on these plots could better represent the average dynamics of replicative vs. non-replicative infections. Alternatively, the total fraction of cells that proliferate could be summarized as a segmented bar plot (possibly binned per time point).
      3. Figure 2G: This panel would benefit from including a comparable condition with the SPI-1/SPI-2 double mutant to aid interpretation. Additionally, the authors should explore whether this nutrient-supported replication is seen in non-phagocytic cells such as HeLa or Caco-2, which would help delineate whether the observed phenomenon is macrophage-specific.
      4. Line 117: The sentence stating that the double mutant can undergo "exponential intracellular growth even in the absence of T3SS-dependent secretion" is an overstatement. The data suggest only a modest improvement in growth, restricted to a minority of infected cells. This claim should be revised accordingly, as should similar overstatements in the discussion (e.g., lines 203-204).
      5. Line 162: The authors should clarify that glycerol had the strongest effect in primary macrophages, while multiple alternative carbon sources had notable effects primarily in RAW cells.
      6. Lines 198-201: This relates to the major concern. The authors should assess whether the observed growth enhancement is unique to Salmonella by testing other bacteria not known for intracellular replication. This would clarify whether the effect is due to general nutrient-driven host cell permissivity or a pathogen-specific adaptation.
      7. RAW 264.7 Observations: The modest intracellular growth of SPI-1/SPI-2 double mutants in RAW cells is consistent with prior observations in the field. The idea that nutrient availability explains this is noteworthy. The authors might consider whether differences in standard culture media (e.g., glucose concentration) influence these outcomes. This could have broader implications for reproducibility in infection models.

      Significance

      This manuscript highlights how host cell metabolism and nutrient availability can influence intracellular Salmonella replication. While the findings are intriguing, the current framing overstates their novelty and impact. Key revisions (such as comparative experiments with non-pathogenic bacteria and non-phagocytic cells, consistent figure scaling, and more measured language) would improve the clarity and significance of the work. If the authors can show Salmonella-specific mechanisms at play, the study could offer important insights into host-pathogen metabolic interactions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the Reviewers for their kind and constructive comments. We are happy to read that the reviewers found our study methodologically robust and comprehensive in addressing the metabolic heterogeneity of endothelial cells.

      Reviewer 1, comment 1: Image quality in sprouting assays - The images presented for the sprouting assays (e.g., Figure 4) are of suboptimal resolution and quality, making it difficult to evaluate the effects of the various compounds on EC behavior. Even under control conditions, clear sprout-like structures are not readily discernible. Improved image resolution (preferably through high-quality bright-field microscopy) and the inclusion of immunofluorescence images of labeled endothelial spheroids are recommended to enhance interpretability.

      Response: We appreciate the reviewer’s concern and have revisited the sprouting assay images. Our approach is consistent with established methods in the field (Heiss et al., FASEB J, 2015), where brightfield imaging is routinely used for quantification without additional immunostaining. Hence, we believe that the brightfield images are of sufficient resolution to allow reproducible quantification of normalized total sprout length. All experiments were performed under identical imaging and analysis protocols, and thus we are confident that the quantification reflects true biological differences. We cite the reference in the revised manuscript and clarify it as well in the Methods section.

      Reviewer 1, comment 2: Validation of the quiescence model - The current approach to induce quiescence should be further substantiated. Beyond proliferation markers, additional hallmarks of quiescent cells (such as epigenetic signatures, protein quality control mechanisms, and translational activity) should be assessed to confirm that the EC subtypes achieve a bona fide resting state.

      Response: We acknowledge the value of proper phenotyping of quiescent cells. However, most studies involving quiescent (endothelial) cells rely on EdU incorporation or similar proliferation markers to confirm entry into a non-proliferative state (Kalucka et al., Cell Metabolism, 2018; Coloff et al., Cell Metabolism, 2016). In our study, we have used EdU staining and FACS analysis to establish cell cycle arrest. Moreover, we find clear proteomic patterns that support the case of a quiescent state. We have also demonstrated the reversibility of quiescence (see Suppl. Fig. 1c) via reseeding and proliferation recovery of all EC types, which is a defining functional hallmark of true quiescence. Together, the EdU, proteomic and reseeding/proliferation data provide strong evidence that our EC subtypes reach a physiologically quiescent, non-senescent state.

      Reviewer 1, comment 3: Reversibility of quiescence - It is important to demonstrate that the EC subtypes investigated can re-enter the cell cycle following release from contact inhibition. Without such evidence, the possibility remains that some of the observed metabolic features reflect a transition to senescence rather than reversible quiescence.

      Response: This is an excellent suggestion. We have included new data that shows that ECs regain proliferative capacity upon reseeding of quiescent ECs at lower confluency (Suppl. Fig. 1c). The results support the interpretation that the observed metabolic features reflect reversible quiescence rather than senescence.

      Reviewer 1, comment 4: Assessment of cell viability - While EC proliferation, migration, and sprouting were examined to infer functional roles of metabolic adaptations, analyses of cell viability and death are also necessary to evaluate potential homeostatic or survival-related functions of the observed metabolic changes.

      Response: We appreciate the Reviewer’s concern about cell viability in our experimental setup, and we agree that viability assessment is important. Using trypan blue staining and automated cell counting, we observed that >85% of ECs remained viable from day 1 through day 10 of the quiescence model and included these results in the manuscript (Suppl. Fig. 1b).

      Reviewer 1, comment 5: Validation of pharmacological findings - The pharmacological inhibition experiments are informative and constitute a central part of the study. However, given the possibility of off-target effects, key conclusions should be corroborated using alternative loss-of-function approaches, such as RNA interference (e.g., shRNA or siRNA).

      Response: We recognize the possibility of side effects for pharmacological inhibitors, but the inhibitors, including the ones that show the most divergent effects in HUVECs and iLECs in our study (R162 and succinyl acetone), are well-established, selective inhibitors of glutamate dehydrogenase (Wang et al., Pharmacological Research, 2022) and δ-aminolevulinic acid dehydratase (Nauli et al., J Clin. Biochem. Nutr., 2023), respectively, and have not been reported to exhibit significant off-target activity in endothelial cells. Furthermore, the aim of our study was not to define specific mechanistic pathways, but to highlight phenotype-specific metabolic vulnerabilities in distinct endothelial states. Performing knockdown experiments would go beyond the scope and focus of this manuscript and introduce their own limitations, including off-target effects and, most importantly, timing mismatches relative to our long-term assays (e.g., sprouting assays assessed at day 3 versus transient RNAi effects lasting for only 1-2 days). We hope the Reviewer agrees that our current approach sufficiently supports the study’s conclusions.

      Reviewer 2, comment 1: it was not clear whether the authors worked with single donor endothelial cells or with mixed donors. This should be clarified as it is important for the statistical analyses (single donor based EC research typically uses n=4, while for the mixed donor, an n=3 is sufficient).

      Response: We thank Reviewer 2 for highlighting that this information was missing from the Methods section; we have added it in the revised manuscript. HDBECs, HDLECs and iLECs are from single donors; HUVECs are from mixed donors. We acknowledge the reviewer’s concern about the power of statistical analyses, but we think that n=3 is sufficient with proper correction for statistical tests. Furthermore, previous in vitro studies with ECs were done with single donor cells and in biological triplicates (Wong et al., 2017; Kalucka et al., 2018; Simões-Faria et al., 2025 and more). Moreover, for sprouting assays, we have n > 3 for most conditions.

      Reviewer 2, comment 2: I would like to see a sentence on the importance of shear stress in EC behavior (metabolism) in the introduction. It was recently shown that the in vivo situation of ECs encountering wall shear stress (Faria et al, PMID: 39832080) affects the metabolic behavior switching to glutamine metabolism. This aligns with the research of the authors as well.

      Response: We thank Reviewer 2 for drawing our attention to this relevant and interesting study. We mention the study in the introduction and the discussion.

      Reviewer 2, comment 3: suggestion for the authors: it could be useful if a figure is introduced to show the "physiological" location of the 4 EC used and that a rationale is provided for this.

      Response: We have included this in Supplementary Figure 1 and in the text.

      Reviewer 2, comment 4: figures are of low quality, I found it very difficult to see the spheroid/sprouting images. This should be addressed in the final version prior to publication.

      Response: The new version has higher quality sprouting images in Figures 4 and 5. The images can also be found in high quality on BioStudies (Accession: S-BSST1716).

      Reviewer 2, comment 5: Fig 2 c: I'm not sure if this panel is very relevant, when looking into detail, opposite pathways are present (glycolysis - gluconeogenesis). As well, I'm not sure if galactose metabolism is truly relevant, unless the author managed to measure distinct hexose and hexose-phosphates? Given the flow injection analysis setup, I doubt this. Would suggest to move this to supplement or to simply leave it out.

      Response: The Reviewer is correct; the employed analytics cannot distinguish different hexoses and hexose-phosphates. We have moved figure 2c to supplementary figure 4c.

      Reviewer 2, comment 6: Fig 3 b: was there any statistics performed on these data to compare the different setups?

      Response: We performed statistical analyses on this data and included it in the figures and figure legends.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      By employing a proteomics and metabolomics approach, the authors clarified the molecular landscape of 4 major EC types in quiescent and proliferating conditions. The study is extensive and adds novelty to EC research.

      Major comments:

      • it was not clear whether the authors worked with single donor endothelial cells or with mixed donors. This should be clarified as it is important for the statistical analyses (single donor based EC research typically uses n=4, while for the mixed donor, an n=3 is sufficient).
      • I would like to see a sentence on the importance of shear stress in EC behavior (metabolism) in the introduction. It was recently shown that the in vivo situation of ECs encountering wall shear stress (Faria et al, PMID: 39832080) affects the metabolic behavior switching to glutamine metabolism. This aligns with the research of the authors as well.
      • suggestion for the authors: it could be useful if a figure is introduced to show the "physiological" location of the 4 EC used and that a rationale is provided for this.
      • figures are of low quality, I found it very difficult to see the spheroid/sprouting images. This should be addressed in the final version prior to publication.
      • Fig 2 c: I'm not sure if this panel is very relevant, when looking into detail, opposite pathways are present (glycolysis - gluconeogenesis). As well, I'm not sure if galactose metabolism is truly relevant, unless the author managed to measure distinct hexose and hexose-phosphates? Given the flow injection analysis setup, I doubt this. Would suggest to move this to supplement or to simply leave it out.
      • Fig 3 b: was there any statistics performed on these data to compare the different setups?

      Significance

      The study adds insights to the ongoing research on EC molecular behavior.

      Using different types of ECs in both quiescent and proliferating mode, as well as the validation of pathways by introducing inhibitors combined with the sprouting assays, is an asset.

      I would like to see the biological complexity of ECs stated; it was recently shown that shear stress plays an important role in EC metabolism.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The study by Durot and colleagues explores the metabolic heterogeneity of endothelial cells (ECs) across distinct subtypes (blood vs. lymphatic) and growth states (proliferating vs. quiescent). Through integrated proteomic and metabolomic analyses, the authors demonstrate that quiescent ECs are not metabolically inactive but instead undergo subtype-specific metabolic reprogramming. Functional perturbation of key metabolic pathways using chemical inhibitors results in differential phenotypic responses in blood versus lymphatic ECs. Collectively, the findings underscore a critical, context-dependent role of metabolism in maintaining EC function and highlight metabolic specialization as a fundamental feature of endothelial diversity.

      General Comments:

      This manuscript presents a comprehensive and methodologically robust investigation into the metabolic diversity of cultured ECs. By combining proteomic and metabolomic approaches, the authors provide novel insights into the distinct metabolic profiles of blood and lymphatic ECs, and how these profiles shift as ECs transition from a proliferative to a quiescent state. The observation that quiescent ECs exhibit active metabolic reprogramming, rather than simply entering a dormant state, is particularly compelling and challenges existing models of cellular quiescence.

      The work is timely, well-written and addresses a significant gap in our understanding of endothelial metabolism. The integration of large-scale omics data with functional perturbation experiments strengthens the overall conclusions and enhances the impact of the study.

      Nevertheless, while the data are largely convincing, certain experimental aspects (particularly those related to the in vitro sprouting assays) require further validation to solidify the mechanistic interpretations. Additionally, some findings would benefit from further validation using alternative approaches (e.g., chemical perturbation studies).

      Specific Comments:

      1. Image quality in sprouting assays - The images presented for the sprouting assays (e.g., Figure 4) are of suboptimal resolution and quality, making it difficult to evaluate the effects of the various compounds on EC behavior. Even under control conditions, clear sprout-like structures are not readily discernible. Improved image resolution (preferably through high-quality bright-field microscopy) and the inclusion of immunofluorescence images of labeled endothelial spheroids are recommended to enhance interpretability.
      2. Validation of the quiescence model - The current approach to induce quiescence should be further substantiated. Beyond proliferation markers, additional hallmarks of quiescent cells (such as epigenetic signatures, protein quality control mechanisms, and translational activity) should be assessed to confirm that the EC subtypes achieve a bona fide resting state.
      3. Reversibility of quiescence - It is important to demonstrate that the EC subtypes investigated can re-enter the cell cycle following release from contact inhibition. Without such evidence, the possibility remains that some of the observed metabolic features reflect a transition to senescence rather than reversible quiescence.
      4. Assessment of cell viability - While EC proliferation, migration, and sprouting were examined to infer functional roles of metabolic adaptations, analyses of cell viability and death are also necessary to evaluate potential homeostatic or survival-related functions of the observed metabolic changes.
      5. Validation of pharmacological findings - The pharmacological inhibition experiments are informative and constitute a central part of the study. However, given the possibility of off-target effects, key conclusions should be corroborated using alternative loss-of-function approaches, such as RNA interference (e.g., shRNA or siRNA).

      Significance

      In summary, this manuscript makes a substantial contribution to the field and is likely to stimulate further research into endothelial metabolic regulation. With additional experimental validation, the study has the potential to serve as a reference in both vascular and metabolic research.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this paper, the GFP-GBP system for mistargeting protein localization was used in fission yeast cells to discover new protein interactions involved in vesicular trafficking during cytokinesis. This approach uncovered a new association between the F-BAR protein Rga7 and its binding partner Rng10 with the Munc13 protein Ync13 at the cell division site. Additional associations were observed between Rga7-Rng10, Ync13 and the glucan synthases Ags1 and Bgs4, and the vesicle fusion protein Sec1. These interactions identified by the GFP-GBP system were further supported by co-immunoprecipitation experiments and by defining localization dependencies with live cell imaging in a variety of mutant strains. The imaging data are all of high quality and for the most part support the conclusions. However, in my opinion some of the interpretations are overstated, and the manuscript would benefit from providing additional mechanistic information. Major and minor recommendations are outlined below.

      Major suggestions

      1. The co-IP data are interpreted to suggest that all the above-mentioned proteins form a single "big complex." However, as noted in the manuscript and reflected in the model, the multipass integral membrane proteins Bgs4 and Ags1 are embedded in the vesicle membrane and likely only indirectly associate with the scaffold Rga7-Rng10 via Ync13, without forming a 'complex'. One would expect the entirety of these vesicle contents to co-IP if the model is correct. The first paragraph of page 11 should be revised to more clearly reflect this scenario and to align with the proposed model.

      Response: We thank the reviewer for this thoughtful clarification. In the original manuscript, we stated “…indicating these proteins do interact or form big protein complexes… These results suggest that Rga7, Rng10, and Ync13 form a protein complex.” We agree that our initial wording may have unintentionally implied that all proteins detected in co-IP experiments assemble into a single, large physical complex. As the reviewer correctly noticed, the multipass integral membrane proteins Bgs4 and Ags1 are embedded within vesicle membranes and are more likely to associate indirectly with the Rga7-Rng10-Ync13 complex, rather than being part of one unified protein complex. To avoid overinterpretation, we have modified the last sentence of the first paragraph on the original page 11 as below: “These results suggest that Rga7, Rng10, and Ync13 do form a protein complex, although maybe dynamic and not super stable (see Discussion). Our data indicate that Rga7 interacts with both Ync13 and Rng10 to form a module on the plasma membrane for targeting of the vesicles containing cargos such as glucan synthases Bgs4 and Ags1. However, these glucan synthases are multipass integral membrane embedded proteins and likely only indirectly associate with the module Rng10-Rga7-Ync13, without forming a big protein complex.”

      Can Ync13 be artificially directed or tethered to the division site independently of Rga7-Rng10 (e.g., via Imp2)? If so, can this rescue the phenotypes of rga7Δ cells? This experiment could clarify whether Ync13 is the key functional effector of the Rga7-Rng10 complex.

      Response: We thank the reviewer for suggesting this interesting experiment. We agree that testing whether correctly localized Ync13 is sufficient to execute the division-site function of the Rga7–Rng10 complex would clarify its role. To test this, we artificially targeted Ync13 to the division site independently of Rga7 by tethering it to the scaffold protein Pmo25. Pmo25, an MO25 family protein, localizes to both the plasma membrane at the division site and the spindle pole body (SPB; mainly one of the two SPBs) during mitosis and cytokinesis, enabling us to mislocalize Ync13 to these structures through the GFP–GBP system. We did not use Imp2 because its localization pattern (mainly to the contractile ring [1, 2]) is different from that of Ync13. Microscopy revealed robust localization of Ync13 at the division site and the SPB in rga7Δ cells, and this tethered Ync13 persisted along the cleavage furrow throughout ring constriction. Importantly, enforced division-site localization of Ync13 significantly rescued the cytokinesis defects and cell lysis of rga7Δ. Consistently, growth assays on Phloxin B (PB) plates showed that the elevated lysis/death in rga7Δ cells was rescued by tethering Ync13 to Pmo25-GBP. Together, these findings support that Ync13 is a key functional effector acting downstream of the Rga7–Rng10 scaffold at the division site. We have added these results in the new Figure 6 and associated text in the revised manuscript. We have also updated the model in Figure 8 to reflect this new result.

      1. Demeter J, Sazer S. imp2, a new component of the actin ring in the fission yeast Schizosaccharomyces pombe. J Cell Biol. 1998;143(2):415-27. PubMed PMID: 9786952.
      2. Martin-Garcia R, Coll PM, Perez P. F-BAR domain protein Rga7 collaborates with Cdc15 and Imp2 to ensure proper cytokinesis in fission yeast. J Cell Sci. 2014;127(Pt 19):4146-58. Epub 2014/07/24. doi: 10.1242/jcs.146233. PubMed PMID: 25052092.
      3. The authors should consider structural or computational modeling of the proposed Rga7-Rng10-Ync13 complex. Such analysis could offer insight into how these components interact and strengthen the proposed model.

      Response: We thank the reviewer for this valuable suggestion. Following the recommendation, we performed structural modeling of the Rga7–Rng10–Ync13 complex using AlphaFold3. Our previous work demonstrated that the F-BAR protein Rga7 forms a stable dimer and that its F-BAR domain binds the C-terminal region (aa751–1038) of Rng10 [3]. Based on these findings, we constructed an input model consisting of two full-length Rga7 subunits, two Rng10(751–1038) subunits, and one full-length Ync13. The predicted structure revealed a modular organization in which Rng10(751–1038) associated strongly with the F-BAR domain of the Rga7 dimer, consistent with our prior biochemical data [3]. In addition, the model suggested that Ync13 interacted with the GAP domain of Rga7, positioning Ync13 in close proximity to the Rga7–Rng10 interface (Fig. S5, A, B, D and F). Further domain-specific predictions supported the interactions between the Rga7 GAP domain and the Ync13 N-terminus (pTM: 0.63, ipTM: 0.64), between two Rga7 F-BARs (pTM: 0.74, ipTM: 0.71), and between the Rga7 F-BAR and Rng10(751–1038) (pTM: 0.56, ipTM: 0.78) (Fig. S5, C-F). Overlay analyses revealed that the interacting domains align well with the structure of the whole complex, as the root mean square deviations (RMSDs) are

      3. Liu Y, McDonald NA, Naegele SM, Gould KL, Wu J-Q. The F-BAR domain of Rga7 relies on a cooperative mechanism of membrane binding with a partner protein during fission yeast cytokinesis. Cell Rep. 2019;26(10):2540-8.e4. doi: 10.1016/j.celrep.2019.01.112. PubMed PMID: 30840879; PubMed Central PMCID: PMCPMC6425953.

      Minor text edits 1. Define "SIN" in the discussion section for clarity.

      Response: We defined the SIN pathway in the Discussion section as suggested: “At low restrictive temperatures, the lethality of mutant sid2, the most downstream kinase in the Septation Initiation Network, is partially rescued by upregulating Rho1. Thus, it has been suggested that the Septation Initiation Network activates Rho1, which in turn activates the glucan synthases [4].”

      4. Alcaide-Gavilán M, Lahoz A, Daga RR, Jimenez J. Feedback regulation of SIN by Etd1 and Rho1 in fission yeast. Genetics. 2014;196(2):455-70. Epub 2013/12/18. doi: 10.1534/genetics.113.155218. PubMed PMID: 24336750; PubMed Central PMCID: PMCPMC3914619.

      Figure S3, the protein schematics should start at residue "1" and not "0".

      Response: We apologize for the mistake. The schematics in revised figure (now Figure S4A) have been corrected to start at residue 1.

      Mass spectrometry data referenced in the text are not provided in the manuscript.

      Response: We apologize for the omission. The mass spectrometry data are now shown in Table S1.

      In Figure 4A. The Ags1 rim localization does not appear decreased as the authors claim.

      Response: After examining the data again, we agree with the reviewer’s assessment. We have therefore reworded the sentence as follows: “We also found that in ync13Δ cells, the Bgs4 intensity at the rim of the septum was much lower than in WT after ring constriction (Fig. 4B).”


      On page 13: "both Rga7 and Rng10 can mistarget Trs120 to mitochondria."

      Response: Thank you. The typo “mistargeting” has been corrected to “mistarget”.

      Minor figure edits 1. Consider inverting single-channel images to display fluorescence on a white background, which would improve visual clarity.

      Response: We appreciate the reviewer’s suggestion. However, we have chosen to retain the original display format, with fluorescence shown on a black background, to be consistent with our (and some others’) previous publications. We believe this format preserves clarity while allowing easier comparison with previously published work.

      The Figure 1 legend should describe the experimental setup rather than restating conclusions.

      Response: We thank the reviewer for this helpful suggestion. The Figure 1 legend has been revised to describe the experimental setup and imaging conditions rather than summarizing conclusions, as follows:

      Fig. 1. Physical interactions among the key cytokinetic proteins in plasma membrane deposition and septum formation revealed by ectopic mistargeting to mitochondria by Tom20-GBP. Arrowheads mark examples of colocalization at mitochondria. (A) Ync13 colocalizes with Rga7 and Rng10 at cell tips and the division site. (B-F) Tom20-GBP can ectopically mistarget Rga7/Rng10-mEGFP and their interacting partners tagged with tdTomato/RFP/mCherry to mitochondria. Tom20–GBP was used to recruit mEGFP-tagged Rga7 or Rng10 to mitochondria, and colocalization was assessed with tdTomato/RFP/mCherry-tagged candidate binding partners. Cells were grown at 25ºC in YE5S + 1.2 M sorbitol medium for ~36 to 48 h and then were washed with YE5S without sorbitol and grown in YE5S for 4 h before imaging. (B) Rga7/Rng10-Ync13. (C) Rga7/Rng10-Trs120. (D) Rga7/Rng10-Bgs4. (E) Rga7/Rng10-Ags1. (F) Rga7-Smi1. Bars, 5 μm.

      Reduce the number of arrows indicating co-localization in microscopy images; highlighting 1-2 representative examples is sufficient and less visually cluttered.

      Response: We appreciate the reviewer’s suggestion. We have revised the micrographs to reduce the number of arrowheads, highlighting several representative examples of co-localization per image. This improves clarity and reduces visual clutter while still guiding the reader to the key observations.

      Figure 3F, the scale bar is listed as 5 μm in the legend but it appears to my eye to be 2 μm.

      Response: We thank the reviewer for noticing this error. After rechecking the original imaging data, we have added a new 5 μm scale bar.

      The orientation of Bgs4/Smi1 should be inverted in the schematic within vesicles so that Smi1 is always on the cytoplasmic side.

      Response: We thank the reviewer for pointing out this error. The schematic has been corrected so that Bgs4 and Smi1 are oriented appropriately, with Smi1 consistently placed on the cytoplasmic side of vesicles because it does not have a transmembrane domain. The revised schematic is included in the updated Figure 8.

      6. Also in the schematic, Mid1 is not at the constricting CR and therefore needs to be removed.

      Response: Thank you for the suggestion. Mid1 has been removed from the model figure.

      Reviewer #1 (Significance (Required)): From the data presented in the manuscript, it is proposed that Rga7 and Rng10 form a scaffold at the division site for delivery of exocytic vesicles marked by the TRAPPII complex but not the exocyst complex. Further, it is proposed that these vesicles specifically deliver the glucan synthases necessary for septation. Overall, this study builds on previous work from the Wu lab to clarify how the TRAPPII-decorated vesicles are specifically delivered to the cell division site, adding some new information about vesicle trafficking regulation during cytokinesis. It also provides new insight into the role of an F-BAR scaffold protein.

      This paper will be of interest to those studying cytokinesis and also those studying mechanisms of intracellular trafficking.

      Reviewer expertise: Cell division, signaling, membrane biology

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This paper provides a comprehensive analysis of the roles of Rng10, Rga7, and Ync13 in cytokinesis using fission yeast as a model system. The authors demonstrate that Ync13/Rga7/Rng10 not only interact with each other but also associate with components of glucan synthases, which are essential for secondary septum formation but not for the primary septum. They further show that Ync13 is involved in exocytosis through its interaction with Sec1 and plays a role in membrane trafficking via interaction with the TRAPP-II complex. Collectively, their findings reveal a coordinated mechanism that ensures the timely formation of the secondary septum during cytokinesis, as deletion of these proteins disrupts septum formation and leads to cell lysis.

      The conclusions drawn in this paper are well-supported by the data, with a clear methodology and robust statistical analyses that enhance reproducibility. However, I have the following major and minor comments:

      Major Comments - 1) The authors propose that Ync13, Rng10, and Rga7 interact to form a protein complex, supported by their mislocalization studies. While these findings are suggestive, additional co-immunoprecipitation (co-IP) data specifically demonstrating a direct interaction between Ync13 and Rng10 would strengthen the claim.

      Response: We thank the reviewer for this suggestion. The direct interaction between Rga7 and Rng10 has already been established and published by our group [3, 5]. Here we found that Rga7 and Ync13 interact directly in an in vitro binding assay (Figure 2, D and E). While our current data do not suggest a direct physical interaction between Ync13 and Rng10, our mislocalization results and other data do provide strong support for their functional association. In particular, ectopic tethering of Ync13 to mitochondria recruits Rng10 to the same sites and vice versa (Figures 1B and S2A). Additionally, division-site tethering of Ync13 by Pmo25-GBP rescues both the growth and cell-lysis phenotypes of rga7Δ (Figure 6), consistent with the idea that Ync13 functions downstream of Rga7-Rng10 because Rga7 localization depends on Rng10 (Figure 8). Furthermore, our AlphaFold3 modeling predicts that Rng10 binds the BAR domain of Rga7, whereas Ync13 binds the GAP domain of Rga7, suggesting that Rng10 and Ync13 are positioned within the same complex through Rga7 without direct interaction (Figure S5).

      The predicted lack of direct interaction between Ync13 and Rng10(751–1038) is supported by the experiment described below in response to the minor comment from Reviewer 3. We tested the mistargeting of mECitrine-Rng10(751–1038) in rga7Δ tom20-GBP cells and found that Ync13-tdTomato could not be recruited to mitochondria (Figure S4H). This indicates that Ync13 cannot interact with Rng10(751–1038) independently of Rga7, supporting our proposed model that Rga7 interacts with Rng10 through the BAR domain and with Ync13 through the GAP domain. We have added these clarifications to the revised manuscript (Results and Discussion) to better contextualize the evidence for the Rga7–Rng10–Ync13 assembly.


      3. Liu Y, McDonald NA, Naegele SM, Gould KL, Wu J-Q. The F-BAR Domain of Rga7 Relies on a Cooperative Mechanism of Membrane Binding with a Partner Protein during Fission Yeast Cytokinesis. Cell Rep. 2019;26(10):2540-8.e4. doi: 10.1016/j.celrep.2019.01.112. PubMed PMID: 30840879; PubMed Central PMCID: PMCPMC6425953.
      5. Liu Y, Lee I-J, Sun M, Lower CA, Runge KW, Ma J, et al. Roles of the novel coiled-coil protein Rng10 in septum formation during fission yeast cytokinesis. Mol Biol Cell. 2016;27(16):2528-41. Epub 2016/07/08. doi: 10.1091/mbc.E16-03-0156. PubMed PMID: 27385337; PubMed Central PMCID: PMCPMC4985255.

      2) It remains unclear whether Ync13 directly interacts with components of the glucan synthase complex (Bgs4/Ags1), or if this association is mediated through other factors (Rng10, Rga7). Clarifying the nature of this interaction would significantly enhance the mechanistic insight.

      Response: We thank the reviewer for this thoughtful clarification. As pointed out by Reviewer 1 in major comment 1, the multipass integral membrane proteins Bgs4 and Ags1 are embedded within vesicle membranes and are more likely to associate indirectly with the Rga7–Rng10–Ync13 complex rather than being part of one unified protein complex, although Rga7 co-IPs with Bgs4 and its binding partner Smi1 (Figure 1, A-C). We would like to make it clear that neither our model nor our manuscript claims direct interactions between the Ync13-Rga7-Rng10 module and the glucan synthase complexes; rather, we suggest that the module aids in the selection of vesicle targeting sites on the plasma membrane. To clarify, we have revised the text to state more clearly that our co-IP and in vitro binding results demonstrate that Rga7 physically associates with Ync13 and Rng10, and that vesicle-associated proteins such as Bgs4 and Ags1 are likely recruited through indirect interactions.

      Minor comments: 1) The manuscript refers to mass spectrometry-based interaction data, but the corresponding dataset is not included. Providing this would enhance transparency and reproducibility.

      Response: We apologize for the omission. The mass spectrometry data are now shown in Table S1.

      2) In Figure 2D, the MBP-6x pull-down lane shows a faint band around 76 kDa. The authors should clarify what this band represents and whether it has any relevance to the study.

      Response: We thank the reviewer for noticing this faint band. The weak ~76 kDa band in the MBP-6x pull-down lane is non-specific background binding of MBP and Rga7. We added a note in the figure legend to clarify this point.


      3) A quantification graph corresponding to the data in Figure 3G would aid in better interpreting the results and assessing their significance.

      Response: We thank the reviewer for this suggestion. We have now added two quantification graphs corresponding to Figure 3G, showing the measured Rng10 signal intensities across the division site. Statistical analysis shows that the full width at half maximum (FWHM) is significantly different between WT and ync13Δ cells, and the figure legend and text have been updated accordingly in the revised manuscript.

      4) Figure 4D appears to be missing time legends, which are essential for interpreting the dynamics of the experiment.

      Response: We thank the reviewer for noticing this. We apologize for the confusing statement in the figure legend. We would like to clarify that the full width at half maximum (FWHM) was calculated from line scans using single time point images from cells at the end of contractile-ring constriction. Those line scans were fitted with a Gaussian function to calculate the mean and standard deviation of the FWHM. We have updated the figure legend to make this clearer in the revised manuscript.
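
      For readers unfamiliar with the measurement, the relationship between a Gaussian fit and the FWHM can be sketched as follows. This is an illustrative sketch only, not the analysis pipeline used for the manuscript: the function names (`det3`, `fit_gaussian`, `fwhm_from_sigma`) are hypothetical, and the fit assumes a background-subtracted line scan with strictly positive intensities, fitted by unweighted least squares on the logarithm of the intensity (which is a parabola in position for a Gaussian profile). For a Gaussian with standard deviation sigma, FWHM = 2·sqrt(2·ln 2)·sigma.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_gaussian(xs, ys):
    """Fit y = A * exp(-(x - mu)^2 / (2 * sigma^2)) and return (mu, sigma).

    Assumption (not from the manuscript): intensities are background-subtracted
    and strictly positive, so ln(y) = a + b*x + c*x^2 can be fitted by ordinary
    least squares. This simple log fit is sensitive to noise in the tails.
    """
    logs = [math.log(y) for y in ys]
    # Power sums of x (orders 0..4) and of x^k * ln(y) (orders 0..2)
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * ly for x, ly in zip(xs, logs)) for k in range(3)]
    # Normal equations for the quadratic coefficients (a, b, c)
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = det3(m)
    # Cramer's rule: replace the column of the unknown with the right-hand side
    mb = [[s[0], t[0], s[2]], [s[1], t[1], s[3]], [s[2], t[2], s[4]]]
    mc = [[s[0], s[1], t[0]], [s[1], s[2], t[1]], [s[2], s[3], t[2]]]
    b = det3(mb) / d
    c = det3(mc) / d
    if c >= 0:
        raise ValueError("profile is not peak-shaped; Gaussian fit failed")
    sigma = math.sqrt(-1.0 / (2.0 * c))  # since c = -1 / (2 * sigma^2)
    mu = -b / (2.0 * c)                  # since b = mu / sigma^2
    return mu, sigma


def fwhm_from_sigma(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


if __name__ == "__main__":
    # Synthetic line scan (positions in micrometers); values are made up
    positions = [i * 0.1 for i in range(41)]
    intensities = [100.0 * math.exp(-((x - 2.0) ** 2) / (2 * 0.5 ** 2))
                   for x in positions]
    mu, sigma = fit_gaussian(positions, intensities)
    print("center:", mu, "FWHM:", fwhm_from_sigma(sigma))
```

      In practice a nonlinear least-squares fit with an explicit background term is often preferred for noisy data; the log-parabola fit is used here only because it keeps the sketch dependency-free.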

      Reviewer #2 (Significance (Required)):

      Nature and Significance of the Advance: This study provides a conceptual and mechanistic advance in understanding the spatial and temporal regulation of membrane trafficking during cytokinesis. It identifies a conserved module, Ync13-Rga7-Rng10, that directs the selective tethering and fusion of secretory vesicles at the division site, functioning independently of the exocyst complex. This finding challenges the prevailing model that the exocyst is universally required for vesicle tethering during cytokinesis. While previous work has underscored the roles of TRAPP-II and vesicle trafficking in septum formation (Wang et al., 2016; Arellano et al., 1997; Gerien and Wu, 2018), the precise mechanism targeting vesicles to the division site remained unclear. This study fills that gap by elucidating how Ync13 and Rga7 coordinate vesicle delivery and glucan synthase localization (Liu et al., 2016; Zhu et al., 2018), thereby extending our understanding of septum biogenesis and membrane remodeling beyond actomyosin ring dynamics.

      Relevant Audience: This work is relevant to:
      • Cell biologists investigating cytokinesis, membrane trafficking, or vesicle fusion.
      • Yeast geneticists interested in conserved cell division pathways.
      • Researchers focused on SNARE-mediated membrane dynamics and trafficking regulation.
      • Biomedical scientists exploring analogous processes in mammalian systems, particularly those studying cell division defects linked to disease.

      The findings have implications across both basic and translational research in cell biology and membrane dynamics.

      My Expertise: My research focuses on membrane fusion, specifically the SNARE-mediated fusion process. I study the spatio-temporal regulation of fusion events and the coordinated action of regulatory proteins in determining the structural and functional outcomes of membrane fusion. This background provides me with the framework to critically evaluate studies investigating cytokinesis and trafficking mechanisms at the molecular level.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Zhang et al. elucidate key roles of a conserved module, the Ync13-Rga7-Rng10 complex, in coordinating selective tethering, docking, and fusion of glucan synthase-containing vesicles with the plasma membrane, a process crucial for cell wall synthesis and survival of fission yeast at division. Using methods including mistargeting proteins to mitochondria, co-immunoprecipitation, in vitro binding assays, genetic and cellular methods, electron microscopy, and live-cell confocal microscopy, the authors demonstrate that this module controls a vesicle targeting pathway mediated by the TRAPP-II complex and SM protein Sec1, which ensures glucan synthases Bgs4 and Ags1 are deposited at the division site in a spatiotemporal manner.

      Major comments: The authors report aberrant accumulation of Bgs4 and Ags1 in the center of the septum after actomyosin ring constriction in ync13del cells and detect no overall defects in Bgs1 distribution there (Figure 4). When similar experiments were analyzed in this paper (https://pmc.ncbi.nlm.nih.gov/articles/PMC6249806/), Bgs1 distribution and level did change in cells lacking Ync13, although these phenotypes of Bgs1 appeared later than those of Bgs4. I wonder whether there could exist a second wave of Bgs1 arrival in ync13del cells at later time points after the ring fully constricts. Could this late recruitment of Bgs1 depend on Rga7 and Rng10, since these protein complexes are enriched in the middle of the septum of ync13del cells? Or, as the authors mentioned in the Discussion, could the Rho GTPase regulated by the Rga7 GAP also play a role in Bgs1 accumulation or fusion with the septum in the above scenario, if no obvious accumulation of vesicles is observed in ync13del cells with electron microscopy? How does Bgs1 localize in the ync13-19 rng10del double mutant?

      Response: We thank the reviewer for this insightful observation. We repeated the experiment to observe the localization of Bgs1 in WT and ync13Δ cells. We confirmed our earlier observation reported in this manuscript that the localization of Bgs1 at the rim of the division site and its distribution along the division plane in ync13Δ cells are not very different from those in WT, although its intensity is higher and more variable in ync13Δ cells (Figure above). As suggested by the reviewer, we used microscopy to test Bgs1 localization in the ync13-19 temperature-sensitive mutant, rng10Δ, ync13-19 rng10Δ, and WT (Fig. S7). While the line scan curves for Bgs1 localization at the division site appear steep for the ync13-19 rng10Δ double mutant, its FWHM is not statistically significantly different from that of the WT control (Fig. S7). Please note that we used different confocal systems, cameras, and laser powers for Fig. 4, C and E (PerkinElmer UltraVIEW Vox CSUX1) and Fig. S7 (Nikon W1+SoRa), so the FWHMs are not comparable between the two figures.

      To test whether there is a second wave of Bgs1 localization at the division site, we tracked the fluorescence intensity of Bgs1 throughout 2-h-long movies and plotted the Bgs1 intensity profile at the division site over time. The data clearly show only one peak of Bgs1 and no later accumulation at the division site, although Bgs1 intensity is more variable in ync13-19 and ync13-19 rng10Δ cells and is higher in ync13-19 rng10Δ cells. Together, these experiments indicate that the Ync13-Rga7-Rng10 module affects the localization of the glucan synthases essential for the secondary septum (Bgs4 and Ags1) but not that of the synthase for the primary septum (Bgs1).

      Assessments of protein abundance by Western blotting (Figure 3C and 3D) can benefit from some quantifications.

      Response: We thank the reviewer for this suggestion. We have now quantified the Western blot bands in Figures 3C and 3D, which have been added as supplementary figures along with the Western blot for Rng10 (Fig. S6, A-C) in the revised figures.

      Minor comments: Based on a series of experiments in which mistargeting Rga7 and Rng10 truncations drive Ync13-tdTomato to mitochondria, the authors suggest that Rga7, Rng10, and Ync13 have multivalent interactions with each other. A previous study (https://pmc.ncbi.nlm.nih.gov/articles/PMC6425953/) demonstrated that in cells co-expressing Tom20-GBP and mECitrine-Rng10(751-950), Rga7 was efficiently mistargeted to mitochondria. This raises the possibility that Ync13 mistargeted by mECitrine-Rng10(751-1038) could be recruited by Rga7 that is strongly associated with Rng10(751-1038) on mitochondria. I wonder whether the authors could compare some of their truncation mistargeting experiments in the original manuscript with ones in which either Rga7 or Rng10 is deleted, e.g. Tom20-GBP mECitrine-Rng10(751-1038) experiments in rga7del cells, if cells are still viable in this genetic background.

      Response: We thank the reviewer for this insightful suggestion. We tested the mistargeting of mECitrine-Rng10(751–1038) in rga7Δ tom20-GBP cells and found that Ync13-tdTomato could not be recruited to mitochondria. This indicates that Ync13 cannot interact with the Rng10 C-terminus independently of Rga7, supporting the AlphaFold3 modeling and our proposed model that Rga7 interacts with Rng10 through the BAR domain and with Ync13 through the GAP domain. We have added the new data to the revised manuscript (Fig. S4H and associated text) and included a brief discussion highlighting that Rga7 is required for the Rng10–Ync13 interaction. We removed the mention of multivalent interactions from the manuscript to minimize confusion.

      It is interesting that rga7del rng10del double mutants can survive better in EMM or YES with sorbitol. I wonder whether this would allow the authors to test whether the interaction between Ync13 and Sec1 is modulated by the presence of Rga7 and Rng10 or even the entire vesicle. Is mistargeted Ync13 overexpressed using the 3nmt1 promoter still capable of driving Sec1 to mitochondria in rga7del rng10del cells?

      Response: We thank the reviewer for this suggestion. While we did not succeed in constructing the pentamutant deleting both rga7 and rng10 and mislocalizing Ync13 to mitochondria, we were able to make a quadruple mutant deleting rng10 and mislocalizing Ync13 to mitochondria. We tested whether mistargeted Ync13 overexpressed using the 3nmt1 promoter can recruit Sec1 to mitochondria in rng10Δ cells. Our results show that overexpressed Ync13 is still able to drive Sec1 localization to mitochondria without Rng10 (Fig. S2G). This suggests that Rng10 (together with Rga7) primarily functions to recruit and position Ync13 at the division site rather than being strictly required for the interaction between Ync13 and Sec1. This is also consistent with our Pmo25-GBP mislocalization experiments where we found that rga7Δ 3nmt1-mECitrine-ync13 cells even under the repressed condition for the 3nmt1 promoter can partially rescue the lysis phenotype of rga7Δ cells (Figure 6).

      The endogenous level of Ync13 is not particularly high. Is this low level of Ync13 crucial for its function? Does a mildly elevated level of Ync13 promote vesicle fusion at the closing septum?

      Response: We thank the reviewer for this insightful question. To test whether there is a correlation between Ync13 levels and vesicle fusion at the division site, we mildly overexpressed Ync13 from the 3nmt1 promoter in YE5S rich medium without added thiamine to obtain cells with different Ync13 levels (the rich medium has some residual thiamine, which partially represses the nmt1 promoter) and then tracked vesicles labeled with the Rab11 GTPase Ypt3. This resulted in increased levels of Ync13 as well as Ypt3 at the division site (Fig. S8B). We measured the Ync13 intensity at the division site and counted the number of Ypt3 vesicles reaching the division site in 2-minute continuous movies at the middle focal plane. We observed that increasing the Ync13 level promoted the tethering and accumulation of Ypt3 vesicles at the division site until it reached a plateau (Fig. S8B). Thus, the Ync13 level is important for vesicle fusion at the division site. Collectively, Ync13, working with Rga7 and Rng10, plays an important role in vesicle targeting and fusion on the plasma membrane at the division site during cytokinesis. This is consistent with our results that overexpressed Ync13 can mislocalize Sec1 to mitochondria in rng10Δ cells (Fig. S2G) and can rescue the rga7Δ phenotypes (Fig. 6).

      Reviewer #3 (Significance (Required)):

      Most of the conclusions are well supported by a combination of methods. Out of curiosity, I wonder how much of the Bgs4 or Smi1 detected in Co-IP experiments exists in the vesicle-bound form. The authors propose a very interesting working model that addresses several key challenges in achieving vesicle targeting specificity when timely delivery of various enzymes to their respective spatial locations along the primary and secondary septum must be orchestrated. I think this manuscript will be of interest to a broad audience.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Zhang et al. elucidate key roles of a conserved module, the Ync13-Rga7-Rng10 complex, in coordinating selective tethering, docking, and fusion of glucan synthase-containing vesicles with the plasma membrane, a process crucial for cell wall synthesis and survival of fission yeast at division. Using methods including mistargeting proteins to mitochondria, co-immunoprecipitation, in vitro binding assays, genetic and cellular methods, electron microscopy, and live-cell confocal microscopy, the authors demonstrate that this module controls a vesicle targeting pathway mediated by the TRAPP-II complex and SM protein Sec1, which ensures glucan synthases Bgs4 and Ags1 are deposited at the division site in a spatiotemporal manner.

      Major comments:

      The authors report aberrant accumulation of Bgs4 and Ags1 in the center of the septum after actomyosin ring constriction in ync13del cells and detect no overall defects in Bgs1 distribution there (Figure 4). When similar experiments were analyzed in this paper (https://pmc.ncbi.nlm.nih.gov/articles/PMC6249806/), Bgs1 distribution and level did change in cells lacking Ync13, although these phenotypes of Bgs1 appeared later than those of Bgs4. I wonder whether there could exist a second wave of Bgs1 arrival in ync13del cells at later time points after the ring fully constricts. Could this late recruitment of Bgs1 depend on Rga7 and Rng10, since these protein complexes are enriched in the middle of the septum of ync13del cells? Or, as the authors mentioned in the Discussion, could the Rho GTPase regulated by the Rga7 GAP also play a role in Bgs1 accumulation or fusion with the septum in the above scenario, if no obvious accumulation of vesicles is observed in ync13del cells with electron microscopy? How does Bgs1 localize in the ync13-19 rng10del double mutant?

      Assessments of protein abundance by Western blotting (Figure 3C and 3D) can benefit from some quantifications.

      Minor comments:

      Based on a series of experiments in which mistargeting Rga7 and Rng10 truncations drive Ync13-tdTomato to mitochondria, the authors suggest that Rga7, Rng10, and Ync13 have multivalent interactions with each other. A previous study (https://pmc.ncbi.nlm.nih.gov/articles/PMC6425953/) demonstrated that in cells co-expressing Tom20-GBP and mECitrine-Rng10(751-950), Rga7 was efficiently mistargeted to mitochondria. This raises the possibility that Ync13 mistargeted by mECitrine-Rng10(751-1038) could be recruited by Rga7 that is strongly associated with Rng10(751-1038) on mitochondria. I wonder whether the authors could compare some of their truncation mistargeting experiments in the original manuscript with ones in which either Rga7 or Rng10 is deleted, e.g. Tom20-GBP mECitrine-Rng10(751-1038) experiments in rga7del cells, if cells are still viable in this genetic background.

      It is interesting that rga7del rng10del double mutants can survive better in EMM or YES with sorbitol. I wonder whether this would allow the authors to test whether the interaction between Ync13 and Sec1 is modulated by the presence of Rga7 and Rng10 or even the entire vesicle. Is mistargeted Ync13 overexpressed using the 3nmt1 promoter still capable of driving Sec1 to mitochondria in rga7del rng10del cells?

      The endogenous level of Ync13 is not particularly high. Is this low level of Ync13 crucial for its function? Does a mildly elevated level of Ync13 promote vesicle fusion at the closing septum?

      Significance

      Most of the conclusions are well supported by a combination of methods. Out of curiosity, I wonder how much of the Bgs4 or Smi1 detected in Co-IP experiments exists in the vesicle-bound form. The authors propose a very interesting working model that addresses several key challenges in achieving vesicle targeting specificity when timely delivery of various enzymes to their respective spatial locations along the primary and secondary septum must be orchestrated. I think this manuscript will be of interest to a broad audience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This paper provides a comprehensive analysis of the roles of Rng10, Rga7, and Ync13 in cytokinesis using fission yeast as a model system. The authors demonstrate that Ync13/Rga7/Rng10 not only interact with each other but also associate with components of glucan synthases, which are essential for secondary septum formation but not for the primary septum. They further show that Ync13 is involved in exocytosis through its interaction with Sec1 and plays a role in membrane trafficking via interaction with the TRAPP-II complex. Collectively, their findings reveal a coordinated mechanism that ensures the timely formation of the secondary septum during cytokinesis, as deletion of these proteins disrupts septum formation and leads to cell lysis.

      The conclusions drawn in this paper are well-supported by the data, with a clear methodology and robust statistical analyses that enhance reproducibility. However, I have the following major and minor comments:

      Major Comments

      1. The authors propose that Ync13, Rng10, and Rga7 interact to form a protein complex, supported by their mislocalization studies. While these findings are suggestive, additional co-immunoprecipitation (co-IP) data specifically demonstrating a direct interaction between Ync13 and Rng10 would strengthen the claim.
      2. It remains unclear whether Ync13 directly interacts with components of the glucan synthase complex (Bgs4/Ags1), or if this association is mediated through other factors (Rng10, Rga7). Clarifying the nature of this interaction would significantly enhance the mechanistic insight.

      Minor comments:

      1. The manuscript refers to mass spectrometry-based interaction data, but the corresponding dataset is not included. Providing this would enhance transparency and reproducibility.
      2. In Figure 2D, the MBP-6x pull-down lane shows a faint band around 76 kDa. The authors should clarify what this band represents and whether it has any relevance to the study.
      3. A quantification graph corresponding to the data in Figure 3G would aid in better interpreting the results and assessing their significance.
      4. Figure 4D appears to be missing time legends, which are essential for interpreting the dynamics of the experiment.

      Significance

      Nature and Significance of the Advance

      This study provides a conceptual and mechanistic advance in understanding the spatial and temporal regulation of membrane trafficking during cytokinesis. It identifies a conserved module (Ync13-Rga7-Rng10) that directs the selective tethering and fusion of secretory vesicles at the division site, functioning independently of the exocyst complex. This finding challenges the prevailing model that the exocyst is universally required for vesicle tethering during cytokinesis. While previous work has underscored the roles of TRAPP-II and vesicle trafficking in septum formation (Wang et al., 2016; Arellano et al., 1997; Gerien and Wu, 2018), the precise mechanism targeting vesicles to the division site remained unclear. This study fills that gap by elucidating how Ync13 and Rga7 coordinate vesicle delivery and glucan synthase localization (Liu et al., 2016; Zhu et al., 2018), thereby extending our understanding of septum biogenesis and membrane remodeling beyond actomyosin ring dynamics.

      Relevant Audience:

      This work is relevant to:

      • Cell biologists investigating cytokinesis, membrane trafficking, or vesicle fusion.
      • Yeast geneticists interested in conserved cell division pathways.
      • Researchers focused on SNARE-mediated membrane dynamics and trafficking regulation.
      • Biomedical scientists exploring analogous processes in mammalian systems, particularly those studying cell division defects linked to disease.

      The findings have implications across both basic and translational research in cell biology and membrane dynamics.

      My Expertise:

      My research focuses on membrane fusion, specifically the SNARE-mediated fusion process. I study the spatio-temporal regulation of fusion events and the coordinated action of regulatory proteins in determining the structural and functional outcomes of membrane fusion. This background provides me with the framework to critically evaluate studies investigating cytokinesis and trafficking mechanisms at the molecular level.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this paper, the GFP-GBP system for mistargeting protein localization was used in fission yeast cells to discover new protein interactions involved in vesicular trafficking during cytokinesis. This approach uncovered a new association between the F-BAR protein Rga7 and its binding partner Rng10 with the Munc13 protein Ync13 at the cell division site. Additional associations were observed between Rga7-Rng10, Ync13 and the glucan synthases Ags1 and Bgs4, and the vesicle fusion protein Sec1. These interactions identified by the GFP-GBP system were further supported by co-immunoprecipitation experiments and by defining localization dependencies with live cell imaging in a variety of mutant strains. The imaging data are all of high quality and for the most part support the conclusions. However, in my opinion some of the interpretations are overstated, and the manuscript would benefit from providing additional mechanistic information. Major and minor recommendations are outlined below.

      Major suggestions

      1. The co-IP data are interpreted to suggest that all the above-mentioned proteins form a single "big complex." However, as noted in the manuscript and reflected in the model, the multipass integral membrane proteins Bgs4 and Ags1 are embedded in the vesicle membrane and likely only indirectly associate with the scaffold Rga7-Rng10 via Ync13, without forming a 'complex'. One would expect the entirety of these vesicle contents to co-IP if the model is correct. The first paragraph of page 11 should be revised to more clearly reflect this scenario and to align with the proposed model.
      2. Can Ync13 be artificially directed or tethered to the division site independently of Rga7-Rng10 (e.g., via Imp2)? If so, can this rescue the phenotypes of rga7Δ cells? This experiment could clarify whether Ync13 is the key functional effector of the Rga7-Rng10 complex.
      3. The authors should consider structural or computational modeling of the proposed Rga7-Rng10-Ync13 complex. Such analysis could offer insight into how these components interact and strengthen the proposed model.

      Minor text edits

      1. Define "SIN" in the discussion section for clarity.
      2. In Figure S3, the protein schematics should start at residue "1" and not "0".
      3. Mass spectrometry data referenced in the text are not provided in the manuscript.
      4. In Figure 4A, the Ags1 rim localization does not appear decreased as the authors claim.
      5. On page 13: "both Rga7 and Rng10 can mistarget Trs120 to mitochondria."

      Minor figure edits

      1. Consider inverting single-channel images to display fluorescence on a white background, which would improve visual clarity.
      2. The Figure 1 legend should describe the experimental setup rather than restating conclusions.
      3. Reduce the number of arrows indicating co-localization in microscopy images; highlighting 1-2 representative examples is sufficient and less visually cluttered.
      4. In Figure 3F, the scale bar is listed as 5 μm in the legend, but it appears to my eye to be 2 μm.
      5. The orientation of Bgs4/Smi1 should be inverted in the schematic within vesicles so that Smi1 is always on the cytoplasmic side.
      6. Also in the schematic, Mid1 is not at the constricting CR and therefore needs to be removed.

      Significance

      From the data presented in the manuscript, it is proposed that Rga7 and Rng10 form a scaffold at the division site for delivery of exocytic vesicles marked by the TRAPPII complex but not the exocyst complex. Further, it is proposed that these vesicles deliver specifically the glucan synthases necessary for septation. Overall, this study builds on previous work from the Wu lab to clarify how the TRAPPII-decorated vesicles are specifically delivered to the cell division site, adding some new information about vesicle trafficking regulation during cytokinesis. It also provides new insight into the role of an F-BAR scaffold protein.

      This paper will be of interest to those studying cytokinesis and also those studying mechanisms of intracellular trafficking.

      Reviewer expertise: Cell division, signaling, membrane biology

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03094

      Corresponding author(s): Saurabh S. Kulkarni

      1. General Statements

      We thank the reviewers for their strong praise of the manuscript, highlighting its rigor, depth, and conceptual importance. They consistently described the study as a beautiful, fascinating, and conceptually strong piece of work that addresses a timely question in multiciliated cells. They also noted the high quality of the data, careful quantification, and the use of multiple genetic and pharmacological approaches, all of which improve the reproducibility and credibility of the findings. Importantly, they emphasized the novelty of discovering a direct mechanistic link between Piezo1-mediated mechanotransduction and Foxj1-driven transcriptional control of multiciliation, representing a significant breakthrough for both the cilia field and mechanobiology more broadly. Collectively, these strengths highlight the manuscript’s wide impact and make it highly suitable for publication in a high-impact journal.

      2. Description of the planned revisions

      Reviewer #1:


      There are two experiments that would significantly strengthen these claims.

      • First, if their model is correct, then even short-term treatment with Yoda1 should induce the pathway and affect centriole numbers. While I appreciate the challenge of long-term Yoda1 treatment, it's not clear to me why it would be needed if short-term treatment is setting off the transcriptional cascade. Yoda1 is used throughout the paper to induce all the pathways, but we don't know if it actually induces the phenotype. I think this should be addressed with either short-term treatments or a dose response to find a dose that does not lead to skin peeling. It is hard to ignore this obvious deficiency.
      • Second, the model predicts that all of this is to regulate Foxj1 levels to regulate the subtle balance between cell size and centriole number. If this is correct, then the overexpression of Foxj1 should have a profound effect on centriole number in multiciliated cells. This is such an easy experiment that would validate many of the claims.

      RESPONSE:

      We recognize that the reviewer is asking us to test the sufficiency of the pathway with these comments: “If their model is correct, then they should be able to activate the pathway in one way or another to stimulate centriole number. This is a significant limitation to their overall model.” And “If this is correct, then the overexpression of Foxj1 should have a profound effect on centriole number in multiciliated cells.”

      To address reviewers’ suggestions, we will perform the following experiments.

      1. A brief exposure to Yoda1 (15 or 30 min), followed by a 3-hour wait, to examine changes in centriole amplification. This will avoid the skin peeling caused by long-term exposure.
      2. A brief exposure to Yoda1 (15 min) followed by a 30-minute wait period, with the cycle repeated 4 times over a total of 3 hours, to examine centriole amplification.
      3. The above two experiments will also be done in a constitutively active-Yap background to increase the probability that synergistic activation can lead to centriole amplification.
      4. Although Foxj1 is essential for multiciliogenesis, it is not sufficient to induce multiciliogenesis, as shown by multiple previous studies. Therefore, we do not expect overexpression of Foxj1 to have a profound effect on centriole number. While we will conduct the experiments because we truly want to address the suggestions and gain insight into the answers ourselves, we respectfully ask the Reviewer to consider the following responses to their concern.

      Yoda1 sufficiency: We agree that testing whether acute Yoda1 treatment can induce centriole amplification is an important question. We will conduct experiments with short-pulse and cyclic Yoda1 exposure, including in a constitutively active-YAP background (listed above), to address this possibility. However, several challenges complicate interpretation: (i) PIEZO1 adapts and desensitizes upon activation, (ii) transient signaling may be sufficient to cause secondary signaling but insufficient to drive the stable transcriptional programs required for amplification, and (iii) centriole number is inherently variable, making modest effects difficult to resolve. We must also recognize that failure to observe sufficiency under these conditions would not invalidate the model, for two reasons: 1) absence of evidence is not evidence of absence, and thus we may not have found the right experimental design; 2) PIEZO1–YAP is a necessary input but not sufficient on its own, as elaborated below. For both reasons, we are very careful about the interpretation of results in the manuscript, which shows, using loss-of-function approaches, that this pathway is necessary for centriole amplification.

      Foxj1 overexpression: Foxj1 is a well-established regulator essential for motile ciliogenesis and multiciliogenesis across species (Xenopus, zebrafish, mouse). Loss of Foxj1 reduces cilia number in MCCs, but its activation alone does not have a profound effect on ciliogenesis/cilia number in MCCs. This is because Foxj1 is a part of a larger network essential for multiciliogenesis. This parallels the behavior of other transcriptional regulators, such as Myb, where loss of function impairs centriole amplification, but overexpression does not drive the formation of supernumerary centrioles. Both studies are seminal discoveries in the field of ciliogenesis, but they did not demonstrate the sufficiency of these molecules/pathways. Thus, our results, demonstrating that Foxj1 is necessary to induce tension-dependent centriole amplification, are significant, as the reviewer mentioned. The lack of Foxj1 sufficiency to induce centriole amplification is not a deficiency of the study, but rather evidence that Foxj1 is a part of a larger network essential for tension-dependent centriole amplification.

      Necessity versus sufficiency: We respectfully emphasize that sufficiency is not a prerequisite for demonstrating the significance of a pathway. Mechanochemical signaling is inherently complex, involving many mechanosensitive proteins and pathways. In our case, mechanical stretch increases centriole amplification, with PIEZO1–YAP signaling identified as a key mediator. However, we do not claim that PIEZO1–YAP alone is sufficient. Other pathways, including cadherin-mediated junctions, F-actin–myosin contractility, integrin–focal adhesion signaling, and nuclear mechanotransduction, likely contribute and may regulate unique downstream effectors that collectively promote centriole amplification. Therefore, PIEZO1–YAP should be regarded as one essential component within a larger network.


      TIMELINE: We will perform these additional proposed experiments. Since the first author, a postdoctoral researcher on this manuscript, has started a new job and will be coming in on weekends to complete the experiments, we estimate it will take approximately 2-3 months to finish them.


      Reviewer #2:

      1. Considering the Yap-Piezo mechanism of action, the authors' logic for the selection of myb, foxj1, plk4 and ccno as transcriptional targets is clear, but the HCR-derived signal and the differences seen in the yap morphants are not very strong, notwithstanding the statistical significance. There appear to be distinct subgroups within the treated populations (in Figure S6B, although these data seem quite different in Fig. 7H, so a comment on the technical differences might be helpful), so that the extent to which Yap1 regulates (Myb-)Foxj1 expression in MCCs is not clearly demonstrated by this experiment. Related to this point, it is unclear why 20-25% of the yap1/piezo1 MO-treated embryos do not show a decline in Foxj1 in Fig. 6, given the qualitative nature of the scoring. Assuming the KD penetrance would vary on a cell-to-cell basis, rather than an embryo-to-embryo basis, this may suggest that there are additional relevant targets (some of which are discussed by the authors). Single-cell analysis might be a way to address this; however, as this is not a trivial experiment, it might be sufficient to include a caveat in the text. Furthermore, the conclusion that Foxj1 regulates centriole amplification in a tension-dependent manner is well-supported by the data.

      RESPONSE: We appreciate the reviewer’s thoughtful observation. Differences in the expression of Foxj1 from experiment to experiment are possible due to a combination of factors, including heterogeneity in MCC development across embryos, slightly different embryonic stages, differences in embryo quality between fertilizations, and variability in morpholino delivery and knockdown penetrance, which can occur both across embryos and on a cell-to-cell basis within an embryo. We also note that technical aspects of HCR RNA-FISH, such as proteinase K treatment and washing steps, can affect signal intensity, potentially contributing to the appearance of distinct subgroups within treated populations.

      We agree that single-cell analysis would be a powerful way to dissect these differences, but as the reviewer notes, this is not a trivial experiment and is beyond the scope of the present study. We have therefore added clarifications in the text and discussion to acknowledge these sources of variability and to highlight the possibility of parallel pathways regulating foxj1 expression.

      ********************************************

      Controls for the knockdowns by the various MOs should be provided.

      RESPONSE: We appreciate the reviewer’s comment. The piezo1 MO has been previously established in Kulkarni et al. (2021). Additionally, the current manuscript includes MO control experiments for both erk2 and yap1, through KD at the 1-cell stage using the MO oligonucleotide, followed by mosaic-rescue with the respective WT RNA constructs (mCherry-ERK2 and yap1-GFP) and a nuclear tracer molecule such as H2B-RFP (Fig. 5, E-H, Fig. S5, C&D, Fig. 3, D-F). The mosaic-rescue is a robust experiment that provides an internal control within the same embryo, thereby avoiding differences that may arise due to embryo-to-embryo variability, embryo quality, or differences in fertilization batches. This approach also serves as a valuable tool for detecting cell-autonomous effects, providing a clear readout against uninjected neighboring cells, as the injected cells are labeled with a tracer. We will perform a similar mosaic-rescue experiment for the foxj1 MO.

      TIMELINE: We will conduct mosaic-rescue experiments for the foxj1 MO. We will need 1 month to complete the experiment.

      ********************************************

      Minor comments:

      Autocorrection of ERK1/2 or MEK1/2 pathways to 1/2 should be avoided. – We are unclear on this comment. Can the reviewer please clarify what they mean?


      Reviewer #3:

      Major concerns

      1- The presented data do not yet establish a specific, direct pathway linking mechanotransduction to centriole number, because the molecular players tested (PIEZO1, Ca²⁺, PKC, ERK, YAP, Foxj1) are highly pleiotropic. As such, the observed centriole number phenotypes, and some of the major conclusions, could be indirect. It is therefore critical to test the specificity and causality of the proposed pathway. This could be done with the authors' own strategies and/or with the following potential approaches:

      • Genetic dependency and sufficiency tests: It could be shown that Yoda1 has no effect in PIEZO1 loss-of-function MCCs, and that wild-type PIEZO1, but not conductance-dead PIEZO1 pore mutants, restores Yoda1 responsiveness across centriole number, pERK, and YAP readouts. For example, the PIEZO1 C terminus was shown to govern Ca²⁺ influx and ERK1/2 activation. Comparing full-length PIEZO1 with a C-terminal deletion in an MCC-restricted rescue, where the C-terminal deletion fails to rescue centriole amplification and ERK/YAP activation, can provide a genetics-anchored specificity test beyond broad inhibitors.

      RESPONSE:

      • To address the reviewer’s concern, we will test whether Yoda1 affects ERK and Yap activation when Piezo1 is depleted. We appreciate the reviewer’s thoughtful suggestion to employ genetic rescue experiments with Piezo1 mutants. Unfortunately, these are not technically feasible in Xenopus, as the Piezo1 coding sequence is exceptionally large (~7.5 kb), and repeated attempts by our group to generate and express stable, translatable transcripts have been unsuccessful. To address genetic dependency and specificity despite these technical barriers, we have employed a combination of orthogonal strategies that together provide strong genetic and mechanistic evidence:

      • Mosaic loss-of-function experiments (Fig. 1) demonstrate that Piezo1 regulates centriole number in a cell-autonomous manner, ruling out global epithelial or indirect tissue-wide effects.

      • Pharmacological activation/inhibition with a Piezo1-specific agonist (Yoda1) and inhibitors (GsMTx4, gadolinium) produced consistent phenotypes, including activation of downstream ERK and YAP readouts. Notably, Yoda1 is a Piezo-specific agonist, not a broad pharmacological agent.
      • Downstream pathway dissection (calcium chelation, PKC inhibition, ERK2 depletion, and YAP1 knockdown/rescue) consistently converges on the same phenotypes, reduced centriole amplification and altered Foxj1 expression, providing multiple independent lines of evidence that the Piezo1–Ca²⁺–PKC–ERK–YAP axis specifically controls centriole number.
      • Positive feedback regulation of Piezo1 expression by YAP/Foxj1 (Fig. 7) further strengthens the argument for a pathway-specific role rather than pleiotropic, indirect effects.

      Taken together, while full-length Piezo1 rescue experiments are technically not possible in Xenopus due to gene size constraints, our data employ state-of-the-art genetic, pharmacological, and orthogonal functional assays to rigorously test pathway specificity. These complementary approaches provide compelling evidence for the causal role of Piezo1-mediated mechanotransduction in centriole number control in MCCs.

      • Downstream bypass/rescue experiments: In PIEZO1 loss-of-function or BAPTA conditions, can enforcing MEK/ERK activation or YAP rescue the centriole number defect? Conversely, can MEK inhibitors block Yoda1-induced effects?

      RESPONSE: We appreciate the reviewer’s insightful questions.

      • We will express CA Yap in the Piezo1 KD background to assess if we can rescue centriole number. We also note that the converse experiment has already been performed in our study: 1) PKC inhibition abolishes Yoda1-induced ERK phosphorylation and nuclear localization (Fig. 2), 2) both MEK inhibition and ERK2 depletion block Yoda1-induced Yap activation and nuclear entry (Figs. 4, S2). Thus, we have directly demonstrated that MEK inhibition prevents Yoda1-induced effects, satisfying this aspect of the reviewer’s concern.

      ********************************************

      2- Image quantification and analysis must be described in greater detail in the Methods section, as they are central to the major conclusions of the manuscript. For example, the authors should explain how nuclear, cytoplasmic, and centriole segmentation were performed, and how relative protein levels in the nucleus versus the cytoplasm (e.g., YAP, volume- or area-based) were quantified. Specifically, the thresholds and segmentation criteria applied to different cellular structures under various conditions, as well as the use of Imaris and other software, should be clearly detailed.

      RESPONSE: We will describe the methods in greater detail.

      ********************************************

      3- PIEZO1 mRNA was shown to increase in a Foxj1-linked feedback loop. Does this increase translate into an increase in total protein levels?

      RESPONSE: If the reviewer is referring to Figure 7B, that panel shows staining with the Piezo1 antibody, so yes, the Piezo1 protein levels have increased.

      If the reviewer is referring to Figure 7C and D, we show that loss of Foxj1 leads to a reduction in Piezo1 mRNA expression.

      ********************************************

      4- Is the proposed signaling cascade active in mammalian multiciliated cells (e.g., airway epithelium)? If possible, testing this by using one of the major players of the pathway as a readout, such as ERK phosphorylation or YAP nuclear localization in mammalian MCCs, will reveal whether regulation of centriole number through this pathway is conserved and would strengthen the generality.


      RESPONSE: We agree with the reviewer that testing conservation of this pathway in mammalian MCCs is of great interest. Indeed, another group is currently investigating the role of Yap in the mammalian airway epithelium; in their temporally controlled Yap knockout model (the global Yap KO being embryonic lethal), they observed that Yap loss led to a reduction in centriole number. To avoid overlap and direct competition with this ongoing work, we chose to focus our efforts on Xenopus.

      Importantly, Xenopus has become a widely recognized and powerful system for MCC biology, enabling mechanistic dissection of centriole amplification and ciliogenesis. Several key discoveries in the field, including the identification of MCIDAS as a master regulator of MCC fate, were first made in Xenopus before being validated in mammals. Similarly, our study provides a mechanistic framework in Xenopus that can inform and guide ongoing studies in the mammalian airway.

      ********************************************

      5- Throughout the results section, there are multiple times where the authors raised specific hypotheses about their data (e.g. foxj1 regulation of number control, apical actin/YAP). However, they have not tested them. These hypotheses are very exciting and, if possible, testing them experimentally would strengthen the conclusions associated with them.

      RESPONSE: We are not sure what the reviewer means here by “authors raised specific hypotheses about their data (e.g., foxj1 regulation of number control, apical actin/YAP). However, they have not tested them”, because:

      • Foxj1 regulation of centriole number: We demonstrate a clear reduction in centriole number upon Foxj1 depletion, and importantly, we extend this finding by showing that the reduction is tension-dependent (Fig. 6). We will perform a rescue assay to demonstrate the specificity.
      • Foxj1 and YAP: We never claimed that Foxj1 regulates YAP expression, and this is not part of our proposed model. Instead, our data show that Piezo1–ERK–YAP signaling regulates Foxj1.
      • Foxj1 and apical actin: Foxj1 regulation of apical F-actin has already been established in prior work, and in our study, we clearly observe reduced apical actin intensity in Foxj1-depleted MCCs (Fig. 6). To further strengthen this conclusion, we will provide a quantitative analysis of apical actin intensity in Foxj1 morphants.

      ********************************************

      TIMELINE: We will perform these additional proposed experiments. Since the first author, a postdoc on this manuscript, has started a new job and will be coming in on weekends to finish the experiments, we estimate it will take approximately 2-3 months to complete them.

      Minor comments

      MCC vs non-MCC identification (Fig. 1): Clarify how non-MCCs were distinguished from MCCs (e.g. markers/criteria). – Can the reviewer please clarify which panel or panels? Or provide more specific text that needs to be changed.

      Add the Kintner group reference linking motile cilia number and centriole number in Xenopus MCCs. – Can the reviewer clarify where and which reference? Thank you.

      3. Description of the revisions that have already been incorporated in the transferred manuscript


      Reviewer 2

      Major comments:

      1. It should be clarified whether the immunoblots and the related quantitations in Figs. 2 and S2 are all from separate blots/exposures. If so, they are not useful as controls, and these blots should be repeated with the relevant samples analyzed in parallel. Size markers and labels should be included (2B, 2G; S2B and S2G). An increase in total ERK would alter the interpretation of the increase in nuclear pERK in the IF experiments.

      RESPONSE: We thank the reviewer for raising this important point regarding clarification of the immunoblots. All experimental groups were analyzed in parallel with their corresponding controls. Because the primary antibodies for pERK and ERK were both raised in rabbit, we optimized our workflow to prevent protein loss during stripping and to ensure accurate visualization. Specifically, lysates from each experimental group were loaded in duplicate on the same gel, separated by a molecular weight ladder that served as a reference point. After transfer, the blot was cut along the ladder, and the two halves were processed in parallel: one probed with anti-pERK and the other with anti-ERK. This strategy ensured that all samples from a single experiment (e.g., Control and Yoda1-treated groups) were analyzed under identical conditions, with staining and imaging performed together at the same exposure. To enhance clarity, we have provided the uncut, full-length blots as Supplemental Figure 7 (Figure S7) in the revised manuscript.

      ********************************************

      Minor comments:

      1. Reference list should be checked for completeness; some citations lack journal/ volume/ page/ year details. – We have corrected the references.
      2. An 'overexposed' version of the image selected for centrioles in Figure 5F might be included with the Chibby-BFP at the same level as in the other figures. At present, the Yap KD cell in the image appears to have normal centrioles; this is potentially confusing, even though the authors clearly explain the matter in the text. – We have added a new panel to Fig. 5F to avoid confusion.
      3. It might be clearer to present injected/uninjected in the same orientation in Fig. 6A and B. – Unfortunately, that is not possible because the injected and uninjected sides are left and right, and they cannot be in the same orientation.
      4. Figure 7B lacks the schematic described in the figure legend. – We have removed the Schematic sentence from the figure legend. That was an error on our side. Thank you for catching it.


      Reviewer 3


      1. Abstract: "how MCCs regulate centriole/cilia numbers remains a major knowledge gap" overstates the field; please soften to reflect recent advances (mechanics/apical area scaling; PIEZO1 implication). – We changed the text to “incompletely understood”.
      2. GsMTx4 rationale: State that GsMTx4 is a spider venom peptide that inhibits cationic mechanosensitive channels (including PIEZO1) and justify its use alongside Yoda1. – GsMTx4 was used in the previous manuscript, and its use was justified there. Here, we are only comparing the results. However, we have added a sentence describing what GsMTx4 is. We have also included a sentence explaining the use of Yoda1. “GsMTx4, a spider venom peptide used in our previous study, inhibits cationic mechanosensitive channels, including Piezo1.”

      “For this experiment, we used the Piezo1 channel-specific chemical agonist, Yoda1, to increase the sensitivity of Piezo1 and upregulate calcium entry into cells”

      Timeline statement: "Centriole amplification to migration and apical docking takes ~4-5 h (personal observation)" is not appropriate; either cite time lapse literature or include your own time lapse data. – We have added a reference that reports 2 hours of imaging. Because that duration is not enough to capture the entire process from intercalation to maturation, we have also kept “personal observation” in the manuscript. We are unaware of any study that has used time-lapse imaging for 4 hours to capture the entire process of centriole amplification.

      Redundancy: The description of Yoda1 as a channel specific agonist is repeated; keep only once.- Removed

      "WT yap1 GFP construct previously used by Dr. Lance Davidson ..." should move construct description to Methods and keep only the citation in Results.– We moved it to Methods.

      "(Unpublished data; Dr. Mahjoub)" should be removed unless data are shown.- Removed

      Replace "as shown previously in our eLife paper" with "as we previously showed or shown previously (Kulkarni et al., 2021)".– We have made the change.

      The two hypotheses for how Foxj1 could regulate number under tension (actin remodeling vs. transcriptional control of amplification genes) belong in the Discussion unless tested. Moreover, the part on Yap sequestration by apical actin and the two possibilities presented should also go to the Discussion. – We have moved both to the Discussion section.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      Reviewer 3

      1- The hypothesis about the centriole pool of Piezo as the mechanosensor for centriole number regulation is very exciting and novel. Can localization-controlled variants be used to test whether a centriole-associated pool directly senses tension for number control (for example, centrosome-targeted PIEZO1 via a PACT tag)? Alternatively, broad cellular Ca sensors (GCaMP) or centrosome-proximal Ca sensors (e.g., PACT-GCaMP) can be used to detect local calcium microdomains during tethering or Yoda1 treatment.

      RESPONSE: We appreciate the reviewer's curiosity and excitement; however, these experiments will not alter the conclusion of this paper and will be part of the next study, which aims to delve deeper into how different pools of Piezo1 at centrioles versus cell junctions function in MCCs. To that point, we had thought about these experiments. As mentioned earlier, the Piezo1 coding sequence is exceptionally large (~7.5 kb), and repeated attempts by our group to generate and express stable, translatable transcripts have been unsuccessful. Thus, although the idea of centrosome-targeted PIEZO1 via a PACT tag is very exciting, it is not technically feasible. Beyond size, PIEZO1 is a large, trimeric plasma-membrane mechanosensitive channel that requires proper ER processing and bilayer incorporation. PACT localizes cargo to the centriole/pericentriolar material, not a membrane compartment; thus, a PACT-anchored PIEZO1 would be membrane-mismatched and almost certainly nonfunctional even if expressed.

      Second, centrosome-proximal GCaMP (PACT-GCaMP) would show correlation, not causation. This experiment does not address whether the “centriole pool of Piezo” acts “as the mechanosensor for centriole number regulation”. It would only show whether Ca2+ influx occurs at the basal bodies, not whether or how that Ca2+ is essential for centriole amplification. For this purpose, we would need to find a way to block Ca2+ influx specifically at basal bodies, rather than at junctions, which would require extensive controls.

      We do not claim that any specific Piezo1 or Ca2+ pool is critical for controlling centriole number and thus the suggested experiment would not alter the manuscript's conclusions. We therefore view the above as exciting future directions rather than prerequisites.

      ********************************************

      2- Because the proposed pathway is tension-sensing and the YAP pathway is tightly linked to the actin cytoskeleton, the role of the actin cytoskeleton in the proposed pathway should be tested directly. The authors mention different hypotheses around actin but have not tested them in the manuscript. For example, actin-dependent sequestration of Yap at the apical surface is intriguing. Does actin polymerization induced by drugs release Yap from the apical surface?

      RESPONSE: We would like to thank the reviewer for their suggestion. As per the reviewer's suggestion, we have moved this section to the Discussion, stating that “In the future, we plan to address this question by examining how Yap is sequestered by apical actin.”

      However, we appreciate the reviewer’s enthusiasm and would like to share some experiments we are planning in order to test the hypothesis.

      We plan to examine whether actin polymerization or contractility is responsible for Yap sequestration at, or release from, the apical surface with the following experiments: 1) testing whether Yap is displaced by Jasplakinolide treatment, which stabilizes filamentous actin; 2) using a ROCK inhibitor to decrease contractility in the absence or presence of Yoda1; 3) using genetic constructs such as Shroom3 to increase ROCK-mediated contractility and observe changes in Yap localization and dynamics.

      Although these experiments are interesting, they do not alter the conclusion of the current manuscript, and they represent future directions for our research.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript investigates how mechanical tension is transduced into centriole amplification in Xenopus multiciliated cells (MCCs). Building on prior work showing that centriole number scales with MCC apical area and that this scaling depends on PIEZO1, the study proposes that MCCs repurpose a canonical mechanochemical axis (PIEZO1 → Ca²⁺/PKC → ERK1/2 → YAP → Foxj1) to regulate centriole number rather than mitosis. The authors use tethered vs. untethered animal cap explants to modulate tissue tension, combine pharmacologic perturbations with genetic loss of function and rescue and with quantitative image analysis, and present a model in which tension-gated PIEZO1 activates ERK/YAP, influences Foxj1, and tunes centriole number in MCCs.

      The manuscript tackles an important and timely problem with clear disease relevance. Its major advance is the presented model, which posits that post-mitotic MCCs repurpose a canonical mechanotransduction module to regulate organelle number rather than proliferation. It is a conceptually strong study addressing an important problem with a clean mechanical paradigm. However, to support the central claim that centriole number control is a specific, direct consequence of the PIEZO1-Ca²⁺-ERK/YAP pathway within MCCs, the revision should establish specificity and causality and provide experimental data for some of the major conclusions as detailed below. Addressing these points is critical to support the mechanistic conclusions and impact.

      Major concerns:

      1) The presented data do not yet establish a specific, direct pathway linking mechanotransduction to centriole number, because the molecular players tested (PIEZO1, Ca²⁺, PKC, ERK, YAP, Foxj1) are highly pleiotropic. As such, the observed centriole number phenotypes, and some of the major conclusions, could be indirect. It is therefore critical to test the specificity and causality of the proposed pathway. This could be done with the authors' own strategies and/or with the following potential approaches:

      • Genetic dependency and sufficiency tests: It could be shown that Yoda1 has no effect in PIEZO1 loss-of-function MCCs, and that wild-type PIEZO1, but not conductance-dead PIEZO1 pore mutants restores Yoda1 responsiveness across centriole number, pERK, and YAP readouts. For example, PIEZO1 C terminus was shown to govern Ca²⁺ influx and ERK1/2 activation. Comparing full length PIEZO1 with a C terminal deletion in MCC restricted rescue; loss of rescue of centriole amplification and ERK/YAP activation with the C terminal deletion can provide a genetics anchored specificity test beyond broad inhibitors.

      • Downstream bypass/rescue experiments: In PIEZO1 loss-of-function or BAPTA conditions, can enforced MEK/ERK activation or YAP expression rescue the centriole number defect? Conversely, can MEK inhibitors block Yoda1-induced effects?

      2) The hypothesis about the centriole pool of Piezo as the mechanosensor for centriole number regulation is very exciting and novel. Can localization-controlled variants be used to test whether a centriole-associated pool directly senses tension for number control (for example, centrosome-targeted PIEZO1 via a PACT tag)? Alternatively, broad cellular Ca sensors (GCaMP) or centrosome-proximal Ca sensors (e.g., PACT-GCaMP) can be used to detect local calcium microdomains during tethering or Yoda1 treatment.

      3) Because the proposed pathway is tension-sensing and the YAP pathway is tightly linked to the actin cytoskeleton, the role of the actin cytoskeleton in the proposed pathway should be tested directly. The authors mention different hypotheses around actin but have not tested them in the manuscript. For example, actin-dependent sequestration of Yap at the apical surface is intriguing. Does actin polymerization induced by drugs release Yap from the apical surface?

      4) Image quantification and analysis must be described in greater detail in the Methods section, as they are central to the major conclusions of the manuscript. For example, the authors should explain how nuclear, cytoplasmic, and centriole segmentation were performed, and how relative protein levels in the nucleus versus the cytoplasm (e.g., YAP, volume- or area-based) were quantified. Specifically, the thresholds and segmentation criteria applied to different cellular structures under various conditions, as well as the use of Imaris and other software, should be clearly detailed.

      5) PIEZO1 mRNA was shown to increase in a Foxj1-linked feedback loop. Does this increase translate into an increase in total protein levels?

      6) Is the proposed signaling cascade active in mammalian multiciliated cells (e.g., airway epithelium)? If possible, testing this with a readout from one of the major players of the pathway, such as ERK phosphorylation or YAP nuclear localization in mammalian MCCs, would reveal whether regulation of centriole number through this pathway is conserved and would strengthen the generality of the findings.

      7) Throughout the Results section, the authors repeatedly raise specific hypotheses about their data (e.g., Foxj1 regulation of number control, apical actin/YAP) without testing them. These hypotheses are very exciting, and testing them experimentally, if possible, would strengthen the associated conclusions.

      Minor concerns:

      1) Abstract: "how MCCs regulate centriole/cilia numbers remains a major knowledge gap" overstates the field; please soften to reflect recent advances (mechanics/apical area scaling; PIEZO1 implication).

      2) MCC vs non MCC identification (Fig. 1): Clarify how non MCCs were distinguished from MCCs (e.g. markers/criteria).

      3) GsMTx4 rationale: State that GsMTx4 is a spider venom peptide that inhibits cationic mechanosensitive channels (including PIEZO1) and justify its use alongside Yoda1.

      4) Timeline statement: "Centriole amplification to migration and apical docking takes ~4-5 h (personal observation)" is not appropriate; either cite time lapse literature or include your own time lapse data.

      5) Redundancy: The description of Yoda1 as a channel specific agonist is repeated; keep only once.

      6) "WT yap1 GFP construct previously used by Dr. Lance Davidson ..." should move construct description to Methods and keep only the citation in Results.

      7) "(Unpublished data; Dr. Mahjoub)" should be removed unless data are shown.

      8) Add the Kintner group reference linking motile cilia number and centriole number in Xenopus MCCs.

      9) Replace "as shown previously in our eLife paper" with "as we previously showed or shown previously (Kulkarni et al., 2021)".

      10) The two hypotheses for how Foxj1 could regulate number under tension (actin remodeling vs. transcriptional control of amplification genes) belong in the Discussion unless tested. Moreover, the part on Yap sequestration by apical actin and the two possibilities presented should also go to the Discussion.

      Significance

      This manuscript dissects how Piezo1-mediated mechanotransduction regulates centriole number in Xenopus multiciliated cells (MCCs) via Ca²⁺, ERK/YAP, and Foxj1. While Piezo1 and its downstream effectors have been implicated broadly in mechanosensation, cellular tension responses, and transcriptional regulation, their specific role in centriole number control in MCCs has been unknown. By integrating pharmacological manipulation, genetic perturbation, and functional readouts, the authors demonstrate that this pathway directly influences centriole number.

      The findings extend published knowledge in two main ways:

      (1) they connect a mechanosensitive ion channel to the transcriptional program governing Foxj1 expression and multiciliation, a mechanistic link not previously defined, and

      (2) they highlight the pleiotropic yet coordinated nature of Piezo1 signaling in organelle biogenesis. This work will be of broad interest to cell and developmental biologists studying ciliogenesis, epithelial differentiation, and mechanotransduction, as well as to biomedical researchers interested in multiciliated cells and ciliopathies. By situating a well-studied mechanosensor within the context of MCC biology, the study opens new directions for understanding how tissue-level forces shape organelle number control and function.

      At the same time, the impact of the study is weakened by concerns regarding the causality and specificity of the pathway, since the signaling components examined are highly pleiotropic and it remains challenging to separate direct effects on centriole number from broader cellular consequences. The causal relationships among Piezo1 activity, downstream signaling, and Foxj1 expression require stronger substantiation, and the extent to which this pathway operates in mammalian multiciliated cells remains an open question. Addressing these limitations would strengthen the robustness, generality, and translational relevance of the conclusions.