      Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this manuscript, the authors describe the discovery of a molecular regulator of the immune transcriptional program, which is activated by intestinal distension upon bacterial colonization of the C. elegans intestine. Taking advantage of the fact that inhibition of aex-5 is known to cause intestinal distension and a C-type lectin gene clec-60 as a marker for the immune response to intestinal distension (clec-60p::gfp), the authors performed a forward genetic screen for suppressors of the immune response activation. Of the two mutants isolated, they focused on the stronger suppressor, which corresponded to a cysteine-type DUB, the Ubiquitin Specific Peptidase-14 (usp-14). Through rescue experiments, phenocopy analyses, and quantitative RT-PCR, they validated usp-14 as the causal gene and initiated characterization of its role in immune response activation. To this end, the authors investigated the tissue of action, identifying the intestine as the tissue in which usp-14 mediates the regulation of the immune response. Through transcriptomic analyses, they found that the signalling pathway likely regulated by usp-14 in response to intestinal distension is the Wnt pathway, as they observed a reduction in the transcriptional level of some of the Wnt pathway components in usp-14(tm1481), in response to infection with S. aureus. Additionally, transcriptomic data indicate that usp-14 plays a role in immunity regulation even in the absence of infection. Based on these findings, the authors propose that usp-14 has a dual role in immune regulation: one in surveillance immunity, preventing overactivation of immune responses, and another as a mediator of pathogen-induced responses, such as those triggered by P. aeruginosa or S. aureus. The experiments are rigorous and the results robust; however, some points would benefit from further investigation or clarification.

      Response: We thank the reviewer for an excellent summary of our work and for the valuable feedback.

      Comment: The expression domain of usp-14 appears to be quite expanded based on single cell RNAseq data (e.g. PMID: 28818938) therefore it is likely that the transgenes used for expression analysis are lacking key regulatory information. Alternative methods like smFISH would be more appropriate to characterise the spatiotemporal pattern of usp-14 expression in more detail. Response: We thank the reviewer for this valuable suggestion. In the original version of the manuscript, we used a 714 bp region upstream of the usp-14 start codon to generate the transcriptional reporter. In the revised manuscript, we reconstructed the reporter using a longer 1924 bp upstream promoter region together with a portion of exon 1. Using this updated reporter, we observed substantially broader expression of usp-14, particularly during the early larval stages. These results are described on page 6, lines 148-153: “We next examined the spatiotemporal expression pattern of usp-14 in C. elegans. To this end, we generated transgenic worms expressing GFP under the control of the usp-14 promoter (usp-14p::gfp). During early larval development, usp-14 was broadly expressed across multiple tissues (Figure 3A). However, in L4 larvae and adult animals, expression became more restricted and was predominantly observed in the intestine and a subset of neuronal cells. Notably, both intestinal and neuronal expression persisted throughout development (Figure 3A).”

      Comment: The mutation mapped in usp-14(jsn19) is a missense mutation (E122K) that suppresses the immune response to a degree comparable to the usp-14(tm1481) deletion allele. However, the authors do not show the functional domains in Fig. 1E potentially affected by this missense mutation. Response: We have now updated Figure 1E to include the functional domains of USP-14 and mapped both the usp-14(jsn19) missense allele and the usp-14(tm1481) deletion allele onto the protein schematic.

      Comment: How USP-14 regulates Wnt and how Wnt signalling relates to activation of immune responses is not fully supported. Are the Wnt components mentioned in the study induced specifically in the intestine upon infection and does USP-14 act in the intestine in the context of this regulation? How do the authors interpret that both Wnt ligands and receptors are induced? Does Wnt signalling appear as a GO term in the transcriptomic analysis? The authors can include Wnt signalling components in the analysis of the transcriptomic results. Response: We thank the reviewer for these insightful comments. Previous studies have shown that the Wnt pathway components examined in our study are induced in the intestine upon infection and function within the intestine to regulate host defense against bacterial pathogens (PMID: 29768179; PMID: 36323254).

      We did not observe significant enrichment of Wnt signaling terms in the GO analysis of our transcriptomic dataset. We believe this is likely due to the stringent thresholds used for differential expression analysis (fold change > 2 and p At present, the precise mechanism by which USP-14 regulates Wnt pathway components remains unclear. One possibility is that USP-14 influences Wnt signaling indirectly through additional substrates or interacting proteins that regulate transcriptional outputs. We have now clarified this point in the Discussion (page 13, lines 344–349): “These observations raise the possibility that additional USP-14 substrates or interacting proteins modulate transcriptional outputs downstream of intestinal distension. Future studies aimed at identifying the direct substrates of USP-14 and defining how USP-14 interfaces with neuronal ACC-4 signaling and other distension-responsive pathways will provide important mechanistic insight into how intestinal distension is coupled to innate immune activation.”

      Regarding the simultaneous induction of Wnt ligands and receptors, we interpret this as a potential amplification or reinforcement mechanism that enhances Wnt/β-catenin signaling during infection-induced intestinal distension. However, further studies will be required to determine the mechanistic significance of this coordinated transcriptional regulation.

      Comment: In most of the figures, the micrographs are quite dark and exhibit poor contrast between signal and background, particularly in Fig. 1, panels B and J, and Fig. 2, panels B and F (upper rows). Even though these panels are intended to show absence of response, the outlines of the worms are difficult to discern. Response: We thank the reviewer for the feedback. We have now improved the image presentation throughout the manuscript by either increasing the intensity or adding dotted outlines to more clearly indicate worm positions.

      Comment: In Figure S3, panels A and B, the pmk-1(km25); usp-14(tm1481) animals subjected to aex-5 RNAi show some level of fluorescence/response induction comparable to pmk-1(km25) alone. This observation is not discussed in the text. Response: We have now discussed this observation in the text. These results are described on page 9, lines 244-248: “Although pmk-1(km25);usp-14(tm1481) worms displayed relatively higher GFP levels than usp-14(tm1481) single mutants upon aex-5 RNAi treatment, this effect likely reflects the elevated basal GFP expression observed in pmk-1(km25) mutants (Figure S4B). Importantly, pmk-1(km25);usp-14(tm1481) animals still exhibited significantly lower GFP levels than pmk-1(km25) single mutants.”

      Reviewer #1 (Significance (Required)):

      Comment: The work is interesting because it expands some previous work in the field demonstrating immune response induction as a consequence of intestinal distension even in the absence of bacterial infection. This is known to be mediated by the neuronal acetylcholine receptor ACC-4, which signals to the intestine where it regulates immune genes via the Wnt pathway. However, how USP-14 relates to ACC-4 is currently unclear and whether USP-14 function is really required in the intestine to control Wnt signalling is not demonstrated. The authors should include a model to describe how their findings relate to the previous literature and how USP-14 may link mechanistically to Wnt signalling pathway activation. Response: We thank the reviewer for this insightful comment. We agree that the relationship between USP-14, ACC-4, and Wnt signaling requires further clarification. As suggested by the reviewer, we have now included a model summarizing the current understanding of intestinal distension-induced immune activation and integrating our findings with previous literature (Figure 6H).

      Comment: It also remains unclear whether usp-14 is the only deubiquitinase involved in intestinal distension-induced signalling via the Wnt pathway, or whether other paralogous usp genes might also contribute to regulation of immune-responsive transcription. Notably, several mammalian deubiquitinases have established roles in cancer suppression, inflammatory response, and innate immunity in other systems, so this would increase the potential significance of the work. Response: We thank the reviewer for this valuable suggestion. To systematically examine whether additional DUBs contribute to intestinal distension-induced immune activation, we performed an RNAi screen targeting all DUBs available in the Ahringer RNAi library using the aex-5(sa23);clec-60p::gfp reporter strain. Among the DUBs tested, knockdown of usp-14 produced the strongest suppression of clec-60p::gfp expression. Although knockdown of usp-5 also partially suppressed GFP induction, usp-5 RNAi did not affect survival during P. aeruginosa infection, suggesting that usp-5 is not required for host defense under these conditions. Together, these findings identify USP-14 as the major DUB required for intestinal distension-induced immune activation in our experimental system. These results are now included in Figure 1G, H, and Figure S2.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Summary: C. elegans are soil-dwelling nematodes that feed on bacteria and fungi and thus must be able to distinguish between innocuous and pathogenic species of microbes to survive. Though they lack adaptive immunity, these animals have an ancient version of an innate immune system that has no circulating sentinel or phagocytic cells yet can still mount a response to pathogen exposure. A consequence of the mode of infection of some ingested bacterial pathogens is intestinal distension which by itself, even in the absence of pathogens, is sufficient to trigger the expression of genes encoding immune effectors, including proteins that are bactericidal. The complete mechanistic scheme connecting intestinal distension to the expression of immunity genes has not been resolved, motivating the authors to perform a forward genetic screen for additional components of this pathway. One mutant that the authors isolated was usp-14, encoding an evolutionarily conserved deubiquitinating enzyme. Functional analysis revealed that usp-14 confers protection from microbial pathogens and that the intestine is its primary site of action for its role in host defense. The authors' data indicate that while USP-14 regulates the expression of innate immunity genes that are induced by intestinal distension, surprisingly it functions independently of several canonical innate immune signaling pathways, including the pmk-1/p38 MAPK pathway. Instead, USP-14 appears to act through Wnt signaling to regulate immune effectors by upregulating the expression of several components of that pathway, including the C. elegans ß-catenin ortholog bar-1. This places usp-14 within a gut-brain axis previously shown to control the C. elegans innate immune response through acetylcholine-mediated activation of Wnt signaling. The authors' findings provide new mechanistic insight to this pathway and add to the understanding of ubiquitination as an immune regulatory module.

      Response: We thank the reviewer for providing an excellent summary of our work.

      Major comments

      Comment: 1. There are three types of experiments in which the authors use the same set of controls across several different figure panels, as stated in the legend to Figure 2. First, when quantifying GFP levels of clec-60p::gfp in RNAi-treated animals, the authors use the same clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls for Fig. 1K, 2C, and 2G. For infection assays with S. aureus NCTC8325, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2E are the same as the ones used in Fig. 1M. Similarly, for infection assays with P. aeruginosa PA14, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2I are the same as those used for Fig. 1I. In each case, if the authors in fact collected all of the data for each strain that they studied at the same time but then chose to parse larger datasets into separate figure panels to make them clearer to the reader, then this approach is valid but the authors need to explicitly state that this is what they did. However, if the data pertaining to the control strains were collected at a different time or if they come from a separate biological replicate, then re-using data from the controls is not appropriate because it would not accurately reflect the specific conditions of the experiment to which the data are being compared. If this is indeed the scenario, then the authors will need to repeat these experiments and include the appropriate control in each iteration. Response: While preparing the manuscript, these experiments were performed simultaneously. Therefore, all panels that share controls have results from experiments performed simultaneously and represent the same biological replicate. We have added this additional information in the relevant figure legends.

      Comment: 2. From the legends describing figure panels that include data pertaining to clec-60p::gfp expression levels as assessed by fluorescence microscopy it seems that, in general, the authors measured GFP fluorescence in about 30 animals to produce quantitative data. How many biological replicates of these types of experiments were carried out? This is not explicitly stated in the section describing fluorescence imaging in the Methods section. Following the description of their methodology regarding statistical analysis of survival curves from microbial infection assays, however, the authors state that, "[a]ll experiments were performed independently at least three times unless otherwise noted." Does this statement apply to microscopy or only to experiments involving infection assays? If the data reporting quantitation of GFP signal is based on only 30 animals, then additional biological replicates are necessary, along with appropriate statistical analyses. Response: The quantified GFP fluorescence data are derived from three independent biological replicates. In each experiment, we typically imaged and quantified approximately 10 worms per condition, yielding a total of ~30 worms analyzed per genotype or treatment across all replicates (except Figure S1B, where we had two independent replicates). We have added the number of experiments in the figure legends for these data.

      Comment: 3. The authors have made all of the RNASeq data publicly available on the Sequence Read Archive, and they include data from several pairwise comparisons for differential gene expression analysis in their supplemental files. One of the most important facts to come out of the authors' Gene Ontology analyses of their RNASeq data is that the genes that are upregulated in a usp-14-dependent manner upon intestinal distension are enriched for those whose products play a role in innate immunity/host defense. The authors should say more about these genes. Are there any commonalities between them with regard to function? Are any of them targets of transcription factors that are known to function in C. elegans innate immunity? If so, this could provide clues as to what the substrates of USP-14 might be. Importantly, the specific identity of the genes assigned in the GO analyses to biological processes pertaining to innate immunity and host defense should be revealed in a supplemental file, and designated as being dependent on or independent of usp-14 for their expression during intestinal distension. Response: We thank the reviewer for this insightful suggestion. We have now expanded the Results section to describe the functional categories enriched among the USP-14-dependent intestinal distension-induced immune genes, including C-type lectins, ShK toxin domain-containing proteins, and lysozymes (page 7, lines 194-196).

      In addition, we compared our transcriptomic dataset with previously published transcription factor-regulated gene sets using WormExp analysis and identified a substantial overlap with genes regulated by the GATA transcription factor ELT-2. These new analyses are described on page 7, lines 197-210: “To identify transcription factors potentially involved in intestinal distension-induced immune activation, we performed transcription factor enrichment analysis using WormExp on genes upregulated in N2 worms following aex-5 RNAi treatment. This analysis revealed a substantial overlap between aex-5 RNAi-induced genes and genes regulated by the GATA transcription factor ELT-2 (Figure S3D). We next examined whether USP-14-dependent immune genes overlapped with ELT-2-dependent immunity genes induced by intestinal distension. To this end, we identified innate immune genes common to both ELT-2-regulated gene sets and aex-5 RNAi-induced genes. Strikingly, these ELT-2-dependent intestinal distension-induced immune genes showed substantial overlap with USP-14-dependent immune genes (Figure S3E and Table S5), suggesting that USP-14 may regulate distension-induced immunity, at least in part, through ELT-2-dependent transcriptional programs. Consistent with this possibility, RNAi-mediated knockdown of elt-2 did not further increase the susceptibility of usp-14(tm1481) worms to P. aeruginosa infection relative to wild-type worms (Figure S3F), supporting a model in which USP-14-mediated immune responses require ELT-2 activity.”

      Finally, we have created a new table (Table S5) that specifies the identity of the genes assigned in the GO analyses to biological processes pertaining to innate immunity and host defense, for USP-14-dependent and independent genes.

      Comment: 4. The authors' data suggest that in response to bacterial infection USP-14 upregulates the expression of bar-1, along with other components of the Wnt signaling pathway, which in turn upregulates innate immunity genes. This could be further substantiated by directly demonstrating that there are USP-14-regulated innate immunity genes whose induced expression in the presence of microbial pathogens also requires bar-1. Along those lines, an initial test would be to assess clec-60p::gfp expression in bar-1 animals versus bar-1;usp-14 double mutants, similar to the experiment whose results are reported in Fig. S4. If generating the bar-1;usp-14 double mutant is not feasible, then RNAi could be used to knock down bar-1 expression in clec-60p::gfp;usp-14(tm1481) animals. To expand this analysis, the expression of the six innate immunity genes shown to be regulated upon intestinal distension in a usp-14-dependent manner could be measured in the presence and absence of intestinal distension or microbial infection in bar-1 and bar-1;usp-14 animals by qRT-PCR. At a minimum, the authors should conduct a bioinformatics analysis to compare the USP-14-regulated innate immunity genes identified in their RNAseq studies to lists of known BAR-1 transcriptional targets to look for potential overlap. Response: We agree that extending these analyses to qRT-PCR experiments examining additional immune genes would be informative. However, both bar-1 mutants and bar-1 RNAi-treated worms exhibited severe developmental and physiological defects, including sick and dead animals during development, likely reflecting the pleiotropic developmental roles of BAR-1. Although fluorescence imaging and survival assays could be performed by selectively transferring surviving adults, we were concerned that bulk collection of worms for qRT-PCR analyses would introduce confounding effects arising from developmental defects and reduced viability.

      To further address the reviewer’s suggestion, we carried out a comparative analysis between USP-14-dependent intestinal distension-induced immune genes and previously identified BAR-1-dependent immune genes. Although transcriptome-wide datasets for BAR-1-dependent pathogen-induced immune genes are not currently available, an earlier study identified seven immune response genes regulated by BAR-1 during infection (PMID: 18981407). We found that six of these genes overlap with the USP-14-dependent intestinal distension-induced immune genes identified in our study. These analyses have now been added to the Results section and included in Table S5.

      Comment: 5. While in their Discussion section the authors mention evolutionarily conserved roles for protein ubiquitination as means of immunomodulation, there are few if any comments regarding ubiquitination as a regulatory scheme in C. elegans innate immunity or how their findings enhance our understanding of this phenomenon. Ubiquitination affects C. elegans immunity at multiple levels, from avoidance behavior to gene regulation, and it seems appropriate for the authors to address this in order to more fully contextualize their findings. Response: We thank the reviewer for the suggestion. We have now added a new paragraph to the Discussion that places our findings in the context of the existing literature on ubiquitination, deubiquitination, and innate immunity in C. elegans. The discussion is added on pages 11-12, lines 299-312: “Although ubiquitin-mediated signaling has emerged as a central regulator of innate immunity across metazoans (Jiang & Chen, 2011; Mello-Vieira & Dikic, 2026), the contribution of DUBs to host defense in C. elegans remains poorly understood. Previous studies in C. elegans have shown that ubiquitin-dependent processes regulate diverse aspects of immunity, including immune surveillance, xenophagy, and pathogen tolerance (Garcia-Sanchez et al, 2021). Perturbations in proteasome function have also been shown to activate surveillance immunity (Ghosh & Singh, 2026; Troemel et al, 2026), highlighting the importance of ubiquitin-associated pathways in sensing pathogen-induced cellular damage. However, most prior studies have focused on ubiquitin ligases, proteasome-associated pathways, or global ubiquitin signaling rather than on specific DUBs directly regulating antibacterial immune responses. To our knowledge, our study provides the first direct evidence that a specific DUB regulates antibacterial innate immunity in C. elegans. Thus, our findings establish USP-14 as a previously unrecognized regulator of host defense and identify deubiquitination as an important regulatory layer in intestinal distension-mediated immunity.”

      Minor comments

      Comment: 1. In the Results section, the authors state that "[k]nockdown of cec-10 led to only a marginal decrease in survival during P. aeruginosa infection" (lines 92 and 93) and that cec-10 "has minimal impact on C. elegans survival during infection" (lines 93 and 94). However, as reported in Supplemental Table 5 the magnitude of the calculated difference in mean survival time between animals treated with RNAi targeting cec-10 and untreated control animals (-20% to -24% and statistically significant in 3/3 replicates) closely approximates the difference in mean survival between usp-14 mutants and controls (-19% to -28% and statistically significant in 3/3 replicates), which the authors clearly find to be significant. If by this metric usp-14 is important for host defense, then so too is cec-10. In light of this, the authors should use different language to describe the impact of cec-10 knockdown on the susceptibility of C. elegans to microbial infection and the potential role of cec-10 in immunity.

      Response: We chose not to pursue cec-10 further primarily because it lacks a clear human homolog and because the mutant exhibited reduced expression of the co-injection marker, raising the possibility of broader transgene-related effects. We have modified the text on page 4, lines 93-97: “Knockdown of cec-10 resulted in a significant reduction in survival during P. aeruginosa infection (Figure S1C). However, we did not pursue cec-10 further for two reasons: (i) cec-10(jsn20) mutants exhibited a modest but significant reduction in the myo-2p::mCherry co-injection marker (Figure 1D), raising the possibility of broader transgene-related defects, and (ii) cec-10 lacks a clear human homolog.”

      Comment: 2. All of the micrographs in Fig. 1B appear very dark. The GFP expression in the control animals appears dim, making it difficult for the reader to compare the signal in those animals to the GFP expression levels in the mutants. I recommend adjusting the brightness level in an equivalent manner across all of the micrographs to account for this. Response: We have increased the brightness of all the images, as suggested by the reviewer.

      Comment: 3. Fig. 1E depicts a gene structure diagram for usp-14 with the position of the point mutation in the jsn19 allele isolated in the authors' forward genetic screen indicated by the amino acid substitution symbol drawn over the second exon. Instead of mixing gene- and protein-level information about the jsn19 allele, I recommend replacing the gene structure diagram with a domain structure diagram of the USP-14 protein that depicts the conserved C19 peptidase and ubiquitin-like domains. The relative position of the E122K substitution should still be noted. Response: We have now updated Figure 1E to include the functional domains of USP-14 and mapped both the usp-14(jsn19) missense allele and the usp-14(tm1481) deletion allele onto the protein schematic.

      Comment: 4. Since all of the information in Fig. 1F appears elsewhere in the text, I recommend eliminating this panel. Response: We have removed it.

      Comment: 5. Regarding the RNAseq analysis, the authors state that 1241 genes are upregulated upon aex-5 knockdown (line 162). The authors then ask which of these genes are regulated by usp-14 in the context of intestinal distension and find that 633 are upregulated in a usp-14-dependent manner when aex-5 is targeted by RNAi and that 595 are upregulated even in the absence of usp-14 (Fig. 3D). This accounts for 1228 genes in total, not 1241. Can the authors explain this discrepancy? Response: We thank the reviewer for carefully noting this discrepancy. The difference arises from the criteria used to classify genes into the categories shown in Figure 5D (previously Figure 3D). Specifically, genes uniquely upregulated in usp-14(tm1481) worms were defined as genes that were either exclusively induced in usp-14(tm1481) worms or expressed at levels more than 2-fold higher in usp-14(tm1481) worms compared to N2 worms. During this classification, 13 genes that were initially identified as upregulated in N2 worms following aex-5 RNAi were found to be expressed at levels more than 2-fold higher in usp-14(tm1481) worms than in N2 worms (Table S4). These genes were therefore reassigned to the “usp-14(tm1481)-specific” category in the Venn diagram. Consequently, the total number of genes represented in the Venn diagram becomes 1228 instead of 1241. To clarify this point, we have now added an explanation to the figure legend.
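The count reconciliation in this response can be checked with a few lines of arithmetic. The sketch below uses only the four numbers quoted in the comment and response (1241, 633, 595, and 13); the variable names are illustrative and do not come from the manuscript or its supplemental tables.

```python
# Numbers quoted in the comment and response; variable names are hypothetical.
n2_upregulated = 1241        # genes upregulated in N2 upon aex-5 RNAi
usp14_dependent = 633        # upregulated in a usp-14-dependent manner
usp14_independent = 595      # upregulated even in the absence of usp-14
reassigned_to_mutant = 13    # moved to the usp-14(tm1481)-specific category

# Total shown in the N2 side of the Venn diagram.
venn_total = usp14_dependent + usp14_independent

# Total expected once the 13 reassigned genes are subtracted.
expected_total = n2_upregulated - reassigned_to_mutant

print(venn_total, expected_total)
```

Both values come out to 1228, which matches the explanation that the 13 reassigned genes account for the full difference.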

      Comment: 6. For the sake of clarity, in the legend to Fig. 3D I recommend expanding the description of the categories of genes depicted in the Venn diagram by using the same language as in the first worksheet of Supplemental Table 4. Response: We thank the reviewer for the suggestion. We have now added these details to the legend of Figure 5D (previously Figure 3D). The legend reads: “(D) Venn diagram showing the overlap between genes upregulated upon aex-5 RNAi in N2 and usp-14(tm1481) worms. The GO analyses for the biological processes of unique and common genes are shown. USP-14-dependent genes were defined as genes that were either exclusively upregulated in N2 worms or expressed at levels greater than 2-fold higher in N2 worms than in usp-14(tm1481) worms. USP-14-independent genes were defined as genes upregulated in both N2 and usp-14(tm1481) worms with expression differences of less than 2-fold between the two strains. Genes uniquely upregulated in usp-14(tm1481) worms were defined as genes that were either exclusively induced in usp-14(tm1481) worms or expressed at levels greater than 2-fold higher in usp-14(tm1481) worms than in N2 worms. Thirteen genes classified as upregulated in N2 worms were more than 2-fold higher in usp-14(tm1481) worms than in N2 worms (Table S4) and were therefore included in the usp-14(tm1481)-specific category.”
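The three categories in this legend amount to a small decision rule, which can be sketched as follows. This is an illustration only, not the authors' analysis code: the function name, its arguments, and the use of positive linear-scale expression values are assumptions, and the legend does not say how a difference of exactly 2-fold is handled (here it falls into the independent category).

```python
def classify_gene(up_in_n2: bool, up_in_mutant: bool,
                  level_n2: float, level_mutant: float) -> str:
    """Assign one gene to a Venn category using the 2-fold rules in the legend.

    level_n2 and level_mutant are assumed to be positive expression values
    on a linear (not log) scale after aex-5 RNAi; this is an assumption.
    """
    if not up_in_n2 and not up_in_mutant:
        return "not upregulated"
    if up_in_n2 and not up_in_mutant:
        return "USP-14-dependent"            # exclusively upregulated in N2
    if up_in_mutant and not up_in_n2:
        return "usp-14(tm1481)-specific"     # exclusively induced in the mutant
    # Upregulated in both strains: compare expression between strains.
    ratio = level_n2 / level_mutant
    if ratio > 2:
        return "USP-14-dependent"            # more than 2-fold higher in N2
    if ratio < 0.5:
        return "usp-14(tm1481)-specific"     # more than 2-fold higher in mutant
    return "USP-14-independent"              # within 2-fold in both strains
```

Under this rule, a gene upregulated in both strains but more than 2-fold higher in usp-14(tm1481) lands in the mutant-specific category, which is the situation described for the thirteen reassigned genes.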

      Comment: 7. In Fig. 4B, the authors' annotation indicates that there is a statistically significant difference (**, p

      Comment: 8. In Fig. S5, the shade of blue used to represent the data from the nhr-49(nr2041);usp-14(tm1481);clec-60p::gfp animals in panel E is different from that used to represent data from the same animals in panel B. This breaks the pattern of all of the other panels of this figure in which the data pertaining to a given phenotype are depicted in the same color. Also, in the symbol key in panel E there is an extra semi-colon before clec-60p::gfp that should be eliminated in the second genotype notation. Response: We thank the reviewer for carefully examining the figure and for bringing these issues to our attention. We have made the changes.

      Comment: 9. The authors' data show that USP-14 regulates bar-1 expression, and in the Discussion section they mention that in mammals beta-catenin is a substrate of USP14. Can the authors comment on the possibility of/evidence for BAR-1 autoregulation in C. elegans and the prospect of it being facilitated by USP-14? This could be a minor point to add to the Discussion. Response: In both contexts, USP-14 appears to stabilize BAR-1 by regulating it at either the transcriptional or post-translational level. However, it is currently unknown whether BAR-1 regulates USP-14 expression and thereby participates in an autoregulatory mechanism. Nevertheless, we have added to the Discussion that USP14 may regulate the Wnt pathway through both transcriptional and post-translational mechanisms, depending on the biological context.

      Reviewer #2 (Significance (Required)): The study described in this manuscript ties in to the findings from two prior genetic screens carried out in C. elegans that aimed to identify immune regulators (Ren et al., Cell Reports, 2022 and Labed et al., Immunity, 2018). Though their strategies differed, both of these previous studies uncovered a role for acetylcholine receptors in modulating the response to ingested microbial pathogens, especially when infection is associated with intestinal distension, indicating that a neuron-to-gut axis controls innate immunity in C. elegans. Labed and colleagues were the first to show that activation of this pathway results in the upregulation of genes encoding Wnt signaling pathway components, including the worm ortholog of beta-catenin called bar-1, which are necessary for the expression of immune effectors in the intestine. The Labed study also revealed that protein ubiquitination could contribute to regulating host defense gene induction because knockdown of lin-23, the substrate-binding subunit of a ubiquitin ligase complex that mediates BAR-1 degradation, results in constitutive expression of clec-60p::gfp, the same transcription reporter used by Ghosh and Singh as a readout for the expression of innate immunity genes. In their screen that revisits the Ren et al. approach, Ghosh and Singh find that another protein implicated in regulating protein stability via ubiquitination status, USP-14, also controls the expression of innate immunity genes in response to intestinal distension. Interestingly, their data indicate that it does so by upregulating bar-1. This discovery therefore adds an element of mechanistic detail regarding the regulation of Wnt signaling in immunity. While the Labed data suggest that ubiquitination may regulate BAR-1 at the post-translational level, Ghosh and Singh's results indicate a second layer of regulation of bar-1 at the transcriptional level that also appears to involve ubiquitination. In this case, USP-14 is predicted to modulate the ubiquitination status of a yet-to-be-identified substrate that directly or indirectly governs bar-1 expression. The authors' findings thus bring the field closer to having a complete picture of the Ach-Wnt pathway in C. elegans. As they point out in the Discussion section of their manuscript, ubiquitination is an evolutionarily conserved yet complex means of tuning the immune system. The work described here helps to shed light on this important immune regulatory mode and could have implications for aspects of epithelial immunity that are in common to both invertebrates and vertebrates.

      Response: We thank the reviewer for providing such a thoughtful overview of the field and for placing our findings in the context of previous studies on intestinal distension-induced immunity in C. elegans. We also sincerely appreciate the reviewer’s constructive feedback and insightful comments, which have helped us improve the quality and clarity of the manuscript.

      My research interest and specific area of expertise pertain to evolutionarily conserved genetic pathways that control healthspan by affecting cellular resilience later in life. Using C. elegans as a surrogate for aging humans, my group studies age-dependent changes in the activity of regulatory modules that protect older animals from the molecular damage associated with intrinsic and extrinsic sources of cellular stress, with a particular emphasis on microbial infection and oxidative stress.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      C. elegans are soil-dwelling nematodes that feed on bacteria and fungi and thus must be able to distinguish between innocuous and pathogenic species of microbes to survive. Though they lack adaptive immunity, these animals have an ancient version of an innate immune system that has no circulating sentinel or phagocytic cells yet can still mount a response to pathogen exposure. A consequence of the mode of infection of some ingested bacterial pathogens is intestinal distension, which by itself, even in the absence of pathogens, is sufficient to trigger the expression of genes encoding immune effectors, including proteins that are bactericidal. The complete mechanistic scheme connecting intestinal distension to the expression of immunity genes has not been resolved, motivating the authors to perform a forward genetic screen for additional components of this pathway. One mutant that the authors isolated was usp-14, encoding an evolutionarily conserved deubiquitinating enzyme. Functional analysis revealed that usp-14 confers protection from microbial pathogens and that the intestine is the primary site of action for its role in host defense. The authors' data indicate that while USP-14 regulates the expression of innate immunity genes that are induced by intestinal distension, surprisingly it functions independently of several canonical innate immune signaling pathways, including the pmk-1/p38 MAPK pathway. Instead, USP-14 appears to act through Wnt signaling to regulate immune effectors by upregulating the expression of several components of that pathway, including the C. elegans β-catenin ortholog bar-1. This places usp-14 within a gut-brain axis previously shown to control the C. elegans innate immune response through acetylcholine-mediated activation of Wnt signaling. The authors' findings provide new mechanistic insight into this pathway and add to the understanding of ubiquitination as an immune regulatory module.

      Major comments

      1. There are three types of experiments in which the authors use the same set of controls across several different figure panels, as stated in the legend to Figure 2. First, when quantifying GFP levels of clec-60p::gfp in RNAi-treated animals, the authors use the same clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls for Fig. 1K, 2C, and 2G. Second, for infection assays with S. aureus NCTC8325, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2E are the same as the ones used in Fig. 1M. Third, for infection assays with P. aeruginosa PA14, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2I are the same as those used in Fig. 1I. In each case, if the authors in fact collected all of the data for each strain that they studied at the same time but then chose to parse larger datasets into separate figure panels for the sake of clarity, then this approach is valid, but the authors need to state explicitly that this is what they did. However, if the data pertaining to the control strains were collected at a different time or come from a separate biological replicate, then re-using data from the controls is not appropriate because it would not accurately reflect the specific conditions of the experiment to which the data are being compared. If this is indeed the scenario, then the authors will need to repeat these experiments and include the appropriate control in each iteration.
      2. From the legends describing figure panels that include data pertaining to clec-60p::gfp expression levels as assessed by fluorescence microscopy it seems that, in general, the authors measured GFP fluorescence in about 30 animals to produce quantitative data. How many biological replicates of these types of experiments were carried out? This is not explicitly stated in the section describing fluorescence imaging in the Methods section. Following the description of their methodology regarding statistical analysis of survival curves from microbial infection assays, however, the authors state that, "[a]ll experiments were performed independently at least three times unless otherwise noted." Does this statement apply to microscopy or only to experiments involving infection assays? If the data reporting quantitation of GFP signal is based on only 30 animals, then additional biological replicates are necessary, along with appropriate statistical analyses.
      3. The authors have made all of the RNASeq data publicly available on the Sequence Read Archive, and they include data from several pairwise comparisons for differential gene expression analysis in their supplemental files. One of the most important facts to come out of the authors' Gene Ontology analyses of their RNASeq data is that the genes that are upregulated in a usp-14-dependent manner upon intestinal distension are enriched for those whose products play a role in innate immunity/host defense. The authors should say more about these genes. Are there any commonalities between them with regard to function? Are any of them targets of transcription factors that are known to function in C. elegans innate immunity? If so, this could provide clues as to what the substrates of USP-14 might be. Importantly, the specific identity of the genes assigned in the GO analyses to biological processes pertaining to innate immunity and host defense should be revealed in a supplemental file, and designated as being dependent on or independent of usp-14 for their expression during intestinal distension.
      4. The authors' data suggest that in response to bacterial infection USP-14 upregulates the expression of bar-1, along with other components of the Wnt signaling pathway, which in turn upregulates innate immunity genes. This could be further substantiated by directly demonstrating that there are USP-14-regulated innate immunity genes whose induced expression in the presence of microbial pathogens also requires bar-1. Along those lines, an initial test would be to assess clec-60p::gfp expression in bar-1 animals versus bar-1;usp-14 double mutants, similar to the experiment whose results are reported in Fig. S4. If generating the bar-1;usp-14 double mutant is not feasible, then RNAi could be used to knockdown bar-1 expression in clec-60p::gfp;usp-14(tm1481) animals. To expand this analysis, the expression of the six innate immunity genes shown to be regulated upon intestinal distension in usp-14-dependent manner could be measured in the presence and absence of intestinal distension or microbial infection in bar-1 and bar-1;usp-14 animals by qRT-PCR. At a minimum, the authors should conduct a bioinformatics analysis to compare the USP-14-regulated innate immunity genes identified in their RNAseq studies to lists of known BAR-1 transcriptional targets to look for potential overlap.
      5. While in their Discussion section the authors mention evolutionarily conserved roles for protein ubiquitination as a means of immunomodulation, there are few if any comments regarding ubiquitination as a regulatory scheme in C. elegans innate immunity or how their findings enhance our understanding of this phenomenon. Ubiquitination affects C. elegans immunity at multiple levels, from avoidance behavior to gene regulation, and it seems appropriate for the authors to address this in order to more fully contextualize their findings.

      Minor comments

      1. In the Results section, the authors state that "[k]nockdown of cec-10 led to only a marginal decrease in survival during P. aeruginosa infection" (lines 92 and 93) and that cec-10 "has minimal impact on C. elegans survival during infection" (lines 93 and 94). However, as reported in Supplemental Table 5 the magnitude of the calculated difference in mean survival time between animals treated with RNAi targeting cec-10 and untreated control animals (-20% to -24% and statistically significant in 3/3 replicates) closely approximates the difference in mean survival between usp-14 mutants and controls (-19% to -28% and statistically significant in 3/3 replicates), which the authors clearly find to be significant. If by this metric usp-14 is important for host defense, then so too is cec-10. In light of this, the authors should use different language to describe the impact of cec-10 knockdown on the susceptibility of C. elegans to microbial infection and the potential role of cec-10 in immunity.
      2. All of the micrographs in Fig. 1B appear very dark. The GFP expression in the control animals appears dim, making it difficult for the reader to compare the signal in those animals to the GFP expression levels in the mutants. I recommend adjusting the brightness level in an equivalent manner across all of the micrographs to account for this.
      3. Fig. 1E depicts a gene structure diagram for usp-14 with the position of the point mutation in the jsn19 allele isolated in the authors' forward genetic screen indicated by the amino acid substitution symbol drawn over the second exon. Instead of mixing gene- and protein-level information about the jsn19 allele, I recommend replacing the gene structure diagram with a domain structure diagram of the USP-14 protein that depicts the conserved C19 peptidase and ubiquitin-like domains. The relative position of the E122K substitution should still be noted.
      4. Since all of the information in Fig. 1F appears elsewhere in the text, I recommend eliminating this panel.
      5. Regarding the RNAseq analysis, the authors state that 1241 genes are upregulated upon aex-5 knockdown (line 162). The authors then ask which of these genes are regulated by usp-14 in the context of intestinal distension and find that 633 are upregulated in a usp-14-dependent manner when aex-5 is targeted by RNAi and that 595 are upregulated even in the absence of usp-14 (Fig. 3D). This accounts for 1228 genes in total, not 1241, leaving 13 genes unassigned. Can the authors explain this discrepancy?
      6. For the sake of clarity, in the legend to Fig. 3D I recommend expanding the description of the categories of genes depicted in the Venn diagram by using the same language as in the first worksheet of Supplemental Table 4.
      7. In Fig. 4B, the authors' annotation indicates that there is a statistically significant difference (**, p<0.01) in the fluorescence signal from clec-60p::gfp in usp-14(jsn19);aex-5(sa23);clec-60p::gfp_EV versus usp-14(jsn19);aex-5(sa23);clec-60p::gfp_bar-1 animals. This is likely a typographical error that should be changed to "ns" to indicate no significant difference in the fluorescence signal between these two groups, which is consistent with what the data show and with the authors' description of these data in the text (lines 211-214).
      8. In Fig. S5, the shade of blue used to represent the data from the nhr-49(nr2041);usp-14(tm1481);clec-60p::gfp animals in panel E is different from that used to represent data from the same animals in panel B. This breaks the pattern of all of the other panels of this figure in which the data pertaining to a given phenotype are depicted in the same color. Also, in the symbol key in panel E there is an extra semi-colon before clec-60p::gfp that should be eliminated in the second genotype notation.
      9. The authors' data show that USP-14 regulates bar-1 expression, and in the Discussion section they mention that in mammals beta-catenin is a substrate of USP14. Can the authors comment on the possibility of/evidence for BAR-1 autoregulation in C. elegans and the prospect of it being facilitated by USP-14? This could be a minor point to add to the Discussion.

      Significance

      The study described in this manuscript ties in to the findings from two prior genetic screens carried out in C. elegans that aimed to identify immune regulators (Ren et al., Cell Reports, 2022 and Labed et al., Immunity, 2018). Though their strategies differed, both of these previous studies uncovered a role for acetylcholine receptors in modulating the response to ingested microbial pathogens, especially when infection is associated with intestinal distension, indicating that a neuron-to-gut axis controls innate immunity in C. elegans. Labed and colleagues were the first to show that activation of this pathway results in the upregulation of genes encoding Wnt signaling pathway components, including the worm ortholog of beta-catenin called bar-1, which are necessary for the expression of immune effectors in the intestine. The Labed study also revealed that protein ubiquitination could contribute to regulating host defense gene induction because knockdown of lin-23, the substrate-binding subunit of a ubiquitin ligase complex that mediates BAR-1 degradation, results in constitutive expression of clec-60p::gfp, the same transcription reporter used by Ghosh and Singh as a readout for the expression of innate immunity genes. In their screen that revisits the Ren et al. approach, Ghosh and Singh find that another protein implicated in regulating protein stability via ubiquitination status, USP-14, also controls the expression of innate immunity genes in response to intestinal distension. Interestingly, their data indicate that it does so by upregulating bar-1. This discovery therefore adds an element of mechanistic detail regarding the regulation of Wnt signaling in immunity. While the Labed data suggest that ubiquitination may regulate BAR-1 at the post-translational level, Ghosh and Singh's results indicate a second layer of regulation of bar-1 at the transcriptional level that also appears to involve ubiquitination. In this case, USP-14 is predicted to modulate the ubiquitination status of a yet-to-be-identified substrate that directly or indirectly governs bar-1 expression. The authors' findings thus bring the field closer to having a complete picture of the Ach-Wnt pathway in C. elegans. As they point out in the Discussion section of their manuscript, ubiquitination is an evolutionarily conserved yet complex means of tuning the immune system. The work described here helps to shed light on this important immune regulatory mode and could have implications for aspects of epithelial immunity that are in common to both invertebrates and vertebrates.

      My research interest and specific area of expertise pertain to evolutionarily conserved genetic pathways that control healthspan by affecting cellular resilience later in life. Using C. elegans as a surrogate for aging humans, my group studies age-dependent changes in the activity of regulatory modules that protect older animals from the molecular damage associated with intrinsic and extrinsic sources of cellular stress, with a particular emphasis on microbial infection and oxidative stress.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors describe the discovery of a molecular regulator of the immune transcriptional program, which is activated by intestinal distension upon bacterial colonization of the C. elegans intestine. Taking advantage of the fact that inhibition of aex-5 is known to cause intestinal distension and using the C-type lectin gene clec-60 as a marker for the immune response to intestinal distension (clec-60p::gfp), the authors performed a forward genetic screen for suppressors of the immune response activation. Of the two mutants isolated, they focused on the stronger suppressor, which corresponded to a cysteine-type DUB, the Ubiquitin Specific Peptidase-14 (usp-14). Through rescue experiments, phenocopy analyses, and quantitative RT-PCR, they validated usp-14 as the causal gene and initiated characterization of its role in immune response activation. To this end, the authors investigated the tissue of action, identifying the intestine as the tissue in which usp-14 mediates the regulation of the immune response. Through transcriptomic analyses, they found that the signalling pathway likely regulated by usp-14 in response to intestinal distension is the Wnt pathway, as they observed a reduction in the transcriptional level of some of the Wnt pathway components in usp-14(tm1481) in response to infection with S. aureus. Additionally, transcriptomic data indicate that usp-14 plays a role in immunity regulation even in the absence of infection. Based on these findings, the authors propose that usp-14 has a dual role in immune regulation: one in surveillance immunity, preventing overactivation of immune responses, and another as a mediator of pathogen-induced responses, such as those triggered by P. aeruginosa or S. aureus. The experiments are rigorous and the results robust; however, some points would benefit from further investigation or clarification.

      The expression domain of usp-14 appears to be quite expanded based on single-cell RNAseq data (e.g. PMID: 28818938); it is therefore likely that the transgenes used for expression analysis lack key regulatory information. Alternative methods such as smFISH would be more appropriate for characterising the spatiotemporal pattern of usp-14 expression in more detail.

      The mutation mapped in usp-14(jsn19) is a missense mutation (E122K) that suppresses the immune response to a degree comparable to the usp-14(tm1481) deletion allele. However, the authors do not show the functional domains in Fig. 1E potentially affected by this missense mutation.

      The claims about how USP-14 regulates Wnt and how Wnt signalling relates to activation of immune responses are not fully supported. Are the Wnt components mentioned in the study induced specifically in the intestine upon infection, and does USP-14 act in the intestine in the context of this regulation? How do the authors interpret the finding that both Wnt ligands and receptors are induced? Does Wnt signalling appear as a GO term in the transcriptomic analysis? The authors could include Wnt signalling components in the analysis of the transcriptomic results.

      In most of the figures, the micrographs are quite dark and show poor contrast between signal and background, particularly in Fig. 1, panels B and J, and Fig. 2, panels B and F (upper rows). Even though these panels are intended to show the absence of a response, the outlines of the worms are difficult to discern.

      In Figure S3, panels A and B, the pmk-1(km25); usp-14(tm1481) animals subjected to aex-5 RNAi show some level of fluorescence/response induction comparable to pmk-1(km25) alone. This observation is not discussed in the text.

      Significance

      The work is interesting because it expands some previous work in the field demonstrating immune response induction as a consequence of intestinal distension even in the absence of bacterial infection. This is known to be mediated by the neuronal acetylcholine receptor ACC-4, which signals to the intestine where it regulates immune genes via the Wnt pathway. However, how USP-14 relates to ACC-4 is currently unclear and whether USP-14 function is really required in the intestine to control Wnt signalling is not demonstrated. The authors should include a model to describe how their findings relate to the previous literature and how USP-14 may link mechanistically to Wnt signalling pathway activation.

      It also remains unclear whether usp-14 is the only deubiquitinase involved in intestinal distension-induced signalling via the Wnt pathway, or whether other paralogous usp genes might also contribute to regulation of immune-responsive transcription. Notably, several mammalian deubiquitinases have established roles in cancer suppression, inflammatory response, and innate immunity in other systems, so addressing this would increase the potential significance of the work.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Revision Plan

      1. General Statements

      We thank the reviewers for their positive and constructive assessment of the manuscript. We are encouraged that all three reviewers recognise the value of coelsch as an open-source framework for haplotyping and crossover detection from single-cell gamete sequencing data, and that they view the study as a useful contribution to the fields of recombination and genetic research. We are particularly grateful that Reviewer 1 described the manuscript as an "interesting and important study" and a "genuinely useful methodological framework that fills a real gap in the recombination biology toolkit", while Reviewer 2 highlighted its "strong innovation, complete technical pipeline, and significant biological implications" and considered it an "important technical breakthrough". We also appreciate Reviewer 3's assessment that the study provides "timely guidance for experimental design", that the results are "important for guiding plant single-cell research" in general, and that the work "has the potential to attract a broad readership".

      In our view, the main contribution of the manuscript is the development of a platform-agnostic method for recovering haplotypes and crossover events from single-cell sequencing data. This addresses an important practical gap: single-cell gamete sequencing has strong potential for high-throughput haplotyping and recombination mapping, but its broader use requires tools that can accommodate the very different coverage structures produced by different sequencing modalities and platforms. coelsch was designed to meet this need.

      The experimental datasets in the manuscript serve two purposes. First, they demonstrate that coelsch can be applied across multiple single-cell modalities and platforms, including scRNA, scATAC and scWGA sequencing from 10x Genomics, BD, and Takara platforms. Second, they illustrate the kinds of biological and practical questions that can be addressed with single-cell gamete sequencing, including crossover detection in meiotic mutants and large-scale analysis of natural variation in recombination.

      While all reviewers strongly supported the publication of the work, they also raised important points about specific aspects, including technical variation and reproducibility, the rationale for using 10x scRNA to generate the diversity panel dataset, and the effects of coverage on crossover localisation, amongst others. We agree that addressing these points will make the manuscript clearer and more useful to readers. Our planned revisions therefore aim to strengthen the experimental and computational support for the framework, clarify the interpretation of the modality comparisons, and provide additional guidance for researchers who may wish to apply coelsch or related single-cell sequencing approaches in future studies.

      2. Description of the planned revisions

      2.1. Additional technical replicates and clearer treatment of batch/sample-handling effects

      Reviewers 1, 2 and 3 all noted that the comparison of different platforms and modalities is based on limited replication, with different nuclei isolation and processing strategies used for different technologies. Reviewer 3 requested a fully controlled benchmark in which the same nuclei preparation is split across all tested platforms. We agree that this would be the ideal design for a dedicated head-to-head benchmarking study. However, the primary aim of the manuscript is to demonstrate the applicability of coelsch across different single-cell sequencing data types, rather than to provide a definitive benchmark of the intrinsic performance of each modality and platform.

      In addition, a fully matched and replicated cross-platform experiment for all technologies is not feasible. Isolated nuclei deteriorate rapidly after preparation and must be processed promptly for single-cell library construction; this makes it impractical to distribute the same preparation across multiple time- and labour-intensive workflows. However, this design is feasible for 10x scRNA-seq and 10x scATAC-seq. To address this point directly, we will therefore generate two matched technical replicates each of 10x scRNA-seq and 10x scATAC-seq from nuclei isolated in the same sorting run.

      We will also improve our library-level QC summary tables. We will report, where available, the number of nuclei used for loading, recovered barcodes, barcodes retained after QC, inferred high-quality nuclei and artefacts, informative fragments per nucleus, genomic bin coverage, and final nuclei used for crossover calling. This will make the effects of loading, capture efficiency, QC filtering, and modality-specific data loss more transparent.

      In the revised text, we will distinguish more clearly between modality-specific effects and possible batch/sample-preparation effects. Where the current manuscript implies that differences are intrinsic properties of sequencing platforms, we will soften the interpretation unless it is supported by the new replicate data, reproducibility analyses, or well-supported properties that have been reported previously in the literature.

      2.2. Rationale for using 10x scRNA-seq in the natural variation panel

      Reviewers 1 and 3 asked why the natural variation panel was analysed using 10x scRNA-seq, given that Takara scWGA produced higher per-cell crossover localisation accuracy in the modality comparison. We will revise the manuscript to explain this experimental decision more clearly.

      The natural variation panel was designed as a high-throughput experiment requiring sufficient numbers of usable nuclei from many pooled F₁ hybrids. In our hands, 10x scRNA-seq has generally produced the largest number of usable nuclei barcodes and the lowest proportion of artefacts. This makes 10x scRNA-seq well suited to experiments where many nuclei are required per genotype. By contrast, applying Takara scWGA to a pooled panel of this scale would be expected to recover only tens of usable nuclei per F₁ hybrid, which would be insufficient for robust recombination-rate or landscape estimation.

      We will add this explanation to the relevant Results section and clarify that the choice of 10x scRNA-seq reflects a trade-off between per-cell crossover resolution and the number of informative nuclei recovered per genotype. We will also add genotype-level summaries for the pooled natural variation experiment, including assigned nuclei per genotype and genotype-specific genomic coverage of informative fragments.

      2.3. Reproducibility of recombination landscapes across replicates and modalities

      Reviewer 1 requested recombination landscape plots for all tested modalities, and several comments raised the need to show within-modality reproducibility. We will add recombination landscape plots for wild-type Col-0 × Ler libraries across the tested modalities, including the newly generated replicate 10x scATAC and scRNA libraries.

      We will assess reproducibility by comparing unsmoothed, non-overlapping windowed recombination-rate estimates, both within and between modalities. These comparisons will be quantified using bootstrapped estimates of the Spearman rank correlation coefficient, and visualised using scatterplots and/or recombination landscapes.
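
      For illustration only, the following is a minimal sketch of how a bootstrapped Spearman correlation between two libraries' windowed recombination-rate estimates could be computed. It is not taken from coelsch: the function names, the choice to resample genomic windows (rather than nuclei), and the percentile interval are all assumptions made for this example.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_spearman(x, y, n_boot=1000, seed=0):
    """Resample windows with replacement (an assumed design choice).

    Returns (point estimate, 2.5th percentile, 97.5th percentile).
    """
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # skip NaN from degenerate resamples
            stats.append(rho)
    stats.sort()
    lo = stats[int(0.025 * (len(stats) - 1))]
    hi = stats[int(0.975 * (len(stats) - 1))]
    return spearman(x, y), lo, hi
```

      Here `x` and `y` would hold the per-window recombination rates of two libraries over the same set of non-overlapping windows, so that element `i` of each list refers to the same genomic window.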

      2.4. Sequencing depth, coverage, and crossover localisation resolution

      Reviewers 1 and 3 requested clearer quantitative reporting of crossover resolution and a stronger analysis of depth effects. We will revise the manuscript to report practical crossover localisation resolution for each modality, including median and interquartile localisation error or interval size in genomic units.

      We will expand the simulation analyses to compare false-positive and false-negative rates and localisation accuracy across modalities, including telomere-proximal error profiles for scWGA and scATAC as well as 10x scRNA data. We will perform downsampling analyses to assess how crossover detection accuracy changes as a function of informative-fragment depth. Where feasible, we will compare depth-matched subsets across modalities to distinguish effects of sequencing depth from modality-specific coverage structure.
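
      As a toy illustration of the downsampling idea, the sketch below thins the informative markers of a single simulated nucleus and records how the interval bounding its crossover widens. It is a deliberately simplified, noise-free model and not the coelsch procedure: the marker representation, the single-crossover assumption, and all function names are hypothetical.

```python
import random
import statistics


def crossover_interval(markers):
    """Return (left, right) positions flanking the first haplotype switch.

    `markers` is a list of (position, haplotype) tuples sorted by position.
    Returns None if no switch is observed.
    """
    for (p0, h0), (p1, h1) in zip(markers, markers[1:]):
        if h0 != h1:
            return p0, p1
    return None


def downsample(markers, fraction, rng):
    """Keep each marker independently with probability `fraction`."""
    return [m for m in markers if rng.random() < fraction]


def median_interval_width(markers, fraction, n_rep=200, seed=0):
    """Median crossover-interval width over repeated downsampling."""
    rng = random.Random(seed)
    widths = []
    for _ in range(n_rep):
        interval = crossover_interval(downsample(markers, fraction, rng))
        if interval is not None:  # crossover still detectable
            widths.append(interval[1] - interval[0])
    return statistics.median(widths) if widths else None


# Simulated nucleus: one marker every 10 units, crossover between 490 and 500.
markers = [(pos, 0 if pos < 500 else 1) for pos in range(0, 1000, 10)]
for fraction in (1.0, 0.5, 0.1):
    print(fraction, median_interval_width(markers, fraction))
```

      In this toy model the interval can only widen as markers are removed, which is the depth effect on localisation that the planned analyses are intended to quantify on real and simulated data.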

      These analyses will be used to clarify the extent to which each modality is suitable for different applications, such as broad landscape estimation, crossover counting, or fine localisation.

      2.5. Artefact detection, high doublet rates, and representativeness after filtering

      All three reviewers raised concerns about the high proportion of barcodes excluded by the filtering procedure, particularly in the Takara scWGA dataset. In hindsight, we believe part of this concern stems from the poor choice of terminology ("doublets") we used to describe these excluded barcodes.

      While true doublets (i.e. two nuclei entering a single droplet or nanowell) are one likely source of such signals, the filtering procedure more broadly identifies artefactual barcodes that do not exhibit a clear single-gamete haplotype structure. These barcodes may arise from a variety of sources, including doublets, multiplets, high levels of ambient DNA or RNA, or empty droplets containing only ambient material. Although visual examination can be used to make predictions about the source of these artefacts, our detection method does not attempt to distinguish between them, and artefacts in different modalities may stem from different sources in varying proportions. We will therefore revise the terminology throughout the manuscript to clarify that these represent a broader class of low-confidence or noise barcodes, rather than confirmed doublets.

      For the Takara scWGA data, we will revise the manuscript to discuss the discrepancy between the CellSelect well classifications (which use proprietary software to label doublets) and the final artefact predictions from coelsch. We can only speculate as to why CellSelect failed to detect many apparent doublet and multiplet artefacts in this experiment, but we agree with the reviewer that the most likely explanation is the small size of Arabidopsis pollen nuclei relative to the expectations of the imaging and classification procedure. To support this interpretation, we will add supplementary analysis comparing the CellSelect images from individual nanowells with the final doublet predictions inferred from scWGA data. This will allow readers to see examples of wells classified as acceptable by CellSelect but subsequently inferred to contain artefacts based on their haplotype structure.

      We will also add sensitivity analyses showing how key results change under different artefact-filtering thresholds. These analyses will include crossover count distributions, recombination landscape estimates, and modality-level comparisons. We will examine the extreme upper tail of crossover counts observed in 10x scATAC-seq and assess whether these barcodes are artefacts that have escaped detection.
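
      A threshold sweep of this kind could be organised roughly as follows. The sketch assumes that each barcode carries a single artefact score where higher means more artefact-like, which may not match the actual coelsch filter; the function name `threshold_sensitivity` and the example values are hypothetical.

```python
def threshold_sensitivity(barcodes, thresholds):
    """Summarise retained barcodes and mean crossover count per threshold.

    `barcodes` is a list of (artefact_score, crossover_count) pairs; a barcode
    is retained when its score is at or below the threshold.
    """
    if not barcodes:
        raise ValueError("no barcodes supplied")
    rows = []
    for threshold in thresholds:
        kept = [n_co for score, n_co in barcodes if score <= threshold]
        rows.append({
            "threshold": threshold,
            "retained": len(kept),
            "fraction_retained": len(kept) / len(barcodes),
            # None signals that no barcode survived this threshold.
            "mean_crossovers": sum(kept) / len(kept) if kept else None,
        })
    return rows


# Hypothetical barcodes: the last one has an implausibly high crossover count.
example_barcodes = [(0.1, 4), (0.2, 5), (0.6, 9), (0.9, 21)]
table = threshold_sensitivity(example_barcodes, thresholds=[0.25, 0.75, 1.0])
for row in table:
    print(row)
```

      In this toy example the mean crossover count rises as the threshold is relaxed, which is the pattern expected if barcodes in the upper tail are artefacts that escaped a lenient filter.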

      Finally, we will assess whether retained singlets are representative of the input data with respect to informative-fragment counts, coverage, and inferred crossover patterns. This will address the concern that filtering could preferentially remove nuclei with particular recombination profiles.

      2.6. Biases arising from pollen nuclear biology

      Reviewer 2 raised an issue concerning the biases arising from the two different nuclei types present in mature trinuclear Arabidopsis pollen, and reviewer 3 endorsed this point. While we do not agree with the reviewer that scRNA and scATAC cannot capture sperm nuclei due to their condensed nature (see Parker et al. 2025 PLoS Biology for evidence against this claim), it is true that technical variation in nuclei isolation and sorting may affect the relative representation of nuclei types - usually, however, resulting in the underrepresentation of vegetative nuclei (Parker et al. 2025). We will add text addressing this point to the manuscript.

      It is also true that differences in expressed genes between vegetative and sperm nuclei, which have very different transcriptomic profiles, will affect the distribution of informative reads for crossover analysis in scRNA data, and may therefore also affect the recovered recombination landscapes (even though the underlying landscapes are biologically identical). We will address this in the manuscript by adding recombination landscape plots and reproducibility scatterplots (as described in point 2.3) comparing sperm and vegetative nuclei from scRNA-seq.

      2.7. Robustness of the pipeline and parameter choices

      Reviewer 3 raised the concern that quantitative conclusions depend on a single pipeline with fixed parameter choices. We will address this by adding a parameter-sensitivity analysis for the main computational steps. Specifically, we will test the robustness of crossover calling on simulated data to changes in bin size and rHMM parameters, showing how these affect sensitivity to noise and agreement of predictions with ground truth data.

      2.8. Natural variation analysis: genotype-specific coverage and terminal crossover enrichment

      Reviewers 1, 2 and 3 raised concerns about whether natural variation in crossover rate and terminality could be influenced by genotype-specific coverage, marker density, pooling imbalance, or dropout. We will add a more detailed description of how pollen from different F₁ hybrids was pooled and how genotype assignment was performed. We will report genotype-level recovery statistics, including the six hybrids excluded from downstream analysis, and discuss how imbalances may arise, e.g. through biological variation in pollen count and fertility, biases in nuclei isolation or sequencing, and biases in genotyping and informative fragments.

      Reviewer 1 specifically asked whether the lower terminal crossover index observed in Cvi-0 crosses compared with Col-0 crosses could reflect systematic differences in informative-fragment distributions rather than true biological differences in crossover localisation. We will address this by using the genotype-specific informative-fragment distributions observed in the diversity-panel scRNA-seq dataset to simulate crossover datasets with known ground truth. This will allow us to test whether differences in marker variant or expressed-gene distributions causing variation in informative-fragment distribution could systematically bias terminal crossover detection in Cvi-0 crosses relative to Col-0 crosses.
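
      The logic of such a test can be illustrated with a deliberately simplified model. In the sketch below, a crossover counts as detectable only if a minimum number of informative fragments lies on each side of it; this rule is an assumption standing in for the rHMM, and the function names, coordinates, and genotype labels are hypothetical.

```python
import bisect


def is_detectable(crossover_pos, sorted_positions, min_flanking=2):
    """Toy rule: require `min_flanking` informative fragments on each side."""
    left = bisect.bisect_left(sorted_positions, crossover_pos)
    right = len(sorted_positions) - left
    return left >= min_flanking and right >= min_flanking


def terminal_detection_rate(true_crossovers, informative_positions,
                            chrom_length, terminal_window, min_flanking=2):
    """Fraction of known terminal crossovers that the toy rule can detect."""
    positions = sorted(informative_positions)
    terminal = [x for x in true_crossovers
                if x < terminal_window or x > chrom_length - terminal_window]
    if not terminal:
        return None  # no terminal crossovers in the ground truth
    detected = sum(is_detectable(x, positions, min_flanking) for x in terminal)
    return detected / len(terminal)


# Identical ground-truth crossovers, two hypothetical fragment distributions.
truth = [50, 500, 960]
fragments_a = [10, 20, 300, 700, 970, 990]  # fragments reach the chromosome ends
fragments_b = [80, 300, 700, 900, 920]      # no fragments near the chromosome ends
rate_a = terminal_detection_rate(truth, fragments_a, 1000, 100)
rate_b = terminal_detection_rate(truth, fragments_b, 1000, 100)
print(rate_a, rate_b)
```

      Because the ground-truth crossovers are the same for both distributions, any difference in the detection rate is attributable to fragment placement alone, which is the bias the planned simulations are intended to quantify.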

      If feasible within the revision timeframe, we will also perform an orthogonal validation experiment for a selected comparison showing a clear difference in crossover terminality, such as Col-0 × Sah-0 and Cvi-0 × Sah-0. This would use progeny sequencing of backcross populations to estimate recombination landscapes independently of scRNA-seq, providing a direct test of whether the inferred terminality difference is supported by conventional recombination mapping. If this experiment cannot be completed within the revision timeframe, we will clearly state this limitation and base the revised interpretation on the simulation analyses described above.

      2.9. Broader applicability and practical guidance for users

      Reviewer 1 requested more discussion of applicability beyond Arabidopsis and to outcrossing or polyploid species. We will expand the Discussion to address the requirements and limitations of applying coelsch in other systems.

      2.10. Minor figure, reference, and presentation revisions

      We will address the remaining minor comments, including adding missing axis labels and checking duplicated references.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      No revisions have yet been incorporated in the transferred manuscript.

      4. Description of analyses that authors prefer not to carry out

      4.1. Full new benchmark across all modalities from the same nuclei preparation.

      As acknowledged in section 2.1, we agree with Reviewer 3 that a fully controlled benchmark in which the same isolated nuclei preparation is split across all tested platforms would be the ideal experimental design for separating intrinsic modality- or platform-specific effects from sample-handling and batch effects. However, this is not feasible for all technologies within the scope of this revision, because isolated nuclei degrade quickly, the single-cell sequencing methods are time- and labour-intensive, and the relevant platforms are not all available to us in the same location.

      We will therefore not perform a complete new cross-platform benchmark across all modalities. Instead, we will address this issue in the parts of the experiment where a matched design is feasible: we will generate two additional matched technical replicates each for 10x scRNA-seq and 10x scATAC-seq from nuclei isolated in the same sorting run. We will also revise the manuscript to more clearly acknowledge the limitations imposed by the lack of a fully matched cross-platform design and to ensure that our conclusions are interpreted in that context.

      4.2. Profiling the natural variation panel with a second modality

      Reviewer 1 suggested profiling at least a subset of the diversity panel with an additional single-cell modality. We agree that this would be useful, but we do not currently plan to generate a second-modality dataset for the natural variation panel. We would like to point out that this dataset introduces 34 genetic maps in a single sequencing experiment, which is not easily repeated.

      The natural variation experiment was designed as a high-throughput survey across many F₁ hybrids, and repeating even a subset with scWGA or scATAC would require substantial additional sample preparation and sequencing. Instead, we will strengthen the justification for the use of 10x scRNA-seq by adding genotype-level coverage summaries and simulations to show which conclusions are well supported at the observed data density.

      4.3. Orthogonal progeny sequencing from the exact same F₁ plants

      Reviewer 3 suggested that progeny sequencing from the same F₁ plants used for single-cell assays would provide a direct ground truth. This experiment would require additional crosses, progeny generation, and matched single-cell and progeny sequencing, an effort that we do not consider justified by the insight it would deliver. While progeny sequencing can provide an independent validation dataset, we do not agree that it would constitute a substantially better ground truth than the simulations used here. Simulations provide a known ground truth for every individual barcode, whereas progeny sequencing cannot, because pollen grains are destroyed during single-cell sequencing and therefore cannot be used to generate offspring. In addition, progeny-derived recombination landscapes are not a perfect ground truth at the population level, since segregation distortion and post-meiotic selection can alter the observed distribution of recombination events relative to those present in the original pollen population.

      4.4. Formal benchmarking of coelsch as a structural-variant detection method

      Reviewer 2 asked whether large structural variants were identified in other accessions besides Zin-9, and what sensitivity and specificity can be expected from recombination coldspot-based structural-variant detection. We agree that this is an interesting question, given that the Zin-9 inversion was identified through its strong effect on recombination. However, we do not plan to develop or benchmark coelsch as a comprehensive structural-variant detection method as part of this revision.

      The Zin-9 event was identified by visual inspection of the recombination maps, where it appeared as an unusually large and conspicuous recombination coldspot. We did not develop a systematic structural-variant calling procedure, as we do not view recombination suppression alone as a sufficiently specific signal for structural-variant detection. Coldspots can arise for many reasons, including centromere proximity or local recombination modifiers. Therefore, although large rearrangements such as inversions or translocations may sometimes be detectable through their effects on recombination, coelsch should not be considered as a general-purpose structural-variant caller.

      In the revised manuscript, we will clarify this limitation and avoid implying that recombination coldspot analysis provides comprehensive structural-variant discovery. We will report that we did not observe other genotype-specific coldspots of comparable scale to the Zin-9 event among the other analysed accessions, although smaller coldspots such as one corresponding to the previously reported 2.2 Mb inversion on Chromosome 1 of N13 were identifiable. We will not provide formal estimates of sensitivity and specificity for structural-variant detection, as this would require independent benchmark datasets or dedicated simulations that are beyond the scope of the present study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this study, Parker et al. benchmark three single-cell sequencing modalities (scRNA-seq, scATAC-seq, and scWGA) in Arabidopsis gametes and deliver an open-source, end-to-end framework for data processing that enables high-throughput crossover mapping across hybrids. By systematically comparing these modalities, the work quantifies trade-offs in throughput, genomic coverage, and crossover detection sensitivity, offering timely guidance for experimental design in plant systems where single-cell genomics is still emerging and platform benchmarks are very limited. The pipelines are further supported by the discovery of a previously unrecognized ~10 Mb pericentric inversion in the Zin-9 accession. The experimental design is technically interesting, and the results are important for guiding plant single-cell research. The work has the potential to attract a broad readership. However, several aspects of the experimental design, validation strategy, and parameter robustness require further clarification and, where possible, additional analyses.

      Major comments

      1. The modality comparison is based on one scRNA-seq library and two libraries each for scATAC-seq and scWGA. While the limited replication is acknowledged in the Discussion, the authors also report unexpected and run-specific observations (e.g. unusually high doublet rates in the 10x scRNA-seq library; "unexpected" doublet behavior in scWGA), making it difficult to separate platform-intrinsic properties from sample preparation and run-to-run variation. Differences in nuclei isolation buffers, purification strategies (e.g. density gradients, FACS, centrifugation), and potentially loaded nuclei numbers between platforms (which have not been specified in detail) further confound modality-level conclusions. For example, total usable barcodes vary drastically between the samples (e.g. 15k/20k/33k for 10x scRNA-seq, only 3.8k for BD even though it has the same capture capacity as 10X). Do these differences reflect different capture efficiencies between the platforms, or variation in nuclei quality/quantity, or modality-specific limitations in QC thresholds? It would strengthen the study to provide, for each library, the number of nuclei prior to loading and before/after QC, and to add independent biological replicates under modality-appropriate, optimized handling, ideally including a design where the same nuclei pool is split across all three modalities.
      2. All quantitative inferences rely on one custom analysis pipeline with multiple interdependent steps and fixed parameter choices (e.g. bin size, HMM transition structure, smoothing settings, background subtraction, doublet filters). The lack of benchmarking against independent crossover callers, or of systematic parameter sweeps, leaves it unclear how robust key patterns are to alternative analytical choices. It would substantially increase confidence to assess sensitivity of the main conclusions to key parameters (for example varying bin size, rigid chain length/transition penalties, enabling/disabling background subtraction and doublet filtering), and/or compare coelsch to other HMM-based crossover callers such as sgcocaller/comapr on at least a subset of the data.
      3. Accuracy is evaluated by comparisons to prior backcross/progeny datasets generated in different conditions, and by simulations calibrated to those references. While this is informative, systematic biases shared between the new pipeline and the reference datasets could remain undetected. Internal, orthogonal validation (e.g. progeny sequencing performed on the same F₁ plants used for single-cell assays) would provide a more direct ground truth and avoid potential circularity in bias assessment.
      4. The benchmark does not evaluate the impact of sequencing depth across modalities, which could influence the variation in per-barcode fragment counts and genomic bin coverage between scRNA-seq, scATAC-seq, and scWGA. Down-sampling aligned reads or informative fragments to fixed per-barcode targets (e.g. 250, 500, 1000 informative fragments) within each modality would clarify how much of the observed performance gap is attributable to depth rather than modality-specific biology or library structure. Constructing depth-matched subsets between scWGA and scATAC/scRNA datasets would help to test whether the breadth vs. depth trade-offs persist when sequencing resources are equalized.
      5. In the pooled 34-hybrid single-nucleus RNA-seq dataset, it would be very informative to present detection sensitivity and resolution across genotypes (e.g. captured nuclei, distributions of informative fragments, covered bins, and expected localization error by genotype). Genotypes will differ in expression patterns, which will alter the number and distribution of informative fragments per nucleus, and thus ultimately influence inferred recombination rates and crossover terminality. Furthermore, the background subtraction filter relies on genotype-level background models. Given that all genotypes were pooled prior to nuclei isolation, can the authors show that estimated ambient/background profiles are comparable across genotypes?

      Minor comments

      1. The manuscript currently attributes more uneven coverage in scRNA-seq primarily to expression-biased sampling of heterozygous sites. Would the choice of using nuclei, rather than whole cells which would also allow the capture of cytosolic RNA, for the scRNA-seq be an additional reason for lower total number and genomic dispersion of informative fragments?
      2. The sentence "This allows informed experimental and analytical choices ..." could be accompanied with a compact infographic or table (for example as an extension of Fig. 1B) summarizing key trade-offs and recommended use-cases for each modality (throughput, per-cell resolution, coverage breadth, susceptibility to doublets/ambient RNA, recommended applications).
      3. Related to the point above, the choice to profile the F₁ hybrids using the 10x scRNA-seq modality is understandable from a throughput perspective, but the results presented in Fig. 1 and Table 1 suggest scWGA offers higher crossover accuracy, scATAC superior genomic breadth, compared to 10x scRNA-seq which in addition also showed a high doublet rate. Expanding the rationale for prioritizing scRNA-seq here (e.g. cost, compatibility with downstream expression analyses, or technical constraints for scWGA/scATAC at this scale) would clarify the experimental logic for the reader.

      Referee cross-commenting

      I strongly agree with the points raised by Reviewers #1 and #2. In particular, including additional replicates (ideally derived from the same pollen pool, processed identically and run across all modalities) would provide robustness to the benchmark. However, repeating these experiments, re-running the benchmark, and updating the interpretation would require substantial additional time, likely exceeding the suggested 1-3 month revision timeframe proposed by the other reviewers. Additional clarification of the analysis and representation of requested details (e.g. the recombination landscape plots (Reviewer #1), clarification of balanced pollen representation from each F₁ during pooling (Reviewers #2 and #3), and evaluation of how varying filtering strategies (e.g. doublet detection thresholds) affect the observed recombination patterns (Reviewers #2 and #3)) would also improve evaluation and transparency of the study. From a technical perspective major point 3 raised by Reviewer #2 (including information on the intrinsic biological characteristics of the material in the modality performance analysis) would provide substantially important context for users and improve interpretation of the benchmark.

      Significance

      Previous studies have successfully applied single-cell whole-genome amplification and linked-read sequencing to individual gametes to measure recombination rates and distributions, demonstrating the feasibility of this high-throughput alternative to progeny sequencing. This study extends that concept by delivering open-source pipelines for multiple single-cell modalities and by directly comparing the performance of scRNA-seq, scATAC-seq, and scWGA for mapping meiotic recombination in Arabidopsis gametes, offering both a practical resource and a performance evaluation for plant single-cell genomics.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript presents coelsch, a cross-platform computational framework for single-cell gamete recombination analysis. It systematically benchmarks the performance of four mainstream single-cell sequencing modalities in meiotic crossover detection, successfully applies the method to a natural variation panel of Arabidopsis thaliana, and identifies the largest natural inversion reported in this species to date. This work demonstrates strong innovation, a complete technical pipeline, and significant biological implications. I would like to recommend revision. My concerns are listed below for the authors' consideration and revision.

      Major concerns

      1. Biological Replicates and Batch Effect Control The number of biological replicates per sequencing modality is limited (2 libraries for 10x scATAC and Takara scWGA, 1 library each for 10x scRNA and BD scRNA), and experiments for different modalities were performed in separate batches. Have the authors evaluated the impact of inter-batch technical variation on recombination rate estimates? In particular, for platforms with drastically different doublet rates (e.g., 49.7% for 10x scRNA vs. 26.3% for BD scRNA), how did the authors distinguish or avoid inherent platform differences from batch effects?

      The natural variation analysis used a pooled library strategy for 40 F₁ hybrids without biological replicates. How did the authors ensure balanced pollen representation of each F₁ during pooling? For the 6 F₁ hybrids excluded due to insufficient data, was this due to initial pooling bias or sequencing capture preference? Could this introduce systematic bias into the natural variation analysis results?

      2. Consistency of Pollen Nuclei Isolation Methods Different nuclei isolation protocols were used for each sequencing modality: Percoll density gradient centrifugation for 10x scATAC, no Percoll purification for Takara scWGA, and flow cytometry sorting combined with 10x/BD scRNA. Have the authors assessed how these different isolation methods affect nuclei integrity, viability, and capture bias for pollen nuclei? For example, could flow cytometry sorting selectively exclude nuclei of specific sizes or densities, thereby compromising the representativeness of recombination rate estimates?

      3. Systematic impact of the inherent structure of pollen on different sequencing modalities Mature Arabidopsis thaliana pollen has a canonical trinucleate structure, consisting of one transcriptionally hyperactive vegetative nucleus and two sperm nuclei with highly condensed chromatin and almost complete transcriptional silencing. While all three nuclei share identical genome sequences, they exhibit fundamental differences in chromatin state and molecular features, which will have profoundly distinct effects on different sequencing modalities - an issue not addressed or controlled for in this study.

      Differential technical capture bias: scRNA-seq and scATAC-seq rely on mRNA and accessible chromatin signals, respectively, and thus theoretically can only capture valid data from vegetative nuclei; sperm nuclei will be filtered out during quality control due to insufficient signal. In contrast, scWGA is based on whole-genome DNA amplification, independent of transcriptional activity or chromatin state, and can capture both vegetative and sperm nuclei. Have the authors validated the actual nuclear type composition in datasets from each modality through experiments (e.g., nuclear size sorting, DAPI staining quantification, immunofluorescence labeling)? Could this systematic difference in nuclear type composition compromise the fairness of performance comparisons between modalities? The uneven coverage of scRNA/scATAC is primarily determined by gene expression levels and chromatin accessibility (e.g., high coverage at highly expressed genes, extremely low coverage at heterochromatic regions such as centromeres), whereas coverage bias in scWGA mainly stems from technical preferences of whole-genome amplification. When comparing the resolution and accuracy of recombination detection across modalities, did the authors separate the contributions of "intrinsic biological characteristics of nuclear types" from "technical characteristics of the sequencing technologies themselves"?

      4. Accuracy and Validation of Doublet Detection Method This study reports exceptionally high doublet rates (~49% for 10x scATAC, ~70% for Takara scWGA), and there is a significant discrepancy with the results from Takara's official CellSelect software (80% of wells labeled "Good" by CellSelect were classified as doublets by coelsch). Have the authors validated the false positive and false negative rates of coelsch's doublet detection method through independent experiments (e.g., mixing pollen of known genotypes, manual microscopic validation of selected wells)? Such a high doublet filtering rate leads to a drastic reduction in the number of effective cells (e.g., only 628 singlets remained from a total of 2081 barcodes in the two Takara scWGA libraries). Have the authors assessed the representativeness of the remaining cells after filtering? In particular, for low-coverage scRNA data, could filtering result in the loss of cells with specific recombination patterns?

      5. Depth and Breadth of Natural Variation Analysis This study finds significant differences in recombination rate and terminal crossover enrichment among different natural accessions, with Cvi-0 hybrids exhibiting higher overall recombination rates but lower terminal recombination rates. Have the authors further explored the genetic basis underlying these differences? Besides the 10 Mb inversion in Zin-9, did the authors identify similar large structural variations in other natural accessions? What is the sensitivity and specificity of the recombination coldspot-based method for detecting structural variation? For example, what is the minimum size of inversions or translocations that can be reliably detected?

      Minor concerns

      • The mutants used in this study (zyp1, figl1, recq4ab, etc.) were generated by crossing mutant lines in the Col-0 background with corresponding mutant lines in the Ler background, resulting in heterozygous F₁ backgrounds. For example, the zyp1 mutant used Col-0 background zyp1-1 and Ler background zyp1-6. Could this heterozygous mutant background affect the accurate measurement of meiotic processes and recombination rates? Have the authors considered validation using F₁ populations from homozygous mutant lines?
      • The Takara scWGA dataset for wild-type Col-0 × Ler contains only 224 high-quality nuclei, while mutant sample sizes range from tens to hundreds. Is this sample size sufficient for fine-scale analysis of recombination rate distributions, especially for the detection of low-frequency recombination events?
      • There are also a few minor issues regarding the references: some appear to be duplicates, such as references 11 and 31, which seem to be the same in both the published version and the bioRxiv preprint. Please double check.
      • Additionally, have the authors considered the cost implications of these single-cell-based technologies, as well as their previously published linked-read sequencing approach?

      Overall, this manuscript represents an important technical breakthrough in the field of meiotic recombination research, providing a unified computational framework for large-scale, cross-platform single-cell gamete recombination analysis. The above questions mainly focus on the rigor of experimental design (especially the omission of the unique biological issue of pollen trinucleate structure), the depth of computational method validation, and the expansion of biological findings, and do not affect the core conclusions of the manuscript. I suggest that the authors address these questions and provide clear responses in the revised manuscript. If these issues are properly resolved, this work will provide a powerful tool for investigating the genetic and molecular mechanisms of plant meiotic recombination.

      Referee cross-commenting

      I agree with Reviewers 1 and 3. Addressing most of the points we raised would bring this manuscript to publication standard.

      Significance

      This study develops a unified computational framework for meiotic crossover (CO) mapping using single‑cell sequencing of Arabidopsis pollen, benchmarks four single‑cell modalities, and identifies natural recombination variation and a large novel pericentric inversion. Overall, the work is technically sound, biologically meaningful, and fills a key gap in scalable gamete‑based recombination profiling.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Parker et al. present coelsch and coelsch_mapping_pipeline, two open-source tools for platform-agnostic haplotyping and crossover detection from single-cell sequencing data, benchmarked across four modalities: 10x scATAC, 10x scRNA, BD scRNA, and Takara scWGA. The study applies these tools to Arabidopsis thaliana F₁ pollen to recover known recombination frequencies, characterise the effects of coverage sparsity via simulation, and profile natural variation in crossover rate and distribution across 34 F₁ hybrids from 22 diverse accessions. As a by-product of the recombination maps, the authors identify a previously unrecognised ~10 Mb pericentric inversion in the accession Zin-9 - the largest natural inversion described to date in A. thaliana.

      This is an interesting and important study and is suitable in scope and rigour for publication in a Review Commons affiliate journal. By combining computational and experimental frameworks, the authors address a genuine methodological gap: while single-cell gamete sequencing is a powerful approach for recombination mapping, the consequences of choosing among available sequencing modalities have not been systematically evaluated. The tools are open-source, data are deposited, and the biological conclusions are well-grounded. Importantly, the limitations of the tools are also mentioned, which is appreciated. Therefore, this manuscript presents a genuinely useful methodological framework that fills a real gap in the recombination biology toolkit. The biological discovery (Zin-9 inversion) adds independent value. However, several analytical choices require better justification, some results sections are under-interpreted, and a number of presentation issues should be addressed before acceptance.

      Major comments:

      1. Mismatch between best-performing modality and diversity panel application

      The most critical concern is a logical inconsistency in the experimental design. The authors demonstrate convincingly that Takara scWGA achieves higher per-cell resolution and more accurate crossover detection than the droplet-based RNA methods. Yet the diversity panel - the study's key biological application - is analysed exclusively using 10x scRNA. No comparison with other modalities is provided for the panel, and no external recombination data for these accessions are included for validation. The authors should either: (i) include at least a subset of accessions profiled by an additional modality; or (ii) provide a more thorough quantitative justification for why 10x scRNA throughput outweighs the loss of resolution in this specific context, showing that cross-accession comparisons remain interpretable at scRNA coverage levels.

      2. Could variation in crossover terminality result from analysis artefacts?

      The authors demonstrate consistently higher rates of terminal crossovers in Col hybrids than in Cvi hybrids, 'implying genetic background modulation of crossover localisation'. However, their simulation analysis also demonstrates that telomere proximal crossovers are disproportionally missed in 10x RNA data. Therefore, could the Col vs. Cvi terminality differences result from a greater/lower occurrence of false negatives in different genotypes using this approach, rather than bona fide differences in CO number (caused by e.g. differences in telomere proximal marker density in Col vs. Cvi)? If so, this should be explicitly mentioned.

      3. Doublet rates in Takara scWGA are unexplained

      The Takara iCELL8 platform implements microscopy-based automated well selection to prevent doublets, yet coelsch identifies a ~70% doublet rate in these libraries. This is mentioned briefly but not adequately explained in the main text. The authors should provide a more thorough explanation for why the CellSelect imaging software fails to exclude pollen nuclei doublets (likely due to small nuclear size), and they should discuss what this implies for the utility of this platform for future experiments. This is important practical information for readers considering the Takara workflow.

      4. Recombination landscape figures are incomplete

      Figure 2C shows recombination landscapes only for mutant genotypes profiled by Takara scWGA. Equivalent per-chromosome landscape plots should be provided for all modalities tested on wild-type Col-0 × Ler material. This is essential to visually communicate the coverage-driven differences in landscape resolution that the authors describe, and to verify that 10x scATAC and scRNA recover similar gross distributions despite lower per-cell depth.

      5. Extreme crossover numbers in 10x scATAC are not discussed

      The violin plots in Figure 2A show that 10x scATAC produces a wider upper tail of estimated crossover numbers than other modalities, with some barcodes exceeding 20 crossovers per nucleus - values far above the biological expectation for Arabidopsis. This is not acknowledged or explained. Is this an artefact of the high doublet contamination in this dataset (even after filtering), or a property of the HMM applied to fragmented ATAC data? An explicit discussion or supplementary analysis is required.

      6. Resolution of crossover detection is underreported

      Figure 3C shows boxplots of crossover localisation error across modalities, but this analysis is not discussed quantitatively in the main text. Readers need to understand the practical resolution (in kb) achievable by each modality in terms of crossover interval size. This is particularly important because the paper claims applicability for genetic mapping experiments, where localisation precision directly determines utility.

      7. Telomeric false-negative rate in scWGA is not reported

      The simulation analysis of false negatives near telomeres (Figure 3B) is presented only for 10x RNA data. Given that the authors use Takara scWGA for mutant genotyping and claim higher sensitivity, it is critical to also show the telomeric false-negative profile for scWGA. The current text implies that scWGA should avoid this problem, but this is not demonstrated.

      8. Comparison between libraries from the same modality is absent

      Two independent 10x scATAC and two Takara scWGA libraries were generated, but no within-modality reproducibility analysis of crossover rates or landscapes is presented. Crossover rates and landscape correlations between technical replicates should be shown to establish that the observed modality-level differences are not driven by library-preparation variability.

      9. Applicability to non-Arabidopsis and heterozygous species

      The Discussion notes that the approach relies on isogenic founder crosses and high-quality parental assemblies but does not explore the practical barriers to applying coelsch in outcrossing or polyploid species. Given the broad framing of the title ('platform-agnostic'), the authors should discuss what adaptations would be needed for crop species or other organisms where chromosome-scale haplotype-resolved assemblies are not available.

      Minor comments:

      1. Figure 5B - Please add axis labels in Mb.
      2. Figure 2A - library replicates: The two 10x scATAC libraries are not differentiated in Figure 2A. Showing them separately (or indicating per-library medians) would improve transparency.
      3. Droplet vs. plate combination: The Discussion does not address whether complementary modalities could be combined (e.g., using droplet-based data for landscape estimation and scWGA for localisation refinement within the same experiment). A brief discussion of this possibility would strengthen the practical utility of the framework.

      Referee cross-commenting

      All points raised by reviewers 2 & 3 seem reasonable and would substantially improve the quality of the manuscript.

      Significance

      General assessment: The paper from Parker et al. provides the first systematic evaluation of single-cell sequencing modalities for recombination mapping in Arabidopsis and presents new bioinformatic tools for analysing recombination in single-cell data. The novel utility of the approach is demonstrated for assessing recombination rate across a wide variety of Arabidopsis hybrids. Different platforms provide different benefits/limitations and these are well presented. However, the manuscript would benefit from a more thorough presentation of all the different analyses that were performed.

      Advance: Most recombination mapping studies in Arabidopsis utilise progeny sequencing. Here, the authors present an alternative approach, using single-cell gamete sequencing which will more easily facilitate recombination mapping in large populations, which will be particularly useful for future studies investigating the influence of natural variation on recombination rate and location. The advance is mostly technical, but the study also generates novel biological observations about chromosome structural rearrangements in Arabidopsis.

      Audience: The study is likely to be of main interest to individuals studying recombination in plants (particularly using bioinformatic approaches and analysing the influence of natural variation). However, researchers with an interest in single-cell sequencing and broader genomics will also be an audience for this paper.

      Describe your expertise:

      I am a researcher in plant meiotic recombination and I am well placed to assess the general importance and impact of the study within the context of the field. However, I would not consider myself a specific expert in bioinformatics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      RESPONSE TO REVIEWERS

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This is an interesting and ambitious study by Tabilo-Agurto and co-workers. It combines deep learning structure prediction (AlphaFold2), targeted molecular dynamics simulations, and in vivo functional assays to probe structural, functional, and evolutionary aspects of the metamorphic protein RfaH. More broadly, the work addresses an important question: whether intermediate structural states may exist along evolutionary trajectories of metamorphic proteins. A particular strength of the study is the integration of computational and experimental approaches. The manuscript is generally well written and clearly organized.

      Major comments:

      A key aspect of the study is the classification of predicted structures into three classes based on the conformation of the C-terminal domain (CTD): the autoinhibited alpha-helical fold, a beta-barrel fold, and a mixed alpha/beta fold. These classes are further described as corresponding to metamorphic (alpha fold), mixed alpha/beta, and monomorphic (beta fold) proteins.

      While I can see how this organizational scheme is helpful in some respects, it may also overstate what can be concluded from the data. As the authors are well aware, AlphaFold2 tends to predict a single conformation even for genuine metamorphic proteins, and therefore does not, on its own, distinguish between monomorphic and fold-switching proteins. I note in particular that the functional data indicates that the "monomorphic" variants studied in the in vivo assays behave similarly to the RfaH E48A mutant. However, E48A is known to remain metamorphic, populating both alpha and beta folds with roughly equal probability. This suggests that the sequences in this class may retain some degree of fold-switching capability, even if the underlying regulatory mechanism differs from that of wild-type RfaH. In other words, the presented data does not fully support these sequences as monomorphic. I am not suggesting that the authors must revise their classification scheme. However, it may strengthen the manuscript if the authors explicitly acknowledge this alternative interpretation and moderate the corresponding claims.

      We appreciate the comment from the reviewer, which can be seen from two different perspectives.

      On the one hand, it might be reasonable to think that the ‘monomorphic’ RfaH orthologs have lower transcription elongation activity than E. coli RfaH. Other highly divergent orthologs of E. coli RfaH (from Salmonella enterica serovar Typhimurium, Klebsiella pneumoniae, Yersinia enterocolitica and Vibrio cholerae) show similar in vitro recruitment and pausing at the C45 nucleotide of the ops element, and restore the RfaH-dependent hemolytic activity of an E. coli strain that lacks chromosomal RfaH to levels similar to those of the wild-type strain (doi: 10.1128/jb.186.9.2829-2840.2004). However, V. cholerae RfaH (43% sequence identity to E. coli RfaH) exhibits diminished antitermination effects in in vitro transcription assays, more closely resembling the antitermination levels observed in the absence of RfaH (doi: 10.1128/jb.186.9.2829-2840.2004), despite this protein also being predicted in the alpha-folded state when using AF2 (doi: 10.1016/j.csbj.2022.10.024). A particular observation from the RfaH complementation work is that increasing the concentration of V. cholerae RfaH in in vitro transcription assays lessens the transcription elongation defects observed at concentrations similar to those used for E. coli RfaH. These transcription elongation defects can be extrapolated to potentially similar issues with transcription in vivo and, therefore, with luciferase translation in our in vivo translation assays for our ‘monomorphic’ proteins.

      On the other hand, it is possible that these so-called ‘monomorphic proteins’ still populate the alpha-folded state, but that their predominant fold in solution is the one corresponding to the active beta-fold. This can be biophysically tested using circular dichroism to distinguish their alpha or beta propensity, as proposed in a remarkable work from Porter et al. (doi: 10.1038/s41467-022-31532-9).

      In both cases, quantification of the protein titers obtained after attempts of protein purification of the ‘monomorphic’ RfaH orthologs would be required. In this way, we can ascertain whether the differences in activity are due to differences in expression levels and determine if sufficient amounts of stable and well-folded protein can be obtained for these RfaH orthologs, followed by measuring their circular dichroism spectra to ascertain their secondary structure propensity.

      We are currently attempting to recombinantly express these proteins to determine their titers and solubility in the supernatant, which will enable us to indirectly ascertain their expression levels, and to test the solubly expressed proteins biophysically using circular dichroism experiments. If the circular dichroism experiments prove unsuccessful due to problems with the solubility of the purified proteins, we strongly believe that the aforementioned discussion should be included in the manuscript to take into account the limitations of the methods utilized in our work.

      Therefore, we will add the following paragraph in the discussion, while we work on ascertaining the feasibility of the circular dichroism assays:

      “It is worth noting that, in the absence of RBS (Figure 5C-F), the putative monomorphic RfaH orthologs have similar or lower in vivo activity than the E. coli RfaH E48A mutant; a similar mutant (E48S) exhibits a 1:1 equilibrium between the autoinhibited and active states (Burmann et al, 2012). This observation can be partly explained by two factors. First, sequence divergence and expression levels may limit functional compatibility with the host machinery. The highly divergent V. cholerae RfaH ortholog, which shares only 43% sequence identity with E. coli RfaH but is predicted to fold into the autoinhibited state (Artsimovitch & Ramírez-Sarmiento, 2022), maintains both ops-dependent recruitment and hemolysin secretion in the ∆rfaH E. coli strain, yet exhibits transcription elongation defects in vitro, requiring a 5-fold higher concentration than E. coli RfaH to match increased elongation rates of E. coli RNAP (Carter et al, 2004). Low in vivo protein titers or structural mismatches between the monomorphic orthologs and E. coli RNAP may prevent higher luciferase expression relative to the E48A mutant. This limitation is supported by the fact that IPTG-induced overexpression rescues activity when an RBS is present (Figure 5B). Second, these proteins may be predominantly folded in the active state while still transiently populating the autoinhibited state. Confirming this conformational equilibrium would require overexpression and purification of these proteins followed by biophysical assays, such as circular dichroism (Porter et al, 2022).”

      Reviewer #1 (Significance (Required)):

      An intriguing, but speculative, aspect of the study is the finding that some sequences are predicted to adopt a CTD with mixed alpha/beta secondary structure, and that such structures also appear in targeted molecular dynamics simulations. If this idea holds up, it could represent an intermediate along the evolutionary pathway between the alpha-helical and beta-barrel folds of RfaH. Although the evidence is only computational, it is a compelling idea and it would benefit from further investigation.

      It is indeed very compelling, and this is something that we should immediately address in a revised version of our manuscript. We somehow missed an article published in 2025 that studied the structural interconversion of the isolated CTD using NMR and found at least three intermediate states along the fold-switching pathway of RfaH (doi: 10.1073/pnas.2506441122). One of the observed intermediate states, which is also one of the most highly populated (~23%), corresponds to an ensemble of largely unfolded structures that include the formation of transient alpha-helix α5 (corresponding to helix α2 in our article) and beta-hairpin (β1/β2) secondary structure elements, which fold to form a compact ensemble of structures in which the beta-hairpin lies on top of the alpha-helix. This is fully consistent with our predictions of a mixed alpha/beta state in the full-length protein.

      We will add this external experimental validation of the mixed alpha/beta secondary structure of the CTD of RfaH in the discussion of our final manuscript:

      “Interestingly, a recent nuclear magnetic resonance spectroscopy study of the E. coli RfaH CTD, aimed to uncover transient states potentially en route of the αCTD interconversion (Cai et al, 2025), described an intermediate state (populated in ~23% of the captured ensembles) in which a β-hairpin formed by β-strands β1-β2 lies on top of a transient α-helix α5 that corresponds to helix α2 in our article. This finding is fully consistent with the mixed α/β CTD structures found both in our TMD simulations and our AF2 predictions of divergent RfaH orthologs.”

      In summary, the work is a valuable contribution to the field of protein fold switching. The combination of computational tools with experimental validation makes it interesting and the results should be of broad interest. The manuscript should be well positioned for publication in a high-impact journal.

      We are very thankful for the reviewer’s comments on our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In their paper "Exploration of the structural and functional diversity in the metamorphic RfaH subfamily," Tabilo-Agurto et al. use AlphaFold2 to predict the structures of ~3,900 RfaH homologs, sort the predicted C-terminal domains into α-helical (autoinhibited), β-barrel (NusG-like), and mixed α/β topologies, and find that about 14% of homologs come out predominantly in the β-barrel state. They then take nine representative homologs and run them through a heterologous *E. coli* DH5α Δ*rfaH* reporter assay. The putative monomorphic candidates behave a lot like the constitutively active E48A variant - active across every ops context and even without an RBS - while the mixed α/β candidates barely show activity. Targeted MD simulations of *E. coli* RfaH, run through AF2Rank, also pull out the mixed α/β state as its own distinct cluster, hinting that it sits somewhere along the fold-switching transition path.

      This is a genuinely interesting piece of work that pulls together structure prediction, in vivo activity, and genomic context to make a concrete case for extant monomorphic βRfaH proteins - a long-hypothesized but until now unseen intermediate in the proposed stepwise evolution of RfaH from NusG. The experimental design is thoughtful, especially the five-construct ops/RBS matrix, and comparing the monomorphic candidates against the E48A benchmark is a nice touch as a positive control. Overall, I think the paper deserves to be published, but a few things would need shoring up before acceptance.

      Major comments:

      1. The paper would be a lot stronger with at least one biophysical measurement (a CD spectrum, say) on a purified monomorphic candidate. I get that this might be outside the planned scope, but even a single CD trace showing β-rich content for an isolated full-length protein would move the claim from "putative" to "demonstrated."

      We agree with the comment from the reviewer, and as such we are currently attempting to recombinantly express these proteins for determining first their solubility after purification (which will largely determine our ability to characterize them by circular dichroism) and then follow up with circular dichroism experiments if the solubility and protein concentration of these ‘monomorphic’ homologs is sufficient to pursue these experiments. In case this is unfeasible, we will include the solubility analysis in our revised version of the article, as well as a discussion on this topic – and also on the topic of why the activity of the ‘monomorphic’ proteins resembles the E48A mutant of E. coli RfaH that co-exists between two folds – as indicated in our response to the major comment from reviewer #1.

      2. Only nine homologs were tested - three per category. The conclusions about monomorphic behavior generalizing across the whole βRfaH clade are basically resting on three proteins. Bringing in even one or two phylogenetically distant βRfaH candidates would help guard against the possibility that what they're seeing is just a genus-specific quirk. If new experiments aren't on the table, the limitation should at least be called out explicitly in the Discussion.

      We agree with the reviewer that drawing conclusions from a single clade of RfaH could raise concerns about bias, although we must note that the tested putative monomorphic candidates were selected before a phylogenetic tree was constructed. What we propose is to perform a phylogenetic analysis for the InterPro sequences and look at their genomic neighborhood as well, replicating what was done in the manuscript for the Genomic Cluster group. We hope this will provide more compelling evidence that the predictions, phylogeny and gene organization of these extant monomorphic RfaH orthologs are distinct from those of the metamorphic ones.

      3. The classification thresholds (α > 32.5% / β 30.0% / α

      Thanks to the reviewer for raising this concern. We performed a sensitivity analysis by slightly nudging the cutoffs by ±5%, as recommended by the reviewer, and indeed saw minimal changes in the number of structures in each class. We have added a small paragraph describing this sensitivity test:

      “To determine that these values were adequate for our analysis, we performed a sensitivity test by changing the thresholds by ±5% over the data for all structures predicted from all databases, showing that the predictions of RfaH orthologs with monomorphic CTD and mixed secondary structure in their CTD are robust, and only metamorphic RfaH orthologs were reduced with an increase in uncategorized structures (Supplementary Figure S13).”
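
      A threshold sensitivity test of this kind can be sketched in a few lines. The sketch below is illustrative only: the classification rule, the reading of ±5% as absolute percentage points, the function names and the toy data are all assumptions, since the manuscript's exact rule is not reproduced in this response. Only the 32.5% and 30.0% starting cutoffs are taken from the reviewer's comment.

```python
from collections import Counter


def classify(alpha_pct, beta_pct, alpha_cut=32.5, beta_cut=30.0):
    """Assign a CTD class from secondary-structure content.

    Hypothetical rule for illustration, not the manuscript's exact one.
    """
    helical = alpha_pct > alpha_cut
    stranded = beta_pct > beta_cut
    if helical and stranded:
        return "mixed"
    if helical:
        return "metamorphic"
    if stranded:
        return "monomorphic"
    return "uncategorized"


def sensitivity(structures, shifts=(-5.0, 0.0, 5.0), alpha_cut=32.5, beta_cut=30.0):
    """Count classes after shifting both cutoffs by each value in `shifts`.

    Shifts are treated as absolute percentage points (an assumption).
    """
    return {
        shift: Counter(
            classify(a, b, alpha_cut + shift, beta_cut + shift)
            for a, b in structures
        )
        for shift in shifts
    }


# Made-up (alpha %, beta %) CTD contents, for illustration only.
toy = [(60.0, 5.0), (34.0, 10.0), (8.0, 45.0), (35.0, 33.0), (20.0, 20.0)]

for shift, counts in sensitivity(toy).items():
    print(shift, dict(counts))
```

      Comparing the per-class counts across shifts shows which classes are stable and which sit close to a cutoff.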

      1. The Discussion notes that uncontrolled, ops-independent RfaH recruitment could be lethal, since RfaH outcompetes the much more abundant NusG. But if monomorphic RfaH proteins really are extant and stably maintained in these genomes, there has to be something keeping them from interfering with NusG's essential functions - maybe very low expression, restricted induction, or compensating differences in NusG affinity. The paper would benefit from tackling this directly, even speculatively.

      We agree with the reviewer on this point, and after careful consideration we believe that we did not emphasize it appropriately in the manuscript. In fact, we included Figure 7 to state our perspective on how RfaH may have evolved, but we did not emphasize how this perspective stems from a previous work that we thoroughly discussed in the introduction (doi: 10.1038/emboj.2008.268) and that explicitly states that low solubility of the dissociated NTD and CTD could be a factor imposing this restricted action in cis operons. We have included this in our revised version of the manuscript as follows:

      “Our findings are in line with the previous hypothesis regarding the emergence of RfaH within the universally conserved family of NusG transcription factors (Belogurov et al, 2009). Under that model, a gene duplication event produced an intermediate variant (NusG2 in Figure 7) that lost its Rho-binding capability and acquired a deletion in the NTD that reduced the protein's overall size and remodeled its hydrophobic profile. Crucially, this intermediate retained an exposed, hydrophobic RNAP-binding region, a feature shared by monomorphic RfaH and the ancestor of all RfaH orthologs (NusGSP in Figure 7). This increased hydrophobicity would have reduced solubility, restricting its regulatory activity to the site of synthesis, i.e. in cis. Indeed, when structural alignment is used to identify conserved NTD residues that bind to RNAP, orthologs contain more than 70% hydrophobic residues (Supplementary Figure S11) at those positions. This percentage is much closer to that of RfaH (80%) than NusG (57.14%). The protein only regains solubility and the ability to operate in trans when its CTD refolds into a helical conformation. Ultimately, our results strengthen this evolutionary model by demonstrating that several extant RfaH orthologs appear to resemble this insoluble, cis-acting ancestral state.”
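
      The hydrophobic-residue percentages quoted above amount to a simple count over aligned positions. A minimal sketch follows, assuming a hypothetical hydrophobic residue set and made-up sequence and positions; the manuscript's actual residue definition and alignment columns are not given in this response.

```python
# Assumed hydrophobic set for illustration; the manuscript may use another definition.
HYDROPHOBIC = set("AVILMFWC")


def hydrophobic_percent(aligned_seq, positions):
    """Percentage of hydrophobic residues at the given alignment columns.

    Gap characters ('-') at a listed column are skipped.
    """
    residues = [aligned_seq[i] for i in positions if aligned_seq[i] != "-"]
    if not residues:
        return 0.0
    hits = sum(residue in HYDROPHOBIC for residue in residues)
    return 100.0 * hits / len(residues)


# Made-up aligned fragment and columns, not real RfaH or NusG data.
toy_seq = "AVKLD-I"
toy_positions = range(len(toy_seq))
print(round(hydrophobic_percent(toy_seq, toy_positions), 2))
```

      Applying the same function to each ortholog at the RNAP-contacting columns would give values directly comparable with the RfaH and NusG percentages.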

      Minor comments:

      Table 1 should show percentages alongside the raw counts. 7/7 LPS-in-operon for monomorphic candidates is striking, but with n=10, the small denominator really deserves to be flagged.

      We agree with the reviewer in that the higher raw count of metamorphics may undersell the message the article conveys. We added the percentages next to raw counts in Table 1, regarding “Total” and “Next to operon” categories. We also modified the legend as follows:

      “A summary of genomic contexts of RfaH orthologs classified according to the AF2 predictions. The numbers indicate how many rfaH genes are next to an operon and whether the operon contains lipopolysaccharide biosynthesis genes, and the percentages next to them display the relation to the previous category, i.e., “Next to operon”/“Total” and “LPS in operon”/“Next to operon”.”

      In Figure 3, the sequence logo on top is informative - consider adding the number of sequences per dataset to the axis labels so readers can interpret the boxplot widths.

      We believe that this would be rather confusing for the readers, because it is counting all 5 structures predicted by AlphaFold2 for each sequence in each dataset that fit each classification, and thus the same sequence can lead to structures that are monomorphic, metamorphic or have mixed secondary structure in their CTD. Thus, the number of sequences per box plot will be higher than the number of sequences per dataset. For example, one sequence from InterPro can be present in more than one box plot, because different AlphaFold2 models can lead to the prediction of different states from the same sequence. We believe it is less confusing if it is presented as it is.

      There's some redundancy between the Results (pp. 14-17) and the Discussion that could probably be trimmed, particularly the recap of the ops/RBS construct logic.

      Thanks for the recommendation. We reduced this redundancy in the new version of the manuscript, mainly on page 16:

      “The orthologs classified as monomorphic, and thus expected to be constitutively active, exhibited activity across all tested ops contexts, including in the absence of RBS (Figure 5B-F). Notably, their activity levels were comparable to the ops-independent E. coli RfaH E48A mutant, in which the key salt bridge at the NTD:CTD interface is disrupted. All monomorphic orthologs were found to lack a few key residues that make contacts to ops DNA in RfaH, as well as the conserved residues in loop 2 that mediate contacts with Rho in NusG (Supplementary Figure 11). This mosaic architecture enables these orthologs to promote the expression of the long lux operon even when the RBS is absent. Our study provides the first indirect evidence of putative, constitutively active RfaH proteins, which are predicted to have monomorphic NusG-like fold, in other bacteria.”

      Reviewer #2 (Significance (Required)):

      If the central claim holds up, this is a meaningful contribution to the metamorphic-protein and bacterial-transcription literatures: it identifies what appear to be extant evolutionary "way-stations" in the NusG→RfaH transition, and it does so using a tractable computational pipeline that could be applied to other suspected fold-switch families. The work is timely given the ongoing discussion about how AF2 and its descendants handle conformational heterogeneity. With the strengthening suggested above - particularly any direct biophysical confirmation of a monomorphic candidate - I would expect this to be a well-cited paper in its niche.

      We are very thankful for the reviewer’s comments on our manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The manuscript by Tabilo-Agurto et al. uses in silico and experimental methods to elucidate the diversity of the metamorphic RfaH protein family. Of particular note is the sophisticated usage of AlphaFold2 to reconstruct the evolutionary tree of RfaH as well as the in vivo luminescence assays to substantiate the different structural states of the RfaH-CTD. Overall this is a well-written manuscript providing deeper insight into the structural and functional diversity of RfaH proteins, potentially relevant for other metamorphic proteins as well.

      Minor comments:

      1. 3rd paragraph of the introduction: The sentence starting with "To date, ...and nuclear magnetic resonance of these ancestors.." seems incomplete, as this reviewer believes the authors wanted to say "..and structural characterization by nuclear magnetic resonance spectroscopy of these ancestors..."

      Thanks for the attention to these details, we will amend this paragraph appropriately.

      2. 4th paragraph of the introduction: "..., that binds Rho or the ribosome (Mooney et al. 2009b)." Whereas this citation is correct for NusG-Rho interactions, it does not indicate ribosome binding. The direct interaction of NusG with the ribosome was shown in Burmann et al. Science 2010 and this reference should be added here.

      Thanks for the recommendation, we will include both citations in this section of the manuscript.

      3. More a curiosity question: did the authors also test AlphaFold3 predictions for a subset of the RfaH variants, and obtain similar or different results?

      Thanks for the comment. The reason we did not use AlphaFold3 predictions to check on the variability of the results is that there is much more known about the use of AlphaFold2 – and its limitations – in the study of metamorphic proteins, whereas a deep understanding of the advantages and limitations of AlphaFold3 for studying metamorphic proteins is still under development.

      Referees cross-commenting:

      Overall there is an agreement among all reviewers that the present MS is an interesting and timely study. The point raised by reviewer 2 to add simple biophysical characterization, if feasible, would be clearly an excellent addition and likely make the MS stronger. In general all three reviewers mainly point to minor changes and additions to improve the MS in a rather short timeframe.

      We indeed agree with this comment, which is why we will commit to attempt the recombinant expression and protein purification of the RfaH orthologs and to perform circular dichroism assays if the solubility of the obtained proteins allows for such experiments to be done.

      Reviewer #3 (Significance (Required)):

      The present MS is an interesting large-scale usage of the AlphaFold2 algorithm to reconstruct the evolutionary tree of the specialized transcription elongation factor RfaH. It reveals different degrees of this evolution in a diverse set of bacterial strains, indicating its evolutionary distance from the cognate NusG transcription elongation factor. Of particular note is the experimental verification of the obtained in silico findings by in vivo luminescence approaches.

      We are very thankful for the reviewer’s comments on our manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Tabilo-Agurto et al. uses in silico and experimental methods to elucidate the diversity of the metamorphic RfaH protein family. Of particular note is the sophisticated usage of AlphaFold2 to reconstruct the evolutionary tree of RfaH as well as the in vivo luminescence assays to substantiate the different structural states of the RfaH-CTD. Overall this is a well-written manuscript providing deeper insight into the structural and functional diversity of RfaH proteins, potentially relevant for other metamorphic proteins as well.

      Minor Points:

      • 3rd paragraph of the introduction: The sentence starting with "To date, ...and nuclear magnetic resonance of these ancestors.." seems incomplete, as this reviewer believes the authors wanted to say "..and structural characterization by nuclear magnetic resonance spectroscopy of these ancestors..."
      • 4th paragraph of the introduction: "..., that binds Rho or the ribosome (Mooney et al. 2009b)." Whereas this citation is correct for NusG-Rho interactions, it does not indicate ribosome binding. The direct interaction of NusG with the ribosome was shown in Burmann et al. Science 2010 and this reference should be added here.
      • More a curiosity question: did the authors also test AlphaFold3 predictions for a subset of the RfaH variants, and obtain similar or different results?

      Referees cross commenting

      Overall there is agreement among all reviewers that the present MS is an interesting and timely study. The point raised by reviewer 2 to add simple biophysical characterization, if feasible, would clearly be an excellent addition and likely make the MS stronger. In general, all three reviewers mainly point to minor changes and additions to improve the MS in a rather short timeframe.

      Significance

      The present MS is an interesting large-scale usage of the AlphaFold2 algorithm to reconstruct the evolutionary tree of the specialized transcription elongation factor RfaH. Revealing a different degree of this evolution in a diverse set of bacterial strains indicating its evolutionary distance from the cognate NusG transcription elongation factor. Of particular note is the experimental verification of the obtained in silico finding by in vivo luminescence approaches.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In their paper "Exploration of the structural and functional diversity in the metamorphic RfaH subfamily," Tabilo-Agurto et al. use AlphaFold2 to predict the structures of ~3,900 RfaH homologs, sort the predicted C-terminal domains into α-helical (autoinhibited), β-barrel (NusG-like), and mixed α/β topologies, and find that about 14% of homologs come out predominantly in the β-barrel state. They then take nine representative homologs and run them through a heterologous E. coli DH5α ΔrfaH reporter assay. The putative monomorphic candidates behave a lot like the constitutively active E48A variant - active across every ops context and even without an RBS - while the mixed α/β candidates barely show activity. Targeted MD simulations of E. coli RfaH, run through AF2Rank, also pull out the mixed α/β state as its own distinct cluster, hinting that it sits somewhere along the fold-switching transition path.

      This is a genuinely interesting piece of work that pulls together structure prediction, in vivo activity, and genomic context to make a concrete case for extant monomorphic βRfaH proteins - a long-hypothesized but until now unseen intermediate in the proposed stepwise evolution of RfaH from NusG. The experimental design is thoughtful, especially the five-construct ops/RBS matrix, and comparing the monomorphic candidates against the E48A benchmark is a nice touch as a positive control. Overall, I think the paper deserves to be published, but a few things would need shoring up before acceptance.

      Major comments

      1. The paper would be a lot stronger with at least one biophysical measurement (a CD spectrum, say) on a purified monomorphic candidate. I get that this might be outside the planned scope, but even a single CD trace showing β-rich content for an isolated full-length protein would move the claim from "putative" to "demonstrated."
      2. Only nine homologs were tested - three per category. The conclusions about monomorphic behavior generalizing across the whole βRfaH clade are basically resting on three proteins. Bringing in even one or two phylogenetically distant βRfaH candidates would help guard against the possibility that what they're seeing is just a genus-specific quirk. If new experiments aren't on the table, the limitation should at least be called out explicitly in the Discussion.
      3. The classification thresholds (α > 32.5% / β < 2.5% for αRfaH; β > 30.0% / α < 2.5% for βRfaH) are described as coming from histogram inspection, but they feel a bit arbitrary as stated. A quick sensitivity analysis - how do the population fractions shift if you nudge the cutoffs ±5%? - would help reassure the reader.
      4. The Discussion notes that uncontrolled, ops-independent RfaH recruitment could be lethal, since RfaH outcompetes the much more abundant NusG. But if monomorphic RfaH proteins really are extant and stably maintained in these genomes, there has to be something keeping them from interfering with NusG's essential functions - maybe very low expression, restricted induction, or compensating differences in NusG affinity. The paper would benefit from tackling this directly, even speculatively.
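
      The sensitivity analysis suggested in major comment 3 could be sketched roughly as follows. This is only an illustration: the function names, the "other" label for everything outside the two stated classes, the reading of "±5%" as a relative scaling of every cutoff, and the toy (α%, β%) values are all assumptions made here, not the authors' pipeline or data.

```python
# Hypothetical sketch: how class fractions shift when the stated cutoffs are scaled.
# Cutoffs come from the review text; everything else is an illustrative assumption.

def classify_ctd(alpha_pct, beta_pct, scale=1.0):
    """Assign a class from CTD secondary-structure percentages.

    scale multiplies every cutoff (0.95 and 1.05 give a relative -5% / +5% nudge).
    """
    alpha_hi = 32.5 * scale   # alpha-helix content required for alphaRfaH
    beta_hi = 30.0 * scale    # beta-strand content required for betaRfaH
    low = 2.5 * scale         # maximum content allowed for the other element
    if alpha_pct > alpha_hi and beta_pct < low:
        return "alphaRfaH"
    if beta_pct > beta_hi and alpha_pct < low:
        return "betaRfaH"
    return "other"  # assumption: the mixed-class cutoffs are not given in the review


def class_fractions(records, scale=1.0):
    """Fraction of records in each class for a given cutoff scaling."""
    counts = {"alphaRfaH": 0, "betaRfaH": 0, "other": 0}
    for alpha_pct, beta_pct in records:
        counts[classify_ctd(alpha_pct, beta_pct, scale)] += 1
    total = len(records)
    return {name: count / total for name, count in counts.items()}


# Toy (alpha %, beta %) pairs, invented purely for demonstration.
toy = [(40.0, 0.0), (33.0, 2.0), (0.0, 35.0), (1.0, 31.0), (15.0, 12.0)]

for scale in (0.95, 1.0, 1.05):
    print(scale, class_fractions(toy, scale))
```

      Borderline predictions (here the second and fourth toy entries) are the ones that change class as the cutoffs move, which is the behaviour such an analysis is meant to expose.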

      Minor comments

      Table 1 should show percentages alongside the raw counts. 7/7 LPS-in-operon for monomorphic candidates is striking, but with n=10, the small denominator really deserves to be flagged.

      In Figure 3, the sequence logo on top is informative - consider adding the number of sequences per dataset to the axis labels so readers can interpret the boxplot widths.

      There's some redundancy between the Results (pp. 14-17) and the Discussion that could probably be trimmed, particularly the recap of the ops/RBS construct logic.

      Significance

      If the central claim holds up, this is a meaningful contribution to the metamorphic-protein and bacterial-transcription literatures: it identifies what appear to be extant evolutionary "way-stations" in the NusG→RfaH transition, and it does so using a tractable computational pipeline that could be applied to other suspected fold-switch families. The work is timely given the ongoing discussion about how AF2 and its descendants handle conformational heterogeneity. With the strengthening suggested above - particularly any direct biophysical confirmation of a monomorphic candidate - I would expect this to be a well-cited paper in its niche.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This is an interesting and ambitious study by Tabilo-Agurto and co-workers. It combines deep learning structure prediction (AlphaFold2), targeted molecular dynamics simulations, and in vivo functional assays to probe structural, functional, and evolutionary aspects of the metamorphic protein RfaH. More broadly, the work addresses an important question: whether intermediate structural states may exist along evolutionary trajectories of metamorphic proteins. A particular strength of the study is the integration of computational and experimental approaches. The manuscript is generally well written and clearly organized.

      Major comment:

      A key aspect of the study is the classification of predicted structures into three classes based on the conformation of the C-terminal domain (CTD): the autoinhibited alpha-helical fold, a beta-barrel fold, and a mixed alpha/beta fold. These classes are further described as corresponding to metamorphic (alpha fold), mixed alpha/beta, and monomorphic (beta fold) proteins.

      While I can see how this organizational scheme is helpful in some respects, it may also overstate what can be concluded from the data. As the authors are well aware, AlphaFold2 tends to predict a single conformation even for genuine metamorphic proteins, and therefore does not, on its own, distinguish between monomorphic and fold-switching proteins. I note in particular that the functional data indicates that the "monomorphic" variants studied in the in vivo assays behave similarly to the RfaH E48A mutant. However, E48A is known to remain metamorphic, populating both alpha and beta folds with roughly equal probability. This suggests that the sequences in this class may retain some degree of fold-switching capability, even if the underlying regulatory mechanism differs from that of wild-type RfaH. In other words, the presented data does not fully support these sequences as monomorphic. I am not suggesting that the authors must revise their classification scheme. However, it may strengthen the manuscript if the authors explicitly acknowledge this alternative interpretation and moderate the corresponding claims.

      Significance

      An intriguing, but speculative, aspect of the study is the finding that some sequences are predicted to adopt a CTD with mixed alpha/beta secondary structure, and that such structures also appear in targeted molecular dynamics simulations. If this idea holds up, it could represent an intermediate along the evolutionary pathway between the alpha-helical and beta-barrel folds of RfaH. Although the evidence is only computational, it is a compelling idea and it would benefit from further investigation.

      In summary, the work is a valuable contribution to the field of protein fold switching. The combination of computational tools with experimental validation makes it interesting and the results should be of broad interest. The manuscript should be well positioned for publication in a high-impact journal.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2026-03474

      Corresponding author(s): Priyanka, Verma


      1. General Statements [optional]

      Point-by-point rebuttal is presented below. Reviewer’s comments are in BLACK; author’s response is in BLUE and figure numbers corresponding to the manuscript are in RED.

      2. Point-by-point description of the revisions


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      ALC1 suppression has been shown to potentiate PARP inhibitor lethality in HR-deficient cells. Rather than revisiting the underlying mechanism, which has been characterized and remains an active area of investigation, this study aims to define the clinical contexts in which combined ALC1 and PARP inhibition may be beneficial. The clinical efficacy of PARP inhibitors, and their FDA approval, is largely restricted to HR-deficient tumors. This study dissects the combined effects of ALC1 and PARP suppression across a panel of HRD ovarian cancer cell lines, multiple classes of PARP inhibitor, and cells harboring distinct PARPi resistance mechanisms. In doing so, the authors delineate both the potential utility and the limitations of combined ALC1 and PARP inhibitor treatment in HRD ovarian cancers. The most impactful finding of the study, however, is likely the demonstration that ALC1 suppression sensitizes HR-proficient, CCNE1-amplified high-grade serous ovarian cancers to PARP inhibitors. These tumors are associated with particularly poor outcomes owing to the current absence of effective targeted therapies, making this observation of considerable clinical relevance.

      We thank the reviewer for appreciating the significance of our work in “HR-proficient, CCNE1-amplified high-grade serous ovarian cancers to PARP inhibitors” which is a critical unmet need.

      Of note, the study relies on genetic rather than pharmacological depletion of ALC1, a choice likely reflecting the current lack of a commercially available ALC1 inhibitor. While genetic suppression may not fully recapitulate the effects of combined drug treatment, it offers the advantage of not being tied to any specific compound, allowing the authors to establish more general principles. I have only a few comments.

      We are grateful to the reviewer for providing the unique perspective on our genetic study that “it offers the advantage of not being tied to any specific compound, allowing the authors to establish more general principles.”

      We have included this in our discussion to strengthen the study.

      The effect of ALC1 KO on PARPi sensitivity is less pronounced in OVSAHO cells (BRCA2-mutated) than in BRCA1-mutated cells. In these cells, it looks like there is an additive effect rather than synergy. 1- The authors should calculate, if possible, whether there is synergy or additive effect of ALC1-KO lethality (BLISS).

      We thank the reviewer for recognizing the limitations we face in performing a BLISS score analysis, as our experiments were conducted at a single level of total protein depletion. Ideally, synergy assessments require a range of depletion levels to generate a full response matrix. Regardless, to address the reviewer’s concern regarding the impact of ALC1 on olaparib response in BRCA1- and BRCA2-mutant cells, we performed a BLISS score calculation under the conservative assumption that total ALC1 depletion alone has no effect on cell viability. We then employed the following formula for BLISS score calculation:

      $$\text{Bliss score} = E_{\text{obs}} - (E_A + E_B - E_A \times E_B)$$

      where $E_{\text{obs}}$ is the viability of ALC1-depleted cells at a given drug concentration, i.e., the observed impact of combined ALC1 loss and olaparib treatment;

      $E_A$ is the impact on viability of ALC1 depletion alone, which was taken to be zero; and

      $E_B$ is the impact on viability of the drug in ALC1 WT cells, i.e., the impact of the drug alone.

      BLISS scores were calculated at all non-saturating drug concentrations and then averaged to obtain a final BLISS value. We used the following cutoffs:

      > 10: Synergistic (the interaction is considered significant);

      -10 to 10: Additive (no significant interaction);

      Cell line    Olaparib   Rucaparib   Niraparib   Veliparib   Cisplatin
      UWB1.289     22.34      25.21       13.24       14.95334    0.26
      JHOS-4       37.27      47.14       26.3        27.94       -0.37
      OVSAHO       19.34      27.6        23.2        19.15       7.04
      Kuramochi    11.38      11.98       -3.56       6.79        -0.39
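
      The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, effects are taken as fractional loss of viability (1 minus viability) and reported in percentage points, the example viabilities are invented, and the label for scores below -10 is a placeholder because the reply does not define that range.

```python
# Hypothetical sketch of the averaged Bliss score described in the reply.
from statistics import mean


def bliss_score(e_obs, e_a, e_b):
    """Bliss excess in percentage points; inputs are fractional effects in [0, 1]."""
    expected = e_a + e_b - e_a * e_b  # Bliss independence expectation
    return 100.0 * (e_obs - expected)


def mean_bliss(viability_combo, viability_drug_only, e_a=0.0):
    """Average the Bliss score over the supplied (non-saturating) concentrations.

    viability_combo: fractional viability of ALC1-depleted cells plus drug.
    viability_drug_only: fractional viability of ALC1 WT cells plus drug.
    e_a: effect of ALC1 depletion alone, assumed zero as in the reply.
    """
    scores = [
        bliss_score(1.0 - combo, e_a, 1.0 - drug)
        for combo, drug in zip(viability_combo, viability_drug_only)
    ]
    return mean(scores)


def classify(score):
    """Apply the cutoffs stated in the reply."""
    if score > 10:
        return "synergistic"
    if score >= -10:
        return "additive"
    return "below additive range"  # assumption: not defined in the reply


# Invented viabilities at three drug concentrations, for demonstration only.
score = mean_bliss([0.7, 0.5, 0.3], [0.9, 0.8, 0.6])
print(round(score, 2), classify(score))
```

      With the effect of ALC1 depletion alone set to zero, the expected combined effect reduces to the drug-only effect, so each score is simply the extra loss of viability seen in ALC1-depleted cells.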

      We observe that ALC1 loss synergistically enhances olaparib and rucaparib response in both BRCA1- and BRCA2-mutant cells. However, as correctly noted by the reviewer, the BLISS score is higher in BRCA1-mutant cells than in BRCA2-mutant OVSAHO cells.

      In the revised manuscript, we have also included data for another BRCA2-mutant cell line, Kuramochi (Fig. 1d; Supp. Fig. 1b). We chose this cell line because, despite having a BRCA2 mutation, it is highly resistant to PARP inhibitors and cisplatin, owing to KRAS amplification. Notably, we observe that ALC1 loss can synergistically enhance the response of Kuramochi to olaparib and rucaparib.

      We have included a statement in the manuscript that the impact of ALC1 loss was more profound in BRCA1- versus BRCA2-settings. However, if acceptable to the reviewer, we would prefer not to include the BLISS values in the manuscript, as these calculations were not performed using the standard approach of titrating multiple levels of protein depletion.

      2- Another BRCA2-mutated cell line should be included.

      As discussed above, we have now included data from another BRCA2-mutant cell line, Kuramochi. Consistent with data in other BRCA-mutant cell lines, loss of ALC1 enhances olaparib and rucaparib sensitivity in these cells (Fig. 1d; Supp. Fig. 1b).

      Minor comments: • Figure key is missing for S2C (I assume it's grey DMSO, blue olaparib)

      We apologize for this oversight. Figure key has now been included.

      • Page 8: "BRCA1-mutant ovarian cancer cells eventually develop chemoresistance when exposed to PARPi for a prolonged period. Mechanistically, this is due to rewiring of ATR signaling, which enables RAD51 loading at DNA breaks and reversed forks independent of BRCA1 protein(25)." This sentence suggests this is the only existing resistance mechanism, which should be corrected. Modify to "mechanistically, this CAN be due to", or "this is OFTEN due to".

      We thank the reviewer for suggesting this important correction. This has now been fixed.

      Reviewer #1 (Significance (Required)):

      ALC1 inhibitors have been developed and clinical trials are starting. The significance of this manuscript lies in establishing the clinical potential for combined ALC1-PARP inhibition in high grade serous ovarian cancer. Especially, the authors demonstrate that combined ALC1 suppression with PARP inhibition efficiently kills HR-proficient CCNE1-amplified ovarian cancers, which represent 20% of ovarian cancers and are resistant to current therapies.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Lindsey et al. explores the role of ALC1 (Amplified in Liver Cancer 1) loss in enhancing the sensitivity of PARPi in ovarian carcinomas, including BRCA1/2 mutated tumors (both sensitive and resistant to platinum) as well as cyclin E amplified settings. The data are interesting, but in some cases there is an overinterpretation of the results. I have listed below my major concerns.

      We appreciate that the reviewer finds our data interesting. We also appreciate the reviewer's insightful comments and have addressed them below.

      Figure 1. Could the authors demonstrate that OVSAHO cells are BRCA2 mutated? Indeed, I have always thought they were BRCA wild type (10.1016/j.ygyno.2015.08.017).

      OVSAHO cells have a homozygous deletion in the BRCA2 gene (PMID:23839242), which could be the reason why a mutation was not detected in the study referred to by the reviewer (PMID: 26321251). We have now included the Domcke et al., 2013 reference in the manuscript. The loss of BRCA2 expression in OVSAHO is also evident in our blots (Fig. 1a), as well as in data from protein atlas analysis.

      While the cisplatin data suggest that ALC1 loss indeed does not impact its sensitivity, I disagree with the statement that "the correlation between dispensability of ALC1 in platinum response suggests that this chromatin remodeler likely does not contribute to MMEJ" (page 6) or "is dispensable for HR" (page 7). Indeed, it has to be stressed that cisplatin-induced DNA damage (interstrand crosslinks) is also a substrate for nucleotide excision repair, which has a key role in repairing these lesions.

      We agree with the reviewer that transcription-coupled NER is the key pathway for the resolution of cisplatin-induced damage. We therefore have revised this statement in the manuscript as “Our data showing the dispensability of ALC1 in cisplatin response, both in BRCA1 and 2-mutant settings, is consistent with previous reports demonstrating the dispensability of this remodeler for MMEJ or transcription-coupled nucleotide excision repair.” We have cited previous work where ALC1 has been shown to be dispensable for MMEJ or TC-NER. Similarly, we have modified the text on page 7 as “Furthermore, ALC1 loss did not impact sensitivity to cisplatin in HRP cyclin E1-high cells. This observation is consistent with previous studies showing its dispensability for HR repair.”

      Figure 2. Please explain better why niraparib is not active in cyclinE1-high cells.

      Our comprehensive studies examining the impact of ALC1 depletion on PARPi response uncover the generalized theme that ALC1 targeting is most effective in enhancing sensitivity to olaparib and rucaparib, which have moderate PARP1/2 trapping ability, as compared to niraparib and talazoparib, which are strong trappers. One possible explanation could be that moderate PARP1/2 trappers are more amenable to combination strategies because their effects do not reach full saturation, preserving a dynamic range that allows for additive or synergistic enhancement. This was included in the discussion section of the manuscript.

      It is not clear to me if the authors consider a cyclin E "gain" an overexpressing tumor (i.e. OVCAR8). The authors need to show the response to PARPi in one (possibly two) cell lines with very low expression of cyclin E and knock-down of ALC1.

      We present data in multiple BRCA1-WT cell lines with very low expression of cyclin E compared to OVCAR8. These include the FT282 cell line (Fig. 4), two FT282 clones of BRCA1-/+ FT cells (Fig. 5), and full-length BRCA1 add-back UWB1.289 (Fig. 3c). Additionally, we have added immunoblotting data showing that in OVCAR8, the level of cyclin E1 protein and activity as assessed by pCdk2 is comparable to OVCAR3 and OVCAR4, two CCNE1-amplified lines (Fig. S2d). In contrast, FT282 and UWB1.289 BRCA1 add-back cells have low levels of cyclin E and thus low pCdk2.

      Does the deletion of ALC1 interfere with tumor take and tumor growth? The in vivo experiments are not clear.

      Tumor take: We injected OVCAR8 cells in mice three days post-transduction of sgALC1. Depletion of ALC1 is only achieved at 14 days post-transduction. This explains why tumor take is not impacted. We do not observe a significant impact of ALC1 loss on tumors derived from OVCAR8 cells. This is consistent with the dispensability of ALC1 in the proliferation of HR-proficient cells (PMID: 33333017; PMID: 33462394). We have added text in the manuscript to clarify this point.

      Injecting OVCAR8 cells in the peritoneum is not associated with the formation of ascites?

      We thank the reviewer for bringing up this important point. The objective of this study is to examine how ALC1 loss can enhance PARPi responses and therefore we chose an earlier time point (~50 days) to assess the impact on tumor growth. Ascites formation upon intraperitoneal injection of OVCAR8 cells has primarily been reported at late stages of disease development. For example, Anirban Mitra et al. (2015) (PMID: 26050922) reported consistent ascites formation, but only at extended timepoints (up to ~90 days post-injection). Similarly, Yong-Tae Shen et al. (2019) (PMID: 31117198) injected 5-10 × 10⁶ cells and observed ascites emergence beginning around day 49, with progressive accumulation toward the endpoint, indicating that fluid buildup coincides with advanced peritoneal dissemination. In contrast, studies using comparable inoculation doses (e.g., 1×10⁶ cells) and shorter observation periods (~6 weeks) such as Luis Hernandez et al. (2016) (PMID: 27235858) did not report detectable ascites. Taken together, these findings suggest that, while OVCAR8 cells can generate ascites, this phenotype typically manifests at later stages of disease progression and is not expected within shorter experimental windows. Therefore, the absence of ascites in our model is consistent with the study design and timeframe, rather than indicative of a failure of tumor establishment.

      We have added relevant discussion in the results section to clarify this point.

      How was tumor weight calculated?

      Tumor burden was quantified by direct collection and measurement of peritoneal tumor nodules. For the sacrificed mice, all visible tumor nodules within the peritoneal cavity were carefully excised, counted, and pooled per animal. The total tumor weight was then determined by weighing the combined mass of all collected nodules using an analytical balance. Thus, “tumor weight” represents the cumulative mass of macroscopic peritoneal implants per mouse. No estimations or indirect calculations were used. This has now been elaborated on in the methods section.

      It seems that tumors grow as solid mass, but how were nodules …

      All mice at endpoint exhibited disseminated peritoneal disease, characterized by multiple tumor nodules and invasion into the peritoneal wall. Tumor nodules were quantified by direct visual inspection during necropsy. Small nodules ( …

      Why were survival curves not shown?

      Survival analysis was not included because the study was designed with a predefined experimental endpoint to enable controlled comparison of tumor burden across groups. Animals were therefore euthanized at the same timepoint rather than followed longitudinally to survival. As a result, Kaplan–Meier analysis was not applicable to this experimental design. We agree that survival is an important outcome and would be valuable in future studies specifically powered and designed for that purpose.

      The dose of 50 mg/kg every third day is a very low olaparib dose. Generally, the in vivo dosing is 100 mg/kg, 5 days a week for 4 weeks (doi: 10.1158/1535-7163.MCT-21-0420; 10.1158/2767-9764.CRC-22-0423).

      We agree that higher doses of olaparib (e.g., 100 mg/kg, 5 days/week) are commonly used and have demonstrated single-agent efficacy in vivo. In this study, however, our objective was to specifically evaluate the combinatorial effect of olaparib with genetic knock-out of ALC1. To enable this, we intentionally employed a reduced dosing regimen (50 mg/kg every third day) to minimize single-agent activity. This approach allowed us to establish a condition in which olaparib in sgAAVS1 control tumors had limited impact on tumor burden, thereby providing a dynamic range in which to detect potential sensitization effects mediated by sgALC1. Using a fully efficacious dose would likely mask such interactions by producing a near-maximal response in the control group. Thus, the selected dosing strategy reflects a deliberate experimental design to assess potentiation effects rather than to model maximal therapeutic efficacy of olaparib as a monotherapy.

      Figure 4. I could not find the data on the minimal impact of ALC1 in UWB1.289 cells. What do the authors refer to? Do they refer to the fact that ALC1 deletion did not cause any cell growth alteration, or to something else? And where are the data?

      The minimal impact being referred to was PARPi responses in BRCA1-proficient UWB1.289. We have now fixed the statement to read: “The minimal impact of ALC1 in BRCA1-proficient UWB1.289 cells on PARPi responses suggested that targeting this remodeler may have minimal impact on normal healthy cells.” and included the relevant figure number (Fig. 3c) for clarity.

      The modest increment in pRPA in hTERT-FT282 is statistically significant and not very different from what is observed in UWB1.289, suggesting that ALC1 deletion could indeed impact normal cells. These data should be interpreted more conservatively.

      The increase in pRPA levels upon ALC1 loss in hTERT FT282 BRCA1 het cells and UWB1.289 cells is 1.2- and 1.4-fold, respectively. This is consistent with the literature that BRCA1-/+ het cells have a compromised replication stress response. Unresolved replication stress gets processed into double-strand breaks (DSBs). Consistent with the proficiency of hTERT FT282 BRCA1-/+ het cells in DSB repair, ALC1 deficiency does not increase γH2AX in these cells. Hence, despite an increase in pRPAS33 signal in hTERT FT282 BRCA1 het cells, these cells can resolve downstream breaks. In contrast, a profound, 1.7-fold increase in γH2AX signal was observed upon ALC1 loss in BRCA-mutant UWB1.289 cells, reinforcing that ALC1 loss has a more profound effect in BRCA-mutant cancer cells.

      To align with the reviewer’s suggestion, we have removed the word “modest” and have retained the fold differences in the median values.

      Figure 6. OS is a questionable endpoint in this heterogeneous patient population (treated in frontline and recurrent settings); in my opinion OS, much more than PFS, is influenced by the many different treatments these patients underwent. Why not consider PFS after or on PARPi treatment? The authors should clarify the patient population. Indeed, 48 patients were treated with PARPi and were platinum sensitive and possibly HRD. Which patients are the HRP patients? How many were there? It is not clear whether the HRP and high replication stress cohort was treated with PARPi. How many of these were cyclin E amplified or had high cyclin E levels? Figure 6F should also include, besides UWB+BRCA1, other tumor cells with no cyclin E overexpression and no BRCA mutation or HRD. The limitations should be discussed to strengthen the manuscript.

      We thank the reviewer and agree that PFS is often preferred for evaluating treatment-specific effects. However, in this cohort, PFS was not a reliable endpoint for several reasons. Tumor samples were obtained at diagnosis, whereas PARPi was administered later, in either the frontline maintenance or recurrent setting, introducing temporal and prognostic heterogeneity that limits the interpretability of PFS. These factors confound attribution of PFS specifically to PARPi response. We therefore selected OS from the time of PARPi exposure as a more consistently defined endpoint across this heterogeneous cohort, while acknowledging its limitations.

      Reviewer #2 (Significance (Required)):

      The manuscript by Lindsey et al. explores the role of ALC1 (Amplified in Liver Cancer 1) loss in enhancing the sensitivity of PARPi in ovarian carcinomas, including BRCA1/2 mutated tumors (both sensitive and resistant to platinum) as well as cyclin E amplified settings. The data are interesting, but in some cases there is an overinterpretation of the results.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Aubuchon, Wong et al. presents strong insights into the value of ALC1 as a novel target for sensitization strategies against PARPi. The authors show that PARPi resistance is reversible when ALC1 is knocked down and convincingly highlight the genetic circumstances for these approaches. Also, the authors point out that especially the weak PARP-trappers olaparib and rucaparib could benefit from concomitant ALC1 inhibition, and that high levels of replication stress, indicated by elevated p-T21 RPA2, could serve as a biomarker in clinical settings. Furthermore, the authors show that benign fallopian tube cells are not affected by ALC1-kd, which is an important finding for in vivo approaches.

      We thank the reviewer for acknowledging that our work provides “strong insights” and makes “important finding for in vivo approaches”.

      As the manuscript covers a broad experimental field, I would only suggest a few additional experiments to further strengthen the overall story:

      1. How does an ALC1 knock-down affect the expression of PARP1 and if so, how does this contribute to the effects seen by ALC1-kd? The authors could add Western Blot experiments for cell lines belonging to the respective groups that are distinguished in the manuscript: BRCA wt, BRCA mutated and Cyclin E1-high cancer cells and also a benign fallopian tube cell line.

      This was an interesting point brought up by the reviewer. To address this, we examined and compared total PARP1 protein levels in BRCA1 add-back UWB1.289, BRCA1-mutant UWB1.289, cyclin E1-high OVCAR8, and FT282, between ALC1 WT and depleted cells. However, we do not observe any consistent alteration in PARP1 level upon ALC1 depletion (Supp. Fig. 6a, b).

      In some of the Western Blot data, it also looks like BRCA1 expression is affected by ALC1 kd. The authors could provide some quantified protein expression or qPCR data if there is a correlation between both expressions.

      To address the reviewer’s question, we quantified changes in BRCA1 levels upon ALC1 loss across all cell lines used in this study. As expected, BRCA1 levels were higher in UWB del 11q and Cyclin E1-overexpressing cell lines. In contrast, cell lines harboring heterozygous BRCA1 mutations or BRCA1 promoter methylation were among those with the lowest BRCA1 expression. This trend provides us confidence in reliably quantifying our immunoblotting data. Although minor fluctuations in BRCA1 protein levels were observed following ALC1 depletion, no consistent trend towards either an increase or decrease was evident (Supp. Fig. 6c). Likewise, when cell lines were grouped according to their sensitivity to PARP inhibition upon ALC1 loss, no clear pattern emerged (Supp. Fig. 6d). Together, these data suggest that ALC1 depletion does not substantially affect BRCA1 protein levels, consistent with our previous RNA-seq and functional studies indicating that this chromatin remodeler is dispensable for transcriptional regulation or homologous recombination (PMID: 33462394).

      To further strengthen the hypothesis that the effects of strong PARP-trappers are not improved by ALC1 kd, the authors should add data regarding the viability of the cells presented in Figure 3b upon treatment with niraparib and talazoparib in sgALC1 cells (versus vector control). Also, the authors should add cell viability data using talazoparib for the sgALC1 OVCAR cell lines (versus vector control) in Figure 2 and Supplement Figure 3.

      Sensitivity to niraparib and talazoparib upon ALC1 depletion has now been added in Figure 3b, and for OVCAR lines in Supplement Figure 3. As correctly pointed out by the reviewer, we consistently observe that the impact of ALC1 loss is more profound on olaparib and rucaparib than on niraparib and talazoparib.

      Some minor points I noticed while reading the manuscript:

      We apologize for the oversight and thank the reviewer for pointing this out.

      • in Figure 3b, both graphs have the same title. I think the right one should be "SYr14" instead of "SYr12" again

      Fixed.

      • In the heading of Figure 2 an "in" is missing

      Fixed.

      • There are some citations, that seem to be made with another citation style (superscript numbers) than numbers in brackets across the manuscript.

      Fixed.

      Reviewer #3 (Significance (Required)):

      The most important aspect resulting from this manuscript is that ALC1 inhibitors could improve the response to some PARPi without damaging healthy cells. Thereby, the authors also mention the limitation of the use of ALC1 as a target and offer a potential biomarker for combinatory approaches. This study offers a very detailed insight into the potential role of ALC1 as a target for sensitization approaches under the different genetic conditions that can occur in HGSOC. These novel insights contribute to further broaden the therapeutic options by PARPi in clinical settings if the results can be confirmed by in vivo trials.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Aubuchon, Wong et al. presents strong insights into the value of ALC1 as a novel target for sensitization strategies against PARPi. The authors show that PARPi resistance is reversible when ALC1 is knocked down and convincingly highlight the genetic circumstances for these approaches. Also, the authors point out that especially the weak PARP-trappers olaparib and rucaparib could benefit from concomitant ALC1 inhibition, and that high levels of replication stress, indicated by elevated p-T21 RPA2, could serve as a biomarker in clinical settings. Furthermore, the authors show that benign fallopian tube cells are not affected by ALC1-kd, which is an important finding for in vivo approaches.

      As the manuscript covers a broad experimental field, I would only suggest a few additional experiments to further strengthen the overall story:

      1. How does an ALC1 knock-down affect the expression of PARP1 and if so, how does this contribute to the effects seen by ALC1-kd? The authors could add Western Blot experiments for cell lines belonging to the respective groups that are distinguished in the manuscript: BRCA wt, BRCA mutated and Cyclin E1-high cancer cells and also a benign fallopian tube cell line.
      2. In some of the Western Blot data, it also looks like BRCA1 expression is affected by ALC1 kd. The authors could provide some quantified protein expression or qPCR data if there is a correlation between both expressions.
      3. To further strengthen the hypothesis that the effects of strong PARP-trappers are not improved by ALC1 kd, the authors should add data regarding the viability of the cells presented in Figure 3b upon treatment with niraparib and talazoparib in sgALC1 cells (versus vector control). Also, the authors should add cell viability data using talazoparib for the sgALC1 OVCAR cell lines (versus vector control) in Figure 2 and Supplement Figure 3.

      Some minor points I noticed while reading the manuscript:

      • in Figure 3b, both graphs have the same title. I think the right one should be "SYr14" instead of "SYr12" again
      • In the heading of Figure 2 an "in" is missing
      • There are some citations, that seem to be made with another citation style (superscript numbers) than numbers in brackets across the manuscript.

      Significance

      The most important aspect resulting from this manuscript is that ALC1 inhibitors could improve the response to some PARPi without damaging healthy cells. Thereby, the authors also mention the limitation of the use of ALC1 as a target and offer a potential biomarker for combinatory approaches. This study offers a very detailed insight into the potential role of ALC1 as a target for sensitization approaches under the different genetic conditions that can occur in HGSOC.

      These novel insights contribute to further broaden the therapeutic options by PARPi in clinical settings if the results can be confirmed by in vivo trials.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript by Lindsey et al. explores the role of ALC1 (Amplified in Liver Cancer 1) loss in enhancing the sensitivity of PARPi in ovarian carcinomas, including BRCA1/2 mutated tumors (both sensitive and resistant to platinum) as well as cyclin E amplified settings.

      The data are interesting, but in some cases there is an overinterpretation of the results. I have listed my major concerns below.

      Figure 1. Could the authors demonstrate that OVSAHO cells are BRCA2-mutated? I have always thought they were BRCA wild-type (10.1016/j.ygyno.2015.08.017). While the cisplatin data suggest that ALC1 loss indeed does not impact cisplatin sensitivity, I disagree with the statements that "the correlation between dispensability of ALC1 in platinum response suggests that this chromatin remodeler likely does not contribute to MMEJ" (page 6) or "is dispensable for HR" (page 7). It has to be stressed that cisplatin-induced DNA damage (interstrand crosslinks) is also a substrate for nucleotide excision repair, which has a key role in repairing these lesions.

      Figure 2. Please explain better why niraparib is not active in cyclin E1-high cells. It is not clear to me whether the authors consider a tumor with a cyclin E "gain" to be an overexpressing tumor (i.e. OVCAR8). The authors need to show the response to PARPi in one (possibly two) cell lines with very low expression of cyclin E and knock-down of ALC1.

      Does the deletion of ALC1 interfere with tumor take and tumor growth? The in vivo experiments are not clear. Is injecting OVCAR8 cells into the peritoneum not associated with the formation of ascites? How was tumor weight calculated? It seems that tumors grow as solid masses, but how were nodules <1 mm quantified? Please clarify. Why were survival curves not shown? The dose of 50 mg/kg every third day is a very low olaparib dose. Generally the in vivo dosing is 100 mg/kg, 5 days a week for 4 weeks (doi: 10.1158/1535-7163.MCT-21-0420; 10.1158/2767-9764.CRC-22-0423).

      Figure 4. I could not find the data showing the minimal impact of ALC1 in UWB1.289 cells. What do the authors refer to? Do they refer to the fact that ALC1 deletion did not cause any alteration in cell growth, or to something else? Where are the data? The modest increase in pRPA in hTER-FT282 is statistically significant and not very different from that observed in UWB1.289, suggesting that ALC1 deletion could indeed impact normal cells. These data should be interpreted more conservatively.

      Figure 6. OS is a questionable endpoint in this heterogeneous patient population (treated in front line and at recurrence); in my opinion OS, much more than PFS, is influenced by the many different treatments these patients underwent. Why not consider PFS after or on PARPi treatment? The authors should clarify the patient population. Indeed, 48 patients were treated with PARPi and were platinum sensitive and possibly HRD. Which patients are the HRP patients? How many were there? It is not clear whether the HRP and high replication stress cohorts were treated with PARPi. How many of these were cyclin E amplified or had high cyclin E levels? Figure 6F should also include, besides UWB+BRCA1, other tumor cells with no cyclin E overexpression and no BRCA mutation or HRD.

      A discussion of limitations should be added to strengthen the manuscript.

      Significance

      The manuscript by Lindsey et al. explores the role of ALC1 (Amplified in Liver Cancer 1) loss in enhancing the sensitivity of PARPi in ovarian carcinomas, including BRCA1/2 mutated tumors (both sensitive and resistant to platinum) as well as cyclin E amplified settings. The data are interesting, but in some cases there is an overinterpretation of the results.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      ALC1 suppression has been shown to potentiate PARP inhibitor lethality in HR-deficient cells. Rather than revisiting the underlying mechanism, which has been characterized and remains an active area of investigation, this study aims to define the clinical contexts in which combined ALC1 and PARP inhibition may be beneficial. The clinical efficacy of PARP inhibitors, and their FDA approval, is largely restricted to HR-deficient tumors. This study dissects the combined effects of ALC1 and PARP suppression across a panel of HRD ovarian cancer cell lines, multiple classes of PARP inhibitor, and cells harboring distinct PARPi resistance mechanisms. In doing so, the authors delineate both the potential utility and the limitations of combined ALC1 and PARP inhibitor treatment in HRD ovarian cancers. The most impactful finding of the study, however, is likely the demonstration that ALC1 suppression sensitizes HR-proficient, CCNE1-amplified high-grade serous ovarian cancers to PARP inhibitors. These tumors are associated with particularly poor outcomes owing to the current absence of effective targeted therapies, making this observation of considerable clinical relevance. Of note, the study relies on genetic rather than pharmacological depletion of ALC1, a choice likely reflecting the current lack of a commercially available ALC1 inhibitor. While genetic suppression may not fully recapitulate the effects of combined drug treatment, it offers the advantage of not being tied to any specific compound, allowing the authors to establish more general principles. I have only a few comments.

      The effect of ALC1 KO on PARPi sensitivity is less pronounced in OVSAHO cells (BRCA2-mutated) than in BRCA1-mutated cells. In these cells, it looks like there is an additive effect rather than synergy.

      1. The authors should calculate, if possible, whether there is synergy or additive effect of ALC1-KO lethality (BLISS).
      2. Another BRCA2-mutated cell line should be included.
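
      As an illustration of the Bliss independence calculation requested in point 1, the following is a minimal sketch and not the authors' analysis. It assumes that the effect of each perturbation is expressed as a fractional loss of viability between 0 and 1; the function names and the example values are hypothetical.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected combined fractional effect if A and B act independently.

    Bliss independence: E_ab = E_a + E_b - E_a * E_b,
    where each effect is a fraction of cells killed (0 = none, 1 = all).
    """
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(effect_a: float, effect_b: float, observed_ab: float) -> float:
    """Observed minus expected combined effect.

    Positive values suggest synergy, values near zero an additive
    (independent) interaction, and negative values antagonism.
    """
    return observed_ab - bliss_expected(effect_a, effect_b)


if __name__ == "__main__":
    # Hypothetical fractional viability losses, for illustration only.
    alc1_ko_alone = 0.10   # ALC1 knockout without drug
    parpi_alone = 0.40     # PARP inhibitor in control cells
    combined = 0.70        # PARP inhibitor in ALC1 knockout cells

    expected = bliss_expected(alc1_ko_alone, parpi_alone)
    excess = bliss_excess(alc1_ko_alone, parpi_alone, combined)
    print(f"expected = {expected:.2f}, excess = {excess:.2f}")
```

      In practice the excess would be computed at each drug concentration across replicates, since a single positive value does not by itself establish synergy.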

      Minor comments:

      • Figure key is missing for S2C (I assume it's grey DMSO, blue olaparib)
      • Page 8: "BRCA1-mutant ovarian cancer cells eventually develop chemoresistance when exposed to PARPi for a prolonged period. Mechanistically, this is due to rewiring of ATR signaling, which enables RAD51 loading at DNA breaks and reversed forks independent of BRCA1 protein(25)." This sentence suggests this is the only existing resistance mechanism, which should be corrected. Modify to "mechanistically, this CAN be due to", or "this is OFTEN due to".

      Significance

      ALC1 inhibitors have been developed and clinical trials are starting. The significance of this manuscript lies in establishing the clinical potential for combined ALC1-PARP inhibition in high grade serous ovarian cancer. Especially, the authors demonstrate that combined ALC1 suppression with PARP inhibition efficiently kills HR-proficient CCNE1-amplified ovarian cancers, which represent 20% of ovarian cancers and are resistant to current therapies.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity:

      In this paper, Tomasek and colleagues describe a series of experiments illuminating the effects of OM-89, a bacterial lysate taken orally for prevention of recurrent UTI, on intracellular dynamics of UPEC, using cell culture and organoid models. Suggestions for improvement and for clarification of the authors' conclusions and relevance to human UTI (and OM-89 use) are offered below.

      Major points:

      1. The data indicate that OM-89 exposure in the organoids enhances lysosomal degradation pathways and (in mBOs) autophagic flux, and the authors conclude this is a mechanism by which UPEC regrowth after antibiotic treatment (modeling rUTI) is inhibited by OM-89. They also show enhanced cellular uptake of fluorescently labeled antibiotics (ampicillin) in organoids - this leads them to conclude (and state in the paper's title) that increased intracellular antibiotic concentration effects increased killing of UPEC and decreased regrowth. These are two separate proposed mechanisms, and especially with regard to the antibiotics, they have not shown that increased intracellular antibiotic concentration actually kills intracellular UPEC in their model - only that regrowth as measured microscopically is less. In total, a mechanistic connection between the observed lysosomal effect and the intracellular antibiotic uptake, and which one is more important for UPEC control in this model, is incomplete. The precise wording of the paper's title should be reconsidered accordingly.

      We agree with the point raised by the reviewer that we did not show a mechanistic connection between the observed lysosomal effect and the intracellular antibiotic uptake. Further experiments dissecting the exact mechanistic pathways driving both, either in conjunction or separately, would improve our understanding of how OM-89 leads to its positive effects. In future studies we will focus on dissecting the underlying pathways and determining whether a mechanistic connection between lysosomal degradation and enhanced intracellular antibiotic accumulation explains the observed positive effects of OM-89.

      Accordingly, we changed the title to "Targeted lysosomal activation in bladder epithelium enhances clearance of intracellular uropathogenic Escherichia coli". This revised title avoids implying a direct causal link between increased intracellular antibiotic accumulation and bacterial clearance, while still reflecting the central biological process identified in our study.

      Additionally, we incorporated changes in the introduction, as highlighted in our reply to point number one raised by reviewer number two.

      OM-89 is taken orally for rUTI prevention, and some "components" reach the urinary tract (line 81). But it isn't explained how applying OM-89 directly to organoids models how its components may reach the bladder epithelium (from the basolateral side, if the OM-89 is applied outside the organoids) in the whole animal or human. At the least, this limitation should be stated in the Discussion.

      We thank the reviewer for pointing out this limitation. Although advanced in vitro models help to better mimic the in vivo situation, they still do not fully recapitulate all aspects of drug exposure and delivery observed in vivo. We have now included the following statement of limitation in the discussion in lines 449-459: "One limitation of our study is that OM-89 was applied directly to epithelial cultures and organoids, whereas in clinical use it is administered orally. Although pharmacokinetic studies have demonstrated systemic distribution and urinary accumulation of OM-89-derived components following oral administration (van Dijk, 1982), our experimental setup does not recapitulate the exact route, kinetics or concentration profiles encountered in vivo. Rather, our models were designed to determine whether bladder epithelial cells are capable of responding directly to OM-89-mediated signals and to identify the intracellular pathways involved. Given the documented systemic exposure following oral administration, direct effects on the urothelium are biologically plausible. However, future studies will be required to determine how the epithelial responses identified here integrate with the complex systemic and immune-mediated effects of OM-89 under physiological administration conditions."

      In the lysosome studies starting on line 319, the cultured cells are all infected (and either treated with OM-89 or not). What observations regarding number and size of vesicles, etc (all the measures in Fig 6) are evident when cells are treated with OM-89 only? These data should be presented (at least as a supplemental figure) to enable optimal interpretation of the OM-89+UPEC data in Fig 6. As the authors themselves indicate, OM-89 may be having a generalized effect on endocytic and/or autophagic flux by bladder epithelial cells, independent of infection.

      We thank the reviewer for this suggestion and agree that evaluating OM-89 treatment in the absence of infection provides important context for interpreting the infection-associated phenotypes shown in Figure 6. Our original intention was to focus the main manuscript on the effects of OM-89 during UPEC infection, and we therefore did not include the corresponding uninfected conditions.

      As part of the planned revision, we will include additional supplementary data examining the effects of OM-89 alone in both murine and human bladder epithelial cells. Specifically, we will present analyses of Lamp1-positive lysosomal vesicles, lysosomal acidification (LysoSensor), and Cathepsin L activity under uninfected conditions. These experiments will allow readers to assess the extent to which OM-89 activates epithelial lysosomal pathways independently of infection and will provide important context for interpreting the infection-associated responses presented in the main figures.

      We agree with the reviewer that OM-89 may exert broader effects on epithelial lysosomal pathways beyond the setting of infection, and inclusion of these data will strengthen the interpretation of OM-89 as a direct modulator of epithelial antimicrobial function.

      With the organoids, beyond the microscopic quantification of UPEC, can CFUs be measured?

      We understand the reviewer's wish to see CFU measurements performed on organoids. However, this is subject to strong technical limitations, mainly because microinjection is tedious and technically challenging. Exactly the same number of organoids would need to be infected by microinjection in both conditions (OM-89 and control), and the injections would need to be extremely precise, with no bacteria spreading into the surrounding extracellular matrix. Frequently, the microneedle penetrates the organoid completely, so that bacteria are injected into the organoid wall rather than the lumen, or are even released on the other side of the organoid. Any bacteria escaping into the extracellular matrix would be collected when the organoids are recovered from the extracellular matrix domes, strongly affecting the CFU measurements.

      However, using differentiated monolayers of mouse bladder epithelial cells and performing a classic gentamicin protection assay would add an additional layer of information on the purely intracellular bacterial population, whilst overcoming the previously mentioned technical challenges. Therefore, we aim to perform CFU measurements on monolayers with and without OM-89 treatment to support our microscopic quantification and specifically be able to make a statement on reduced intracellular bacterial burden with OM-89 treatment. The CFUs will therefore provide an orthogonal measure of intracellular bacterial burden and complement the microscopy-based quantification during the infection and antibiotic-treatment phases.

      Adding to this point of the reviewer, we want to clarify that the higher-throughput microscopic quantification used in our approach (Thunder widefield microscope at 25x magnification) cannot distinguish between strictly intracellular and tissue-associated bacteria; hence we used the wording "intra-organoid" in our methods section. We have now also added this information to the results section for clarification (line 116): "Hence, the microscopy data represent the total "intra-organoid" bacterial burden at each experimental stage, without distinguishing the exact localization of the bacteria - which can be luminal, intracellular or tissue-associated." To further reflect this, we no longer refer to antibiotic-mediated "killing" and instead refer to antibiotic-mediated "clearance" or to reduced bacterial burden throughout the manuscript.

      Minor points:

      1. In Fig 1A, the "co-application" horizontal line is under the 7-10 hour window, but the text suggests that the application of antibiotics and OM-89 in this experiment is between 4-7 hours.

      We thank the reviewer for pointing this out. Indeed, in the co-application regime, OM-89 is added at the same timepoint as the antibiotic - meaning straight after monitoring the growth phase at 4h post-infection (pi). We now adapted the horizontal line for the "co-application" treatment in Figure 1A accordingly to represent the time-point of OM-89 addition better. Additionally, we added a line for the antibiotic-treatment in order to further facilitate readability.

      How are antibiotics and OM-89 "removed" at the 7-hour mark? This was not detailed in the Methods.

      Although we had specified this in the methods section at line 603 "For every media exchange (e.g. antibiotic treatment or withdrawal), each well was washed with 9 ml of the respective media before leaving 1 ml in the well.", we realized the positioning was not optimal as we had mentioned this part under the point "Bacterial injection" in "Injection experiments". We therefore now separated this part, together with the lid preparation, from the "Bacterial injection" part and created the new subsection "Lid preparation for media changes" (line 613 onwards).

      What time point was used for the transcriptomic profiling of organoids? This is not clear from the relevant Methods or Results sections.

      As stated in the methods section, RNA for transcriptomic profiling from mBOs was extracted at 4h post-infection (pi) (line 842).

      In showing that OM-89 "attenuated" the magnitude of inflammatory responses (Fig 2C and S3B), it would be helpful to add a panel showing the comparison of OM89+UPEC to PBS alone - this would be expected to convey activity (red) in the infection-related pathways, but to a lower magnitude than seen in UPEC vs PBS.

      We thank the reviewer for this suggestion, as well as comment number 5 below. We comment more on both suggestions below.

      Similarly, in the results outlined starting on line 196, it would be helpful to add a panel showing OM89+UPEC vs OM89 alone.

      We performed the requested combined GOBP analyses, and they confirm that infection-associated pathways remain strongly activated in OM89-treated infected organoids relative to baseline (PBS) controls and relative to OM89-treated uninfected organoids. These results confirm the reviewer's hypotheses and further confirm the results presented in Figure 2C. In fact, genes involved in detrimental effects of UPEC infection are induced to a lesser extent when organoids are exposed to OM-89 only.

      However, because the direct comparison between OM89+UPEC and PBS+UPEC already highlights the effect of OM-89 while controlling for the infection status, we believe our original analysis presented in Figure 2C remains the most informative representation of attenuation. Therefore, we will include the new comparison in the supplementary section of the manuscript.

      In line 236, what is meant by lysosomal "activation"? A more specific term should be chosen here.

      We thank the reviewer for this question and aim to increase readability of this section. With lysosomal activation in the first sentence of the mentioned paragraph, we referred to the observed effect of upregulated lysosomal pathways and altered lysosomal vesicles in the previous paragraph. However, to make the connection to the previous paragraph better, and given the comment number two of reviewer number two, we changed the whole first paragraph of this section. Therefore, the first sentence of this paragraph (line 235 onwards) reads now: "To test whether the observed effects on lysosomal pathways could mechanistically, at least in parts, explain OM-89-mediated protection, we first used Genebridge analysis (Li et al, 2019) to examine how the lysosomal gene signature identified in our RNA-seq data relates to host defense programs in the human bladder."

      In the Abstract (line 25), the phrase "Using bladder organoids..." is a dangling modifier.

      We thank the reviewer for pointing this out and changed the sentence accordingly to "In bladder organoids and differentiated epithelial monolayers, OM-89 promotes lysosomal acidification and increases lysosomal protease activity, driving intracellular UPEC toward degradative compartments."

      Typographical and copyediting:

      We thank the reviewer for pointing out the typographical errors below and we corrected them all.

      1. Line 74 should read "For instance..."

      2. Line 76 should read "when combined with antibiotic therapy..."

      As this sentence is to emphasize the already observed protective effects of OM-89, and the two studies mentioned were either performed without or in combination with antibiotics, we changed the sentence to "For instance, rodent infection studies have demonstrated protective effects of OM-89 alone (Bosch et al, 1988; Lee et al, 2006) and in combination with antibiotic therapy (Canton et al, 2025; Bessler et al, 2010), although this observed in vivo protection could not be linked to any major quantitative changes in bladder immune cell infiltration (Canton et al, 2025), leaving the underlying molecular mechanism not fully resolved." for better readability.

      Line 122 should read "...regrowth following antibiotic treatment" or "regrowth post-antibiotic treatment"

      Line 138 should use "regimen" not "regime"

      Line 196 delete comma after "Although"

      Line 244 fully hyphenate "OM-89-mediated"

      Line 374 should read "...significantly enhance antibiotic-mediated killing"


      Significance:

      The paper is very well written and though a lot of data are included, the presentation is excellent and helps the reader to follow the story. The paper makes a strong contribution to the UTI pathogenesis field, and the use of mouse and human bladder organoids is innovative in studying intracellular UPEC. My scientific expertise as a reviewer is in UPEC pathogenesis, directly relevant to the content of this paper.


      Reviewer #2


      Evidence, reproducibility and clarity:

      This study examined the effect of OM-89 on UPEC infection, antibiotic clearance, and resurgence in mouse and human organoid models. The goal of the study was to understand the molecular mechanisms by which OM-89 is effective at preventing rUTI in patients.

      Major comments:

      The manuscript is well-written and the figures are well presented. Adequate background information is provided to give the study context and sufficient experimental details are provided to allow replication by other groups. Experiments contain appropriate controls and sufficient replicates to allow appropriate statistical analyses. The authors are careful to acknowledge the differences they observed between the mouse and human system and provide satisfactory potential explanations for these differences. The conclusions they draw are well supported by their data and none of their claims from their data are overstatements. Below are some points which, if addressed, I believe could improve the paper.

      1. I think the authors overstate the novelty of the concept that the urothelium is an active targetable determinant of infection and treatment outcomes. This is not an entirely new concept since previous studies have examined antimicrobial peptides and other factors from the urothelium.

      We thank the reviewer for this important point and agree that the urothelium has long been recognized as an active participant in host defense through mechanisms such as antimicrobial peptide production, pathogen sensing and regulation of inflammatory responses. We have therefore revised the manuscript to avoid implying that urothelial involvement in infection outcome is itself a novel concept. Instead, we now emphasize the specific advance of our study: the identification of lysosome-centered epithelial activation as a therapeutically targetable mechanism that enhances intracellular bacterial clearance and potentiates antibiotic efficacy.

      In the abstract we changed: "Our findings position the bladder epithelium from a passive barrier to an active, targetable determinant of treatment outcome and suggest host-directed modulation of epithelial antimicrobial pathways as a promising strategy to enhance intracellular bacterial clearance." to "Our findings demonstrate that bladder epithelial antimicrobial pathways can be pharmacologically reinforced to influence treatment outcomes by enhancing intracellular bacterial clearance." in line 30.

      In the introduction we changed: "Together with increased intracellular accumulation of antibiotics across different classes, this leads to improved intracellular killing and reduced bacterial regrowth across diverse UPEC strains." to "Together with increased intracellular accumulation of antibiotics across different classes, this leads to improved intracellular clearance and reduced bacterial regrowth across diverse UPEC strains." in line 91 and "Together, these findings reveal a previously unrecognized epithelial lysosome-centered mechanism by which OM-89 enhances intracellular antibiotic performance and repositions the bladder epithelium from a passive reservoir of infection reactivation to an actively transformable antimicrobial compartment influencing treatment outcomes." to "Together, these findings reveal a previously unrecognized epithelial-centered mechanism by which OM-89 enhances intracellular antibiotic performance and establishes lysosomal activation as a therapeutically targetable component of epithelial host defense against intracellular UPEC." in line 96.

      In the discussion we changed: "Together, these findings provide a mechanistic framework for the long-observed clinical efficacy of OM-89. Our findings reveal that the urothelium itself can be therapeutically targeted to reduce pathogen regrowth by transforming the epithelial barrier from a passive refuge for UPEC into an active defense site." to "Together, these findings provide a mechanistic framework for the long-observed clinical efficacy of OM-89 and identify epithelial lysosomal pathways as a therapeutically targetable component of host defense that can be used to improve intracellular bacterial clearance." in line 398 and "In the face of rising antimicrobial resistance (2024), strengthening epithelial antimicrobial function offers a complementary route to shift the bladder mucosa from a passive niche of bacterial survival and infection reactivation toward an active site of accelerated pathogen clearance." to "In the face of rising antimicrobial resistance (2024), our findings provide a mechanistic rationale for the clinical use of OM-89 and support epithelial lysosomal pathways as a promising target for host-directed therapeutic strategies that enhance intracellular bacterial clearance and improve the efficacy of existing antibiotics." in line 469.

      Depending on the target audience, the Module-Module association analysis could need more introduction. I am not a computational biologist and it was not obviously apparent how Figure 4A is generated and what it is actually showing. How specifically does this analysis demonstrate a functional link between lysosomal activity and immune defense pathways? Without further explanation, it is my opinion that this figure panel is an unnecessary distraction that is not required for any of the conclusions that the group can already draw from the rest of their data.

      We thank the reviewer for this constructive critique. We agree that the rationale and interpretation of this analysis were not sufficiently explained in the original manuscript. We have therefore expanded the description of the MMAS approach and clarified how these data support the translational relevance of the lysosomal pathways identified in our experimental models. We also agree that for a broader biological audience, the computational framework and the strategic necessity of Figure 4A required a clearer introduction and stronger justification.

      To address the reviewer's concerns, we have thoroughly revised the text (lines 235-250) to clarify the methodology and emphasize the essential translational value this analysis adds to our study:

      • How the figure is generated and what it shows: We have added explicit language clarifying that we used a computational Module-Module Association Score (MMAS) to evaluate the transcriptional correlation between the lysosomal gene network and functional biological pathways. Rather than relying on a single experimental dataset, this analysis compiles data across eight independent human bladder transcriptomic datasets encompassing over 1,400 clinical samples.
      • Demonstrating the link to immune pathways: We have explicitly named the specific host defense modules highlighted in Figure 4A, namely "Response to molecule of bacterial origin", "cell activation involved in immune response", and "innate immune response" to guide the reader directly to the strong positive correlations shown in the panel.
      • Justifying its inclusion (mouse-to-human translational bridge): While the rest of our data characterizes the cellular mechanics of OM-89 in murine organoids and cell culture, Figure 4A demonstrates that the link between lysosomal activity and bacterial defense is a conserved feature of bladder tissue biology across species. This cross-species alignment (our mouse data at this stage of the manuscript compared to human-derived data) provides critical clinical justification for targeting epithelial lysosomal pathways as a therapeutic strategy in human patients.

      The new paragraph reads as follows: "To test whether the observed effects on lysosomal pathways could mechanistically, at least in part, explain OM-89-mediated protection, we first used Genebridge analysis (Li et al., 2019) to examine how the lysosomal gene signature identified in our RNA-seq data relates to host defense programs in the human bladder. To evaluate the translational relevance of our experimental findings, we used a computational Module-Module Association Score (MMAS) analysis across eight independent human bladder transcriptomic datasets comprising over 1,400 clinical samples. This network-based approach evaluates the transcriptional correlation between the lysosomal gene network and functional biological pathways across diverse human cohorts. Module-Module association analysis performed on these human bladder datasets indicated that the lysosome module has strong positive associations with specific host defense modules, including "response to molecule of bacterial origin", "cell activation involved in immune response", and "innate immune response" (Figure 4A), highlighting a conserved functional link between lysosomal activity and immune defense pathways in the bladder epithelium. Altogether, these positive correlations suggest that lysosomal activation represents a conserved pathway integrated within mucosal immunity across species, rather than an isolated cellular response unique to our experimental models."

      Significance:

      General assessment: Solid experimental design with appropriate controls. Appropriate statistical rigor. Conclusions justified by the data. Limitations acknowledged. Differences in results between mice and humans acknowledged.

      Advance: Moderate technical advance building on prior organoid models. Significant mechanistic advance because OM-89 has been widely used for a long time without detailed understanding of why it works. Moderate conceptual advance that urothelial cells are a targetable determinant of treatment outcomes.

      Audience: I am a basic science researcher in the field of female urogenital tract microbiome and infections. Other researchers studying UTI will certainly be interested in this study. It also may be of interest to people studying other bladder conditions that involve the urothelium (bladder cancer).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study examined the effect of OM-89 on UPEC infection, antibiotic clearance, and resurgence in mouse and human organoid models. The goal of the study was to understand the molecular mechanisms by which OM-89 is effective at preventing rUTI in patients.

      Major comments:

      The manuscript is well-written and the figures are well presented. Adequate background information is provided to give the study context and sufficient experimental details are provided to allow replication by other groups. Experiments contain appropriate controls and sufficient replicates to allow appropriate statistical analyses. The authors are careful to acknowledge the differences they observed between the mouse and human system and provide satisfactory potential explanations for these differences. The conclusions they draw are well supported by their data and none of their claims from their data are overstatements. Below are some points that, if addressed, I believe could improve the paper.

      1. I think the authors overstate the novelty of the concept that the urothelium is an active targetable determinant of infection and treatment outcomes. This is not an entirely new concept since previous studies have examined antimicrobial peptides and other factors from the urothelium.
      2. Depending on the target audience, the Module-Module association analysis could need more introduction. I am not a computational biologist and it was not obviously apparent how Figure 4A is generated and what it is actually showing. How specifically does this analysis demonstrate a functional link between lysosomal activity and immune defense pathways? Without further explanation, it is my opinion that this figure panel is an unnecessary distraction that is not required for any of the conclusions that the group can already draw from the rest of their data.

      Significance

      General assessment: The manuscript has several methodological strengths. These include the use of both mouse and human urothelial models, inclusion of appropriate controls, and sufficient replicates to ensure reproducibility. The statistical methods employed were appropriate. No major methodological weaknesses were identified. The descriptions of methods provide sufficient experimental details to allow the experiments to be reproduced by other labs. The authors did a nice job interpreting their data in light of previous literature. They did not overstate the magnitude or significance of their findings and were careful to acknowledge the limitations in their study design.

      Advance: Moderate technical advance building on prior organoid models. Significant mechanistic advance because OM-89 has been widely used for a long time without detailed understanding of why it works. Moderate conceptual advance that urothelial cells are a targetable determinant of treatment outcomes.

      Audience: I am a basic science researcher in the field of female urogenital tract microbiome and infections. Other researchers studying UTI will certainly be interested in this study. It also may be of interest to people studying other bladder conditions that involve the urothelium (bladder cancer).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this paper, Tomasek and colleagues describe a series of experiments illuminating the effects of OM-89, a bacterial lysate taken orally for prevention of recurrent UTI, on intracellular dynamics of UPEC, using cell culture and organoid models. Suggestions for improvement and for clarification of the authors' conclusions and relevance to human UTI (and OM-89 use) are offered below.

      Major points:

      1. The data indicate that OM-89 exposure in the organoids enhances lysosomal degradation pathways and (in mBOs) autophagic flux, and the authors conclude this is a mechanism by which UPEC regrowth after antibiotic treatment (modeling rUTI) is inhibited by OM-89. They also show enhanced cellular uptake of fluorescently labeled antibiotics (ampicillin) in organoids - this leads them to conclude (and state in the paper's title) that increased intracellular antibiotic concentration effects increased killing of UPEC and decreased regrowth. These are two separate proposed mechanisms, and especially with regard to the antibiotics, they have not shown that increased intracellular antibiotic concentration actually kills intracellular UPEC in their model - only that regrowth as measured microscopically is less. In total, a mechanistic connection between the observed lysosomal effect and the intracellular antibiotic uptake, and which one is more important for UPEC control in this model, is incomplete. The precise wording of the paper's title should be reconsidered accordingly.
      2. OM-89 is taken orally for rUTI prevention, and some "components" reach the urinary tract (line 81). But it isn't explained how applying OM-89 directly to organoids models how its components may reach the bladder epithelium (from the basolateral side, if the OM-89 is applied outside the organoids) in the whole animal or human. At the least, this limitation should be stated in the Discussion.
      3. In the lysosome studies starting on line 319, the cultured cells are all infected (and either treated with OM-89 or not). What observations regarding number and size of vesicles, etc (all the measures in Fig 6) are evident when cells are treated with OM-89 only? These data should be presented (at least as a supplemental figure) to enable optimal interpretation of the OM-89+UPEC data in Fig 6. As the authors themselves indicate, OM-89 may be having a generalized effect on endocytic and/or autophagic flux by bladder epithelial cells, independent of infection.
      4. With the organoids, beyond the microscopic quantification of UPEC, can CFUs be measured?

      Minor points:

      1. In Fig 1A, the "co-application" horizontal line is under the 7-10 hour window, but the text suggests that the application of antibiotics and OM-89 in this experiment is between 4-7 hours.
      2. How are antibiotics and OM-89 "removed" at the 7-hour mark? This was not detailed in the Methods.
      3. What time point was used for the transcriptomic profiling of organoids? This is not clear from the relevant Methods or Results sections.
      4. In showing that OM-89 "attenuated" the magnitude of inflammatory responses (Fig 2C and S3B), it would be helpful to add a panel showing the comparison of OM89+UPEC to PBS alone - this would be expected to convey activity (red) in the infection-related pathways, but to a lower magnitude than seen in UPEC vs PBS.
      5. Similarly, in the results outlined starting on line 196, it would be helpful to add a panel showing OM89+UPEC vs OM89 alone.
      6. In line 236, what is meant by lysosomal "activation"? A more specific term should be chosen here.
      7. In the Abstract (line 25), the phrase "Using bladder organoids..." is a dangling modifier.

      Typographical and copyediting:

      1. Line 74 should read "For instance..."
      2. Line 76 should read "when combined with antibiotic therapy..."
      3. Line 122 should read "...regrowth following antibiotic treatment" or "regrowth post-antibiotic treatment"
      4. Line 138 should use "regimen" not "regime"
      5. Line 196 delete comma after "Although"
      6. Line 244 fully hyphenate "OM-89-mediated"
      7. Line 374 should read "...significantly enhance antibiotic-mediated killing"

      Significance

      The paper is very well written and though a lot of data are included, the presentation is excellent and helps the reader to follow the story. The paper makes a strong contribution to the UTI pathogenesis field, and the use of mouse and human bladder organoids is innovative in studying intracellular UPEC. My scientific expertise as a reviewer is in UPEC pathogenesis, directly relevant to the content of this paper.

  2. Jun 2026
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility, and clarity

      Summary: Edvalson and colleagues use transcriptomics, cell biology and genetics to study variation between segregation distorter (meiotic drive) strains and find several important results. These include apparent suppression of small RNAs mapping to responder (the drive target) in one of the lines, a general pattern of differential expression consistent with the drive mechanism being upstream of sperm individualization (where defects have been seen previously), and genetic confirmation that perturbing Rsp expression can influence the strength of drive.

      Major comments: I found the total RNA sequencing experiment a bit oddly presented. This is partly because it was in the middle of the results (might fit better first), partly because few specific genes were discussed (this might be appropriate given the question, but maybe the question should be more clearly stated), and partly because of the complexity of the approach (WGCNA + PANGEA) and how it all fits together. I suggest working to clarify the main points of this section (which are a bit different than the main focus of the Rsp work).

      We thank the reviewer for these important points. We liked the suggestion to swap the order of our results. We attempted the change, but we found that we weren't able to make the flow of the results much better. Instead, we primed the transition from smRNA to totRNA in the last paragraph of the smRNA results (lines 190-196). This paragraph now reads:

      The dearth of Rsp smRNAs in SD-Mad heterozygotes could be due to a disruption in transcription of the locus or subsequent processing steps. Many factors can influence piRNA production. For example, the piRNA pathway can amplify piRNAs independently of transcription, such as through the ping pong cycle (Czech and Hannon 2016). Notably, Rsp piRNAs do not have a strong ping pong signature in testes (Wei et al. 2021; Chen et al. 2021a). To distinguish between a disruption in transcription or some downstream process, we examined total RNA.

      The main reason we elected to describe patterns rather than specific genes is that the 2nd chromosomes we tested (R-16, SD-Mad, SD-5) have all diverged from each other and any single differentially expressed gene could be due to differences in genetic background. Therefore, we elected to point out more broad systematic changes in pathways and correlated gene networks rather than specific genes. We have made it more obvious throughout the total RNA section in the text what our question is regarding the transcriptome and the reasoning for using WGCNA and gene set analysis.

      We also appreciate the reviewer's point that the complex approach we used to extract changes in pathways and networks is difficult to follow. We have modified our wording to better describe the flow of analyses.

      We also note that we have extended our analysis for the comparison of SD-Mad and SD-MadRev, which only differ by the Sd-RanGAP locus. Here we do discuss individual genes that are differentially expressed. See below for details about this new analysis.

      Minor comments:

      Abstract - Probably worth mentioning Sd-RanGAP here, even if you are using it as a straw man. I agree that the specific mechanism is not known, but some of the genetics are established.

      This is a good point. While our study doesn't address RanGAP, it is important to point out that, although its role in drive is unclear, Sd-RanGAP is a necessary component of the system. We added the following language to the abstract:

      SD is a multigene complex, frequently associated with chromosomal inversions, where the main driver locus, a truncated duplication of the gene RanGAP, kills wild-type sperm containing a satellite DNA called Responder (Rsp).

      Line 80 and elsewhere - it would be helpful to be specific here - you are looking at both small and total RNA

      We've modified our wording throughout the manuscript to specify when we are referring to total RNA and small RNA.

      Fig 1B - is there a reason not to show the values of the replicates here? It would be more transparent.

      We thank the reviewer for this comment. We replaced Fig 1B with a chart that is computed from the DESeq2 normalized counts for each comparison and added replicates to all related graphs.

      Line 139 - does the experimental design control for 1.688 genomic copy number? Where is it located?

      We indeed control for the 1.688 copy number here. Most 1.688 repeats are found on the X chromosome and all flies in our experiments have identical X chromosomes. We changed the text to specify that the copy number for 1.688 is the same between conditions.

      144-146 - this could be written more clearly, and I think it should only refer to 1C, not 1B. Part of the issue is that there are several repeats not discussed, and it isn't clear what is happening with them. I suggest expanding this description so it is clearer.

      Thank you for this feedback. We have expanded the description to make this section clearer.

      Line 161 - what do you mean (specifically) by "repetitive loci"?

      Repetitive loci in this case refers to transposons, satellite DNAs (except simple satellites), and piRNA clusters. To explain what is included in the grouping of "repetitive loci", we have added the following sentence to the text:

      Our results demonstrate that SD-Mad and SD-5 haplotypes, despite sharing the same main drive locus, have different effects on smRNAs derived from repetitive loci such as complex satellites (including Rsp), transposable elements, and piRNA clusters.

      193-203 - This is an important finding that is somewhat lost in trying to keep track of WGCNA and PANGEA and the different Modules. I suggest clarifying to drive home the point that differential expression appears to start prior to individualization, which suggests an earlier mechanism of drive.

      We thank the reviewer for this feedback. We have added wording to our discussion that points out this finding in lines 501-505, which read:

      We suspect that the timing of the proximal cause of SD-mediated drive may align with early spermatogenetic processes; perhaps where cell cycle-related genes are active and appear to be broadly differentially expressed (Figure 2B, Module H). This earlier timing is consistent with temperature shift experiments that place the critical period for SD at or before meiosis (Mange 1968).

      Fig 3B & 3C, Fig 4 - same as 1B, is there a reason not to show the actual data points?

      A similar issue was brought up earlier. In response, we modified all our figures to show replicate points where applicable.

      Line ~245 - was the same experiment done with SD-5? (as you do below for Rsp overexpression)

      We originally did not include SD5 in this experiment, but we have since measured drive strength of SD5 in a kipfKO background. We found a small but statistically significant difference in drive strength. We added the new SD5 results to the figure and moved the kipfKD data to the supplement along with some added data on a Rsp deletion line generated from Iso1 that bolsters our confidence in the SDMad results.

      Significance

      This is a strong paper that moves the field forward, even if it leaves questions still to be answered (why the difference between drivers? what is the mechanism? how is rsp interacting with drive?).

      Several findings move the field forward: the Rsp small RNA results, the differential expression hinting at a molecular mechanism that is upstream of sperm individualization.

      The audience is moderately broad. Genetic conflict is gaining in general interest, but aspects of this will be mostly interesting to the hardcore drive crowd.

      Reviewer #2

      Evidence, reproducibility and clarity

      I have only one request: I found it unclear whether the authors were referring to small RNAs or their precursor (long RNA). By reading the text carefully, I could deduce that Fig1A/Table S2 represent the small RNA sequencing, while FigS3A represents total RNA seq (detecting precursor). However, the labeling in the Fig1A and Table S2 only says 'piRNA cluster' or 'Rsp' (without clarifying 'piRNA from piRNA cluster' or 'piRNA from Rsp'), and it took quite some time for me to understand which Fig/data is smallRNA vs. longRNA.

      This is helpful feedback. We have added more clarity to which type of RNA is being represented in our figures throughout.

      Significance

      This manuscript by Edvalson et al. describes their study on SD (segregation distorter) meiotic drive system, examining the role of piRNA derived from Rsp satellite. Although the exact mechanism of drive is still unknown, this study represents a significant step forward in understanding SD-mediated drive.

      By using two SD alleles (SD-5 and SD-Mad), they show that Rsp-derived piRNA is depleted in SD-Mad. The authors used total RNA sequencing/small RNA sequencing, mutants, and carefully designed controls (such as deletion of Sd-RanGAP) to reach the model that Rsp-derived piRNA is involved in SD-Mad-mediated drive. The result that kipferl depletion (that leads to sat DNA expression) rescues SD-Mad's drive phenotype is very interesting. This supports that the decreased Rsp piRNA indeed corresponds to SD-Mad-mediated drive. They further back up this idea by overexpressing Rsp.

      Interestingly, SD-5 was not impacted by changes in Rsp expression. Based on this result, the authors state that there are mechanistic variations in the same (SD) drive system. This statement is certainly justified by the data, but I cannot help wondering whether there might be a unifying mechanism that explains both SD-5 and SD-Mad. I am not suggesting to edit the manuscript or add the discussion: but do they have any speculations on this? For example, SD-5 is simply epistatic to Rsp piRNA production? For example, SD-RanGAP > SD-Mad (some gene on SD-Mad inversion) > Rsp piRNA production > SD-5 > sperm killing?

      We thank the reviewer for this insight. We indeed think that the proximal cause of sperm dysfunction could be the same, but there are components of SD5 that act downstream of Rsp piRNAs. The small difference in drive strength in the SD5 KipfKO experiments might support this hypothesis, although it is also possible instead that drive is influenced by changes in some other piRNAs (from the piRNA clusters or satellites).

      We modified our wording in the first paragraph of the discussion to point out this possibility. Lines 367-370 now read:

      These results suggest that, while SD chromosomes share a target and main drive locus (Sd-RanGAP), the modifiers accumulated on each haplotype may influence the drive mechanisms, either by creating new pathways to drive or acting as tuning knobs on drive strength.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      In the presented manuscript Edvalson and Wei et al use Drosophila genetics and NGS experiments to investigate the mechanism of meiotic drive through the Segregation Distorter (SD) system. They reveal that two driving haplotypes seem to function via different mechanisms, with drive through SD-Mad but not SD-5 involving small RNAs produced from the Responder (Rsp) satellite, the target of SD drive. SD-Mad testes displaying drive are characterized by lower levels of Rsp sRNAs compared to non-drive controls as well as SD-5, and the ectopic overexpression of Rsp sRNAs through two distinct mechanisms decreases drive in the SD-Mad genetic background, specifically. With this work, the authors are adding an important piece of information to the highly complex SD system, indicating that sperm killing is likely achieved by different mechanisms in different SD haplotypes, despite sharing a common driver.

      Major comments

      Fig1C: It might be interesting to show the fold change between SD-Mad and SD-MadRev in addition to what is displayed. Moreover, can the authors comment on what might be causing the increased smRNA counts for 38C2? Is this because R16 has particularly low 38C2 values?

      We appreciate the reviewer's comment concerning the fold change between SD-Mad and SD-MadRev. We have made a figure showing the difference between them and placed it in Figure S1.

      We suspect that the expression difference in 38C2 between the R16 heterozygotes and SD heterozygotes may be due to genetic divergence, since these are different 2nd chromosomes. We have added language pointing this out to the manuscript in line 182. The paper now reads:

      There is no evidence that either 38C2 or Flamenco are involved in SD-mediated drive.

      Fig1/S1: Could the authors also display the Rsp smRNA counts for all Gla crosses similar to panel 1B? What is the interpretation for the increase in Rsp smRNAs in SD-5/Gla relative to R16/Gla but the lack of such an increase in the SD-5/iso1 vs R16/iso1 comparison? Do SD-Mad and SD-5 induce the same strength of drive against each of the two wildtype chromosomes? Experiments: smRNAseq for SD-MadRev/Gla.

      We have added a plot to Fig S1 to show the abundance of Rsp small RNAs in the Gla background, similar to Figure 1.

      It is difficult to interpret the apparent overabundance of Rsp small RNAs in the SD-5/Gla background. Because differences in Rsp smRNA abundance for SD-5 are inconsistent between the Iso1 and Gla backgrounds, our interpretation is that SD-5 is not manipulating Rsp levels. The apparent overabundance of Rsp in the Gla background could be due to an epistatic interaction between Rsp and other components of that particular background. Consistent with this interpretation, the SD-Mad-induced reduction of Rsp smRNAs in the Gla background is less dramatic than in the Iso1 background, suggesting that something about the Gla background is increasing Rsp expression slightly when paired with an SD chromosome.

      Fig1: The authors note changes in smRNA levels for other satellites as well as piRNA clusters but do not give any interpretation to this observation. Are they meaningful? Should they be attributed to genetic background?

      Our interpretation of the observation that some satellites or piRNA clusters are differentially expressed is that these differences are likely due to epistatic effects from the different 2nd chromosomes used in the study or are incidental to the mechanism of SD.

      FigS2: Same question also for the deregulated TEs: do they share sequence features with Rsp or are they overrepresented in the clusters that change? Are these explained by differences in insertions between genotypes? Do their total RNAseq values change in any way? What do the percentages in line 162 correspond to? Number of TEs that are deregulated? At which cutoff? It might be informative to compare the data to a cross between driver and R16, or even better the SD-MadRev control. Experiments: totRNAseq for SD-MadRev crosses and optionally crosses to R16.

      The Rsp repeat unit does not share significant homology to portions of the genome outside of the pericentromere of 2R with the exception of ~6-12 copies in the intron of Ago3.

      As far as TEs are concerned, we surprisingly don't see a strong correlation between piRNA cluster content, dysregulation, and TE transcript abundance. For example, in the SD/Gla backgrounds the total RNA for R1, R2, IGS, and Tc1-Mariner family TEs is downregulated. However, the only major piRNA cluster that is upregulated in both SD/Gla backgrounds (80F) is not enriched for TE fragments matching any of those 4 families. One thing we can note is that the major piRNA clusters are defined in relation to the Iso1 genome, which may differ from that of our experimental backgrounds. Without long-read resolved genomes for our specific experimental lines, generated at the same time as the RNA samples, it is difficult to determine how expression at the major piRNA clusters and the corresponding TEs are related. We have described this lack of a correlation in lines 210-217 in the text along with our interpretation for why this could be. The paper now reads:

      On the other hand, we did find some differences in repetitive elements related to rDNA (R1, R2, and IGS) and Tc1-Mariner family TEs (all backgrounds; Figure S6). Interestingly, there was no correlation between the expression of TEs and the expression of piRNA clusters that contain fragments of these TEs in the total RNA, nor was there any correlation between the small RNAs from piRNA clusters and the total RNAs for those TEs. PiRNA clusters are usually defined in one isolate of Iso1: rapid turnover of TEs and piRNA sources could explain why we do not see a correlation between piRNA cluster expression and TE expression in our backgrounds.

      We investigated differences in TE and piRNA cluster expression in our SD-Mad/Iso1 vs SD-MadRev/Iso1 comparison, but a lack of power due to inter-sample variation prevents us from confidently making any assessments on any TEs or piRNA clusters in that comparison. We did, however, generate additional gene-level transcriptomic data using 3' Digital Gene Expression to bolster our confidence in the totRNA data and found some interesting genes among the most differentially expressed. We have noted those genes in lines 276-287, which read:

      To identify genes that might interact to cause drive, we compared the gene expression of SD-Mad/Iso1 to SD-MadRev/Iso1. These genotypes only differ by the presence of the main drive locus, Sd-RanGAP. We performed both totRNA and 3' Digital Gene Expression (DGE) RNA sequencing and examined the overlap in differential expression between the totRNA and DGE sequencing. There are 69 differentially expressed genes where the DGE comparison is significant (PDGE ≤ 0.01), and the sign of the Log2FC of the totRNA matches that of the DGE. Among this set of differentially expressed genes, 57 show at least a 50% difference in gene expression (absolute Log2FC value of at least 0.58 in DGE). These genes are not enriched in any Reactome gene sets. The top 20 most differentially expressed genes consist of 9 lncRNAs (3 antisense RNAs) and 11 protein-coding genes, 8 of which are uncharacterized. The 3 characterized genes are Artemis (Arts), Gr61a, and Tono (Figure S98, Supplemental File 1).
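
      The selection logic described in this paragraph can be sketched roughly as follows. This is only an illustration under stated assumptions: the gene names and values are made up, the function names are hypothetical, and the dictionary layout is not the authors' actual pipeline. The sketch only shows how a DGE significance cutoff, a sign-concordance check against totRNA, and an absolute Log2FC cutoff of 0.58 (close to log2(1.5), i.e. about a 50% increase) might be combined.

```python
import math

# Thresholds quoted in the text; everything else below is illustrative.
P_MAX = 0.01            # DGE significance cutoff
MIN_ABS_LOG2FC = 0.58   # close to log2(1.5), i.e. about a 50% increase


def concordant_genes(dge, tot, p_max=P_MAX):
    """Genes significant in DGE whose Log2FC sign matches the totRNA Log2FC.

    dge maps gene -> (log2fc, p_value); tot maps gene -> log2fc.
    Both structures are hypothetical stand-ins for real result tables.
    """
    hits = []
    for gene, (log2fc_dge, p_dge) in dge.items():
        if gene not in tot or p_dge > p_max:
            continue
        if log2fc_dge * tot[gene] > 0:  # same non-zero sign in both assays
            hits.append(gene)
    return sorted(hits)


def strong_genes(dge, genes, min_abs=MIN_ABS_LOG2FC):
    """Subset of genes whose absolute DGE Log2FC reaches the cutoff."""
    return [g for g in genes if abs(dge[g][0]) >= min_abs]


# Made-up example values (not data from the study).
dge = {
    "geneA": (1.7, 0.001),
    "geneB": (-0.6, 0.005),
    "geneC": (0.3, 0.002),   # significant but small effect
    "geneD": (2.0, 0.2),     # large effect but not significant
    "geneE": (1.0, 0.001),   # sign disagrees with totRNA
}
tot = {"geneA": 1.2, "geneB": -0.4, "geneC": 0.1, "geneD": 1.5, "geneE": -0.8}

concordant = concordant_genes(dge, tot)
strong = strong_genes(dge, concordant)
cutoff_check = round(math.log2(1.5), 3)

print(concordant)
print(strong)
print(cutoff_check)
```

      Requiring the sign to agree between the two assays, rather than requiring significance in both, keeps the DGE data as the statistical filter while using totRNA only as a direction check, which is how the paragraph above describes the overlap.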

      We discuss two of these genes in further detail in the discussion in lines 476-486 which read:

      First, Tono, a BTB zinc finger-containing transcription factor is upregulated (Log2FCDGE = 1.7) in all SD-Mad comparisons. Tono plays a role in regulating transcription in muscle cells in response to mechanical pressure (Zhang et al. 2024) but also shows enrichment in male germ cells (Li et al. 2022). The putative DNA-binding capacity and ability to form nuclear condensates (Zhang et al. 2024) makes this an interesting candidate gene for interacting with the Rsp satellite. Second, the importin-4 ortholog, Artemis (Arts), which facilitates Ran-mediated import of H3 and H4 is overexpressed in SD-Mad (Log2FCDGE = 2.5). Interestingly, Arts expression is antagonistic to male fertility (VanKuren and Long 2018). Also of note, Apollo, a duplicate of Arts which supports male fertility (VanKuren and Long 2018) is downregulated (Log2FCDGE = -0.6) though it is not in the top-most differentially expressed genes.

      Figure S3: Am I reading the PCA plots right in that there are very few gene expression changes when the drivers are in iso1 background but much more in the Gla background? Comment on possible explanations for that. Please indicate the number of significantly changed genes in each comparison. Again, are these changes correlated between the two drivers or can they be attributed to genetic background of Gla vs R16? Would it be interesting to see how SD-Mad/Gla and SD-5/Gla gene expression profiles compare? Experiment: totRNAseq for SD-MadRev crosses.

      There did tend to be more differences in the Gla background compared to Iso1. This difference can best be explained by inter-sample variation in the SD-Mad/Iso1 background which we see in the PCA plot in Fig S4A. Another reason for the difference could be that the Gla and Iso1 chromosomes are very different from each other which prevents us from making any 1-to-1 comparisons between the SD/Iso1 and SD/Gla backgrounds. We generally avoid comparing between genetic backgrounds for this reason unless they share differences as these are more likely related to drive.

      In Figure S5A it seems that totalRNA levels of Rsp are strongly increased in SD-Mad/Gla but not in SD-Mad/iso1. The iso comparison (less piRNAs but same transcript) could indicate that it is actually transcription of the Rsp that is affected here. This is even pointed out in line 205 without discussion of the fact that the Gla comparison (less piRNAs but more transcript) would rather indicate that transcription is intact, but processing into piRNAs is defective. Could this be clarified using FISH as in Figure S8? If true, SD-Mad/Gla should have much more FISH signal than SD-Mad/iso1. Either way, this discrepancy should be further discussed. Experiments: comprehensive smFISH panel for all crosses (including SD-MadRev).

      The reviewer makes an excellent point. Why would Rsp long RNAs be overexpressed in the SD-Mad/Gla background? Earlier we noted that in the Gla background specifically the genotypes that contain an SD chromosome seem to have a higher level of Rsp small RNAs than we might expect given our Iso1 results. We conclude that this is likely due to an epistatic interaction between the 2nd chromosomes used in the study and the rest of the chromosomes. This interpretation could extend to the long noncoding precursors as well.

      Further, although the difference between SD-Mad/Gla is significant and SD-5/Gla is not, they do move in the same direction. This is also true in the Iso1 backgrounds but in the opposite direction. Given an interpretation that Rsp expression is higher than expected in the SD/Gla background due to epistatic effects, it becomes clearer that changes in long RNA abundance are related to changes in small RNA abundance though not perfectly indicative. However, due to lower count levels for Rsp in the totRNA, we do not have the power to confidently draw that conclusion.

      In general, the totRNA profiles of repeats don't seem to correlate well between the genotypes (iso vs Gla crosses, neither for SD-5 nor for SD-Mad). Is this because values are in general small and/or replicates don't correlate? Should these data even be considered? Also panels 2A and S5C are very different from each other. The additional comparison with the SD-MadRev allele crossed into both Iso1 and Gla should give additional insight. Experiment: totRNAseq for SD-MadRev crosses.

      The reviewer brings up a good point. While some repetitive elements had relatively small counts in the totRNA (like Rsp) most had adequately high counts. But these differences are to some degree expected. Although the other chromosomes are controlled for, the second chromosomes are different by design including the two SD haplotypes. In this context, similarities between the two haplotypes may be helpful in determining some unifying aspects of the SD mechanism but differences could be incidental to the genotype and not necessarily related to SD.

      It may be generally informative to set the sRNA and RNA comparisons into perspective, for example by including the comparison of SD-Mad crosses versus SD-MadRev crosses to exclude unrelated genetic background components as much as possible.

      The reviewer is correct here. Differences in the transcriptomes of SD-Mad and our revertant are much more likely due to the drive phenotype. Due to variation between SD-Mad total RNAseq replicates, we have substantially less power when comparing SD-Mad/Iso1 to SD-MadRev/Iso1. We therefore generated new data to address this point: we did digital gene expression for three biological replicates of SD-Mad/Iso1 and SD-MadRev/Iso1. We described the results of this new analysis above.

      FigS6: I assume this is given, but as it is not specified: is the directionality of differential expression taken into account here? Or could it be significantly up in one and down in the other? Please specify / adjust color scale to allow this distinction.

      This is a good point. We have modified the figure to not only indicate significance but also direction and magnitude.

      FigS8: Please add a scale bar for all images. 1.688 is labeled as 359 in the legend, please unify or/and explain nomenclature. Consider adding a nuclear outline based on DAPI. It looks like 1.688 is actually more different between control and SD-Mad/Iso than Rsp. Could the authors comment on this? In the text the authors mention that these experiments were done for both SD-Mad and SD-5 heterozygotes, but only the SD-Mad data are shown.

      The most abundant component of 1.688 repeats is the 359-bp repeat, which is used as a proxy for 1.688, and our 359-bp probe cross-hybridizes with other abundant variants of 1.688 on chromosome 3. We agree that there do seem to be some differences in the 1.688 RNA FISH; however, we do not yet have evidence that 1.688 is related to the drive phenotype. We have expanded that figure (now supplemental figure 7) with multiple images for each genotype to demonstrate the lack of change in Rsp and 1.688 localization. We have added an explanation of the nomenclature.

      The reference to SD-5 in the text was made in error. We do not have RNA FISH images of SD-5/Iso1 heterozygotes. We've modified the text to reflect this.

      FigS9B: What does the y-axis label mean? Fold change relative to what? Is this not displaying counts?

      This is a good catch by the reviewer. The y-axis is mislabeled and should read "TPM". We have made this change.

      To set the KipfKD/KO data in context, please give also the k value for SD-MadRev and compare the smRNA values in this context to the data displayed in F1B. Experiment: drive analysis for SD-MadRev.

      Our basis for concluding that Rsp smRNA overexpression may reduce drive strength is in demonstrating that kipfKO is sufficient to rescue wild type sperm in driving backgrounds. We did not introduce kipfKD (or KO) to the SD-MadRev background because this chromosome does not drive.

      The note that the 3XP3-dsRed cassette needs to be flipped out for Rsp overexpression to influence drive is interesting. It would be great if the authors could show a more detailed scheme of the structure of this insertion including the directionality of the promoter relative to the Rsp fragment and the rest of cluster 38C (including dm6 coordinates perhaps). Small RNA sequencing compared to totRNA sequencing should reveal if the transcription or the processing into piRNAs of the inserted piece is affected, and if more of the 38C piRNAs are affected. Genic transcription has been previously observed to limit Rhino-dependent piRNA production from piRNA clusters (Andersen et al 2017). It might be of interest to the general piRNA community to see how cluster output is influenced through the integration of an internal genic promoter.

      We agree that this is an interesting result. We have added more detail to Fig 4A to indicate directionality and genomic location of the insert in terms of dm6.

      Figure panel 4A should be adjusted to include annotations of the black boxes and to give genomic locations. It is unclear what the blue brackets mean, and where exactly the insertion took place. Are the attP sites relevant for the experiments? It might be nice to see a piRNA profile over the locus, to put the levels of additional Rsp piRNAs into perspective.

      We have removed the black boxes from the schematic as they were only there as an aesthetic choice. We have indicated where exactly the insertion was made. The attP sites are there for future experimental flexibility.

      Minor comments

      Figure 3B: fold change of satellite RNA is shown. It might be obvious that the fold change relates to KipfKO / WT but this should be stated explicitly. What is the genetic background here?

      Thank you for the comment. We added information on the genetic background in the figure.

      Figure legends should be extended for clarity throughout the manuscript in main and supplementary figures. All color codes and abbreviations as well as samples / genotypes and assay used should be clearly explained. Few examples include: F1B: smRNA or totalRNA? F3B: fold change relative to what? F4B: what are these data relative to? F4C: smRNA or totalRNA? S2: Is this smRNAseq? Further description of the color code in the volcano panels would be desirable. FS3: typo in A-B should be A-D. Fold changes relative to what. Etc.

      Thank you for these helpful suggestions. We have edited the figure legends as suggested to improve the clarity. We appreciate the feedback.

      The abbreviation for Kipferl is kipf, not kip.

      Thank you for pointing this out, we have made the corrections.

      I don't understand the sentence on lines 310-312.

      We agree that sentence was confusing. We replaced it with:

      "Identifying potential proteins that interact with Rsp may therefore provide important clues about why satellites like Rsp are targets of drive."

      Referee cross-commenting

      I agree with the other reviewer's assessments

      Reviewer #3 (Significance (Required)):

      General assessment

      This study of a highly complex and poorly understood drive system adds a very interesting piece to the puzzle of understanding the interplay between a RanGAP duplication and a large satellite array. Its strengths lie in the use of genetic tricks to modify drive (SD-MadRev allele, KipfKO, Rsp cluster insertion). The main weakness of the study is the relatively low correlation of several observations between drive crosses to the Iso1 and Gla lines and the lack of explanations thereof. Neither gene nor repeat expression seems to give a convincing overlap in any direction.

      Furthermore, it is interesting that SD-Mad and SD-5 have such different dependencies on Rsp sRNA. While outside the scope of this work, it would be very interesting to see how other drive haplotypes behave: is SD-5 the exception or is it SD-Mad (as the authors have also wondered in the discussion). Such additional comparisons may clarify also the discrepancies in RNAseq.

      Advance

      While it has been previously shown by the same group that Rsp satellites give rise to smRNAs through the piRNA pathway, it is to my knowledge unclear how and if these smRNAs influence drive. This study thus presents a conceptual advance in that it demonstrates that the role of Rsp smRNAs is not shared among driving haplotypes.

      Audience

      This study is relevant for a highly specialized audience interested in meiotic drive. It contributes to the understanding of the SD system and may serve as a basis for future research in this area. In addition, results reported in Figure 4 may be of peripheral interest for the Drosophila piRNA community for technical interests.

      This reviewer's expertise: Drosophila, piRNA pathway, heterochromatin, sRNA

      This reviewer's limitations: nuclear-cytoplasmic trafficking, cytoskeleton

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      In the presented manuscript, Edvalson and Wei et al. use Drosophila genetics and NGS experiments to investigate the mechanism of meiotic drive through the Segregation Distorter (SD) system. They reveal that two driving haplotypes seem to function via different mechanisms, with drive through SD-Mad but not SD-5 involving small RNAs produced from the Responder (Rsp) satellite, the target of SD drive. SD-Mad testes displaying drive are characterized by lower levels of Rsp sRNAs compared to non-drive controls as well as SD-5, and the ectopic overexpression of Rsp sRNAs through two distinct mechanisms decreases drive specifically in the SD-Mad genetic background. With this work, the authors are adding an important piece of information to the highly complex SD system, indicating that sperm killing is likely achieved by different mechanisms in different SD haplotypes, despite sharing a common driver.

      Major comments

      Fig1C: It might be interesting to show the fold change between SD-Mad and SD-MadRev in addition to what is displayed. Moreover, can the authors comment on what might be causing the increased smRNA counts for 38C2? Is this because R16 has particularly low 38C2 values?

      Fig1/S1: Could the authors also display the Rsp smRNA counts for all Gla crosses similar to panel 1B? What is the interpretation for the increase in Rsp smRNAs in SD-5/Gla relative to R16/Gla but the lack of such an increase in the SD-5/iso1 vs R16/iso1 comparison? Do SD-Mad and SD-5 induce the same strength of drive against each of the two wildtype chromosomes? Experiments: smRNAseq for SD-MadRev/Gla.

      Fig1: The authors note changes in smRNA levels for other satellites as well as piRNA clusters but do not give any interpretation to this observation. Are they meaningful? Should they be attributed to genetic background?

      FigS2: Same question also for the deregulated TEs: do they share sequence features with Rsp or are they overrepresented in the clusters that change? Are these explained by differences in insertions between genotypes? Do their total RNAseq values change in any way? What do the percentages in line 162 correspond to? Number of TEs that are deregulated? At which cutoff? It might be informative to compare the data to a cross between driver and R16, or even better the SD-MadRev control. Experiments: totRNAseq for SD-MadRev crosses and optionally crosses to R16.

      Figure S3: Am I reading the PCA plots right in that there are very few gene expression changes when the drivers are in iso1 background but much more in the Gla background? Comment on possible explanations for that. Please indicate the number of significantly changed genes in each comparison. Again, are these changes correlated between the two drivers or can they be attributed to genetic background of Gla vs R16? Would it be interesting to see how SD-Mad/Gla and SD-5/Gla gene expression profiles compare? Experiment: totRNAseq for SD-MadRev crosses.

      In Figure S5A it seems that totalRNA levels of Rsp are strongly increased in SD-Mad/Gla but not in SD-Mad/iso1. The iso comparison (less piRNAs but same transcript) could indicate that it is actually transcription of the Rsp that is affected here. This is even pointed out in line 205 without discussion of the fact that the Gla comparison (less piRNAs but more transcript) would rather indicate that transcription is intact, but processing into piRNAs is defective. Could this be clarified using FISH as in Figure S8? If true, SD-Mad/Gla should have much more FISH signal than SD-Mad/iso1. Either way, this discrepancy should be further discussed. Experiments: comprehensive smFISH panel for all crosses (including SD-MadRev).

      In general, the totRNA profiles of repeats don't seem to correlate well between the genotypes (iso vs Gla crosses, neither for SD-5 nor for SD-Mad). Is this because values are in general small and/or replicates don't correlate? Should these data even be considered? Also panels 2A and S5C are very different from each other. The additional comparison with the SD-MadRev allele crossed into both Iso1 and Gla should give additional insight. Experiment: totRNAseq for SD-MadRev crosses.

      It may be generally informative to set the sRNA and RNA comparisons into perspective, for example by including the comparison of SD-Mad crosses versus SD-MadRev crosses to exclude unrelated genetic background components as much as possible.

      FigS6: I assume this is given, but as it is not specified: is the directionality of differential expression taken into account here? Or could it be significantly up in one and down in the other? Please specify / adjust color scale to allow this distinction.

      FigS8: Please add a scale bar for all images. 1.688 is labeled as 359 in the legend, please unify or/and explain nomenclature. Consider adding a nuclear outline based on DAPI. It looks like 1.688 is actually more different between control and SD-Mad/Iso than Rsp. Could the authors comment on this? In the text the authors mention that these experiments were done for both SD-Mad and SD-5 heterozygotes, but only the SD-Mad data are shown.

      FigS9B: What does the y-axis label mean? Fold change relative to what? Is this not displaying counts?

      To set the KipfKD/KO data in context, please give also the k value for SD-MadRev and compare the smRNA values in this context to the data displayed in F1B. Experiment: drive analysis for SD-MadRev.

      The note that the 3XP3-dsRed cassette needs to be flipped out for Rsp overexpression to influence drive is interesting. It would be great if the authors could show a more detailed scheme of the structure of this insertion including the directionality of the promoter relative to the Rsp fragment and the rest of cluster 38C (including dm6 coordinates perhaps). Small RNA sequencing compared to totRNA sequencing should reveal if the transcription or the processing into piRNAs of the inserted piece is affected, and if more of the 38C piRNAs are affected. Genic transcription has been previously observed to limit Rhino-dependent piRNA production from piRNA clusters (Andersen et al 2017). It might be of interest to the general piRNA community to see how cluster output is influenced through the integration of an internal genic promoter.

      Figure panel 4A should be adjusted to include annotations of the black boxes and to give genomic locations. It is unclear what the blue brackets mean, and where exactly the insertion took place. Are the attP sites relevant for the experiments? It might be nice to see a piRNA profile over the locus, to put the levels of additional Rsp piRNAs into perspective.

      Minor comments

      Figure 3B: fold change of satellite RNA is shown. It might be obvious that the fold change relates to KipfKO / WT but this should be stated explicitly. What is the genetic background here?

      Figure legends should be extended for clarity throughout the manuscript in main and supplementary figures. All color codes and abbreviations as well as samples / genotypes and assay used should be clearly explained. Few examples include: F1B: smRNA or totalRNA? F3B: fold change relative to what? F4B: what are these data relative to? F4C: smRNA or totalRNA? S2: Is this smRNAseq? Further description of the color code in the volcano panels would be desirable. FS3: typo in A-B should be A-D. Fold changes relative to what. Etc.

      The abbreviation for Kipferl is kipf, not kip.

      I don't understand the sentence on lines 310-312.

      Referee cross-commenting

      I agree with the other reviewer's assessments

      Significance

      General assessment

      This study of a highly complex and poorly understood drive system adds a very interesting piece to the puzzle of understanding the interplay between a RanGAP duplication and a large satellite array. Its strengths lie in the use of genetic tricks to modify drive (SD-MadRev allele, KipfKO, Rsp cluster insertion). The main weakness of the study is the relatively low correlation of several observations between drive crosses to the Iso1 and Gla lines and the lack of explanations thereof. Neither gene nor repeat expression seems to give a convincing overlap in any direction.

      Furthermore, it is interesting that SD-Mad and SD-5 have such different dependencies on Rsp sRNA. While outside the scope of this work, it would be very interesting to see how other drive haplotypes behave: is SD-5 the exception or is it SD-Mad (as the authors have also wondered in the discussion). Such additional comparisons may clarify also the discrepancies in RNAseq.

      Advance

      While it has been previously shown by the same group that Rsp satellites give rise to smRNAs through the piRNA pathway, it is to my knowledge unclear how and if these smRNAs influence drive. This study thus presents a conceptual advance in that it demonstrates that the role of Rsp smRNAs is not shared among driving haplotypes.

      Audience

      This study is relevant for a highly specialized audience interested in meiotic drive. It contributes to the understanding of the SD system and may serve as a basis for future research in this area. In addition, results reported in Figure 4 may be of peripheral interest for the Drosophila piRNA community for technical interests.

      This reviewer's expertise: Drosophila, piRNA pathway, heterochromatin, sRNA

      This reviewer's limitations: nuclear-cytoplasmic trafficking, cytoskeleton

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      I have only one request: I found it unclear whether the authors were referring to small RNAs or their precursor (long RNA). By reading the text carefully, I could deduce that Fig1A/Table S2 represent the small RNA sequencing, while FigS3A represents total RNA seq (detecting precursor). However, the labeling in the Fig1A and Table S2 only says 'piRNA cluster' or 'Rsp' (without clarifying 'piRNA from piRNA cluster' or 'piRNA from Rsp'), and it took quite some time for me to understand which Fig/data is smallRNA vs. longRNA.

      Referee cross-commenting

      I agree with other reviewers' comments, which all seem to be reasonable.

      Significance

      This manuscript by Edvalson et al. describes their study on SD (segregation distorter) meiotic drive system, examining the role of piRNA derived from Rsp satellite. Although the exact mechanism of drive is still unknown, this study represents a significant step forward in understanding SD-mediated drive.

      By using two SD alleles (SD-5 and SD-Mad), they show that Rsp-derived piRNA is depleted in SD-Mad. The authors used total RNA sequencing/small RNA sequencing mutants and carefully designed controls (such as deletion of Sd-RanGAP) to reach the model that Rsp-derived piRNA is involved in SD-Mad-mediated drive. The result that kipferl depletion (which leads to sat DNA expression) rescues SD-Mad's drive phenotype is very interesting. This supports the idea that the decreased Rsp piRNA indeed corresponds to SD-Mad-mediated drive. They further back up this idea by overexpressing Rsp.

      Interestingly, SD-5 was not impacted by changes in Rsp expression. Based on this result, the authors state that there are mechanistic variations in the same (SD) drive system. This statement is certainly justified by the data, but I cannot help wondering whether there might be a unifying mechanism that explains both SD-5 and SD-Mad. I am not suggesting that the authors edit the manuscript or add to the discussion, but do they have any speculations on this? For example, is SD-5 simply epistatic to Rsp piRNA production? For example, SD-RanGAP > SD-Mad (some gene on SD-Mad inversion) > Rsp piRNA production > SD-5 > sperm killing?

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: Edvalson and colleagues use transcriptomics, cell biology and genetics to study variation between segregation distorter (meiotic drive) strains and find several important results. These include apparent suppression of small RNAs mapping to responder (the drive target) in one of the lines, a general pattern of differential expression consistent with the drive mechanism being upstream of sperm individualization (where defects have been seen previously), and genetic confirmation that perturbing Rsp expression can influence the strength of drive.

      Major comments: I found the total RNA sequencing experiment a bit oddly presented. This is partly because it was in the middle of the results (might fit better first), partly because few specific genes were discussed (this might be appropriate given the question, but maybe the question should be more clearly stated), and partly because of the complexity of the approach (WGCNA + PANGEA) and how it all fits together. I suggest working to clarify the main points of this section (which are a bit different than the main focus of the Rsp work).

      Minor comments:

      Abstract - Probably worth mentioning Sd-RanGAP here, even if you are using it as a straw man. I agree that the specific mechanism is not known, but some of the genetics are established.

      Line 80 and elsewhere - it would be helpful to be specific here - you are looking at both small and total RNA

      Fig 1B - is there a reason not to show the values of the replicates here? It would be more transparent.

      Line 139 - does the experimental design control for 1.688 genomic copy number? Where is it located?

      144-146 - this could be written clearer, and I think it should only refer to 1C, not 1B. Part of the issue is that there are several repeats not discussed, and it isn't clear what is happening with them. I suggest expanding this description so it is more clear.

      Line 161 - what do you mean (specifically) by "repetitive loci"?

      193-203 - This is an important finding that is somewhat lost in trying to keep track of WGCNA and PANGEA and the different Modules. I suggest clarifying to drive home the point that differential expression appears to start prior to individualization, which suggests an earlier mechanism of drive.

      Fig 3B & 3C, Fig 4 - same as 1B, is there a reason not to show the actual data points?

      Line ~245 - was the same experiment done with SD-5? (as you do below for Rsp overexpression)

      Referee cross-commenting

      I agree with the comments as well.

      Significance

      This is a strong paper that moves the field forward, even if it leaves questions still to be answered (why the difference between drivers? what is the mechanism? how is Rsp interacting with drive?).

      Several findings move the field forward: the Rsp small RNA results, the differential expression hinting at a molecular mechanism that is upstream of sperm individualization.

      The audience is moderately broad. Genetic conflict is gaining in general interest, but aspects of this will be mostly interesting to the hardcore drive crowd.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Minor comments 1) The authors suggest that the weak 4th protomer in the HCMV UL52 3-mer map is a consequence of flexibility. This may be the case, but it may also be the case that the class is polluted with 4-mer particles leading to reduced occupancy. Erasing the weak density and running a multi-model 3D classification providing the erased 3-mer and a 4-mer starting map may separate these.

      We performed additional analysis (i.e., 3-mer and 4-mer particles were combined into a multi-class ab initio reconstruction followed by multi-class heterogeneous refinement) and found that the original 3-mer map was a mixture of 3-mer and 4-mer states.

      We have updated Fig. 2a, Supplementary Fig. 2, Supplementary Fig. 3, Supplementary Table 1, Supplementary Movie 1, and removed the discussion of the weak protomer in the 3-mer map from the results section. We have updated our EMDB and PDB depositions accordingly.

      2) I found the supplemental figure to show the DNA in the tripentamer map too small; this is an interesting finding and should be shown more clearly.

      We have increased the size of Supplementary Fig. 6 and moved the figure caption to another page to accommodate this enlargement.

      Reviewer #2

      Major issues 1) There is a high probability that the tripentamer is an artifact of the cross-linking. Because of this, it'd be great to know more about the cross-linking reaction, ideally mass spec identification and quantification of cross-links. This would also address the authors' speculation of contacts that stabilize the tripentamer.

      Crosslinking is a commonly used technique to stabilize complexes that are observed through other means but do not survive the cryo-EM vitrification process. In an EMSA experiment (Supplementary Fig. 4a), UL32 binds 30 bp DNA and migrates slower than when bound to a 10 bp probe, consistent with formation of a supra-pentameric complex. The samples in the EMSA gels are not crosslinked. Additionally, an SDS-PAGE gel of the crosslinked product used for cryo-EM showed tight bands at molecular weights expected for oligomers, supporting specific crosslinking (Supplementary Fig. 4b). These results suggest that crosslinking stabilizes a species that can form but is relatively unstable in solution.

      Moreover, the authors' claim "However, mutation of K532A/C535A reduced infectious virion production by half (Fig. 4b), suggesting that the tripentamer interface may play a role in the viral life cycle." seems to be an overreach. Perhaps this is semantics, but the data just show that these residues play a role in viral replication (albeit not a huge role based on the modest effect).

      We have modified the title of the results section (Line 216-217) to state that "Residues at the tripentamer interfaces contribute to infectious virion production in HSV-1" as well as Line 234 and 241 to indicate that the residues play a role in the viral life cycle.

      2) The density for the potential DNA does not look very convincing, although it still remains the strongest hypothesis. The authors should try to strengthen their argument. Does this putative DNA contact residues that they show are necessary for viral replication? Showing seq conservation on the structure could help their argument for the shared function of DNA-binding.

      The DNA likely contacts conserved residues at the base and midsection of the central channel (residues R302, R301, R293, K289, R580, R579, R572; see Fig. 6a). We have shown that these residues are important for the production of infectious virions (Fig. 6c): even a single point mutation (R572A) decreased production of infectious virus particles by more than 90%, and double and triple point mutants (R579A/R580A, K289A/R293A/R301A) eliminated production of infectious virus. Sequence conservation of these charged residues in the central channel regions is shown in Supplementary Fig. 1d, f.

      3) My last major issue is stylistic and concerns the descriptions of cryoEM structures. I found that the paper was a bit of challenge to read when the authors would introduce each structure. It was a bit of a slog to get through. Descriptions of the structures veered off into overly detailed comparisons that required constant comparison with the figure and didn't really advance my understanding past "the outer surfaces of the three orthologs are different." This masked the more interesting aspects of the authors' findings. Perhaps this could be summarized in supplementary figures or a table. Because this is a stylistic suggestion, the authors should feel free to ignore this request.

      We appreciate the reviewer's concerns about accessibility, but we are excited that these structures allowed us to thoroughly describe the convergent and divergent structural features across the Herpesviridae and hope that our in-depth analysis will allow for detailed mechanistic follow-up.

      *Minor comments 1) The descriptions of structure determination in the text were often unclear. For example, "In the 3-mer map, a poorly-resolved fourth protomer is visible at low contour levels, suggesting that an additional protomer is present but highly flexible in this class (Supplementary Fig. 3a)." Alternatively, it could be that the classification algorithm wasn't able to fully separate particles that were 3-mers from the 4mers. *

      The reviewer is correct. As described above (Reviewer #1 comment 1), we performed additional analysis and found that the original 3-mer map was a mixture of 3-mer and 4-mer states. We have updated Fig. 2a, Supplementary Fig. 2, Supplementary Fig. 3, Supplementary Table 1, Supplementary Movie 1, the EMDB and PDB depositions, and removed the discussion of the weak protomer in the 3-mer map from the results section.

      *When describing the structure determination of the HSV1 accessory factor, the authors describe no other particles other than the tripentamer. Were there other particles observed? It'd be a bit surprising that all of the protein adopted the tripentamer state. *

      We agree that this result is striking. We picked particles using a 'blob picker' to avoid introducing template bias and found that the tripentamer is the predominant species. Below we show the results of 2D classification of blob picked particles (classes sorted by particle number; obvious junk classes excluded for clarity). There is one class that suggests a pentamer, but template picking with a pentamer template (based on ORF68) did not yield a pentamer class.

      Additionally, as we describe in the results section and show in Supplementary Fig. 6a, further processing of the consensus UL32 map showed that 60% of particles formed a complete tripentamer (i.e., 15-mer), while the remaining 40% formed incomplete tripentamers missing one or more protomers (e.g., 17% of particles formed a 14-mer).

      Was symmetry applied, particularly for the tripentamer that appears to have C-3 symmetry? This is in materials and methods but not clear why it isn't mentioned when describing the structure determination and results.

      No symmetry was applied in the reconstruction for either UL32 or UL52. While we previously noted this in the methods section and in Supplementary Table 1, we have added this information to the results section (Line 169-170), the Fig. 3 legend, and cryo-EM processing figures (Supplementary Figures 2, 5, 6) for clarity.

      2) Throughout the paper, the authors use the word "remodel" to describe structural differences between orthologs. However, this word usually carries the implication of conformational rearrangement within a protein, and not across orthologs. Please consider a different description.

      We agree with the reviewer and have removed the term "remodel" throughout the manuscript text (i.e., Lines 116, 118, 120, 122, 302, 306) and from Supplementary Figures 1, 3, and 5.

      3) Figure 2F is confusing and difficult to interpret. It seems that the main point is that these interfaces are conserved, which might be more easily displayed as a standard sequence conservation score mapped onto the structure. I'm also not sure that this figure is necessary as a main figure and could be supplemental.

      We agree that the conservation could also be shown this way and have added labels to universally conserved residues of the protomer interface to Supplementary Fig. 1b, c. We have also moved Fig. 2f to the supplement (now Supplementary Fig. 2g).

      *4) The authors write "UL32 bound to the shortest probe tested (10 bp, Supplementary Fig. 4a)." This implies that ONLY the shortest probe is bound and that others are not bound. Consider rephrasing.*

      We have rephrased to clarify that all probes tested, including the shortest, bound DNA (Line 153).

      *5) Frustum is misspellt. ;)*

      Thank you. Spelling has been corrected (Line 185).

      6) In the discussion, the authors speculate that the variability of the outer surface is due to "virus- or host-specific interactions". I'm confused by "host-specific interactions", because the host is the same for all three viruses. Perhaps the authors mean that the different accessory factors could interact with different host factors? If so, are the authors making a Red Queen argument? If so, it'd be pretty cool to do dN/dS analysis to test that hypothesis.

      The reviewer is correct in that all three viruses (HSV-1, HCMV, KSHV) infect the same host; however, they replicate in different cell types, which could potentially express different host factors. We have no evidence to support this hypothesis and intended to propose that UL32 and UL52 may be interacting/co-evolving with other viral factors required for genome packaging. We have clarified Line 308 to generalize that "these regions are involved in virus-specific interactions".

      To me, this window into evolution of this factor is the biggest advance of the work, and tbh I felt that the authors could lean into this a bit more in the discussion section. Are there any differences in the packaging mechanisms of the different herpes families that can be related to their different behavior? Any other molecular evolution analyses (e.g. dN/dS ratio analysis) that could inform their study?

      We agree that understanding the evolution of the packaging accessory factor is an interesting future area of research. There are differences in capsid structure and occupancy of capsid-associated factors across the herpesvirus family (PMID: 34696343). However, we lack a mechanistic (or structural) understanding of viral genome packaging components across the herpesviruses, raising the possibility that there are differences in packaging mechanisms.

      Interestingly, the further diverged alloherpesviruses and malacoherpesviruses (other families in the order Herpesvirales) do not appear to encode a factor with similar predicted structure to the Herpesviridae packaging accessory factor (PMID: 41902279). It is unclear how the mechanism of packaging differs in the Orthoherpesviridae and whether replication in mammalian/avian/reptilian cells places additional evolutionary pressure on the viral genome packaging mechanism.

      Reviewer #3

      Major comments

      *1) [I]t is not clear whether the structures presented in the manuscript reflect those produced during HCMV or HSV-1 infection. *

      We agree with the reviewer that it is important to consider to what extent purified biomolecules resemble their in vivo counterparts. This criticism can be applied to any ex situ structural analysis. However, our experimental structures allowed us to make testable observations, including the correct assignment of structurally important zinc fingers and the identification of functionally important residues in the central channel.

      2) HCMV UL52 was presented to form two distinct structures, a 3-mer and a 4-mer (Fig. 2a). However, the authors acknowledge that the 3-mer is actually a 4-mer when the threshold for the cryo-EM map is lowered. The density is also visible in the PDB validation report for the 3-mer; EMD-74418.

      Reviewers #1 and #2 were also curious about the 3-mer. As described above, we performed additional analysis that showed that the original 3-mer map was a mixture of 3-mer and 4-mer states. We have updated Fig. 2a, Supplementary Fig. 2, Supplementary Fig. 3, Supplementary Table 1, Supplementary Movie 1, EMDB and PDB depositions, and removed the discussion of the weak protomer in the 3-mer map from the results section.

      *Given that ORF68, BFLF1, and UL32 (Didychuk et al., 2021) form complete pentamer rings, with BFLF1 forming stacked rings, it would seem odd for a protein with conserved function to deviate from a pentamer configuration, suggesting that the structures reported do not reflect the natively produced and functional protein. *

      We agree that this is a surprising finding; we initially anticipated that UL32 and UL52 would also form stable pentameric rings. While this study does not resolve a complete mechanism for this factor, it does provide the first structural evidence for the implications of their poor sequence conservation and lack of complementarity.

      Furthermore, this is not the first example of a conserved herpesvirus factor that adopts different oligomeric states across subfamily homologs. As mentioned in the discussion, herpesviruses encode a sliding clamp processivity factor (HSV-1 UL42/HCMV UL44/KSHV ORF59) that shares a common PCNA-like fold but varies in oligomeric state across these herpesviruses.

      *3) Unlike ORF68 (Didychuk et al., 2021) and UL32 (Suppl. Fig. 4), dsDNA binding experiments were not performed with UL52. Could the partial pentamers simply be poorly formed due to expression in insect cells (mammalian cells were used for protein purification in Didychuk et al., 2021), absence of dsDNA, or inappropriate buffer conditions? Moreover, were the EM grid and vitrification parameters optimized? Grid geometries and chemistries can have profound effects of protein stability especially in the context of the air-water interface, leading to degradation of protein complexes (Glaeser, 2018; D'Imprima et al., 2019). Does UL52 form complexes with dsDNA? Data are shown for the HSV-1 packaging accessory factor. Perhaps dsDNA would stabilize the UL52 pentamer. *

      We have purified ORF68 and homologs from both human and insect cell expression systems, and do not observe changes in oligomeric behavior. We find that ORF68 purified as a stable pentamer from human cells (Didychuk eLife 2021) and from insect cells (this work). We have also recombinantly expressed and purified UL32 from human cells. UL32 was largely monomeric after strep affinity purification (chromatogram below, unpublished), as we report from insect cells (this work, Fig. 1c). We switched to insect cell expression systems because of the easier scalability.

      Our SEC-MALS data (Fig. 1d) show that purified UL52 does not oligomerize into a pentamer in solution, so the observed sub-pentameric (3-mer/4-mer) assemblies are unlikely to be an artifact of cryo-EM freezing conditions or the air-water interface. We have not tested whether UL52 forms complexes with dsDNA, although it likely does; it is possible that this interaction would stabilize a pentamer.

      4) In Didychuk et al., 2021, HSV UL32 is shown to form pentameric rings; negative stained 2D class averages were generated from tagged protein (twin strep tag), produced in mammalian cells (HEK293T), and not purified using size exclusion chromatography. In the present study HSV UL32 was not observed to form pentameric complexes "We first attempted to visualize the pentameric species by negative stain electron microscopy but were unable to identify particles of the expected dimensions." However, it is not clear why this was the case. If the pentameric structures were readily produced in previous experiments, why was cross-linking needed in the current study? As such, the tripentamer complexes seem artifactual in nature.

      While a sufficient number of particles were observed in a pentameric state to generate 2D class averages in the eLife paper, this was not the dominant state. The results we report in this work are consistent with those reported in the eLife paper. Reviewer #2 (comment #1) was also concerned about the possibility of a crosslinking artifact; we reproduce our response below:

      "Crosslinking is a commonly used technique to stabilize complexes that are observed through other means but do not survive the cryo-EM vitrification process. In an EMSA experiment (Supplementary Fig. 4a), UL32 binds 30 bp DNA and migrates slower than when bound to a 10 bp probe, consistent with formation of a supra-pentameric complex. The samples in the EMSA gels are not crosslinked. Additionally, an SDS-PAGE gel of the crosslinked product used for EM showed tight bands, supporting specific crosslinking (Supplementary Fig. 4b). These results suggest that crosslinking stabilizes a species that can form but is relatively unstable in solution."

      We have updated Line 148 to clarify this. We have also included a negative stain micrograph, below, in which UL32 pentamers (purified from insect cells) are visible in the absence of crosslinking.

      5) Although the data presented in Fig. 4b suggest that interface residues, K532 and C535, might play a role in the formation of the tripentamer and have a minor role in HSV-1 replication, these experiments are incomplete. Single mutations are needed for each residue to assess their individual contribution to tripentamer formation, evidence for a loss of tripentamer formation is needed, and evidence for protein expression is needed.

      We agree that we have not unambiguously defined the role of the tripentamer, the precise contributions of residues K532 and C535, or defined the contribution of the tripentamer to HSV-1 viral replication. We seek to report this novel structure to lay the basis for future mechanistic work. Reviewer #2 (comment 1) also questioned the role of these residues in HSV-1 replication, and we addressed this by modifying the title of the results section (Line 216) to state that "Residues at the tripentamer interfaces contribute to infectious virion production in HSV-1" as well as Line 246 and 253 to indicate that the residues play a role in the viral life cycle.

      Please refer to Supplementary Fig. 7e for a western blot showing that these mutants do not impact UL32 expression. We included explicit references to UL32 expression on Lines 239 and 288.

      *6) In the previous negative stain electron micrographs reported by Didychuk et al., 2021, were the higher order tripentamer complexes seen? *

      We did not observe tripentamers in the Didychuk et al. 2021 negative stain dataset. Tripentamer formation may be concentration dependent. Negative stain EM carried out at nanomolar concentrations would likely cause dissociation of tripentamers, whereas cryo-EM and EMSA in our work were carried out at micromolar concentrations and were able to capture the higher-order tripentamer.

      *7) Formation of disulphide bonds between cysteine residues in vitro is not indicative of complexes forming in vivo during replication. What evidence is there for disulphide bond formation between packaging accessory factor pentamers for KSHV, EBV, and LCMV? In the present study, the disulphide bond could form due to proximity as a result of the cross-linking and the presence of molecular oxygen rather than a bona fide enzyme catalysed reaction during herpesvirus replication to generate packaging accessory factor tripentamers.*

      We agree that it is unlikely that disulfide bonds form during infection and have removed this speculation from the manuscript (Line 343-346).

      8) The DNA densities in Suppl. Fig. 6e to 6g are curious. As noted by the authors, the 30mer dsDNAs do not traverse through the central cavity of the pentamer. They appear to make contact with neighboring pentamers, again suggesting that these complexes are artefacts from cross-linking. This should be discussed more thoroughly.

      Please refer to above discussion of crosslinking and Supplementary Fig. 4.

      9) Previously proposed functional roles for ORF68 include a scaffold for terminase assembly, association of the terminase with the portal, generation of initial free ends, or coordination with other replication machinery (Didychuk et al., 2021). Presuming that the new structures for HCMV UL52 and HSV-1 UL32 occur naturally, how do they fit with the previously proposed functional roles of the herpesvirus packaging accessory factor? A more in-depth discussion of this would be valuable.

      The core fold and pentamer/pentamer-like assembly are common features, as is the conserved, positively charged central channel. We have added additional discussion of this.

      *Minor comments A lack of page numbers and line numbers made reviewing this manuscript more challenging than necessary. *

      We have included page numbers and line numbers in the revised manuscript.

      *As noted in the 'General comments' section above, ORF68 (3.37Å) and BFLF1 (3.60Å) both form pentamers (Didychuk et al., 2021) and were produced in mammalian systems HEK293T cells. Protein purification in the present study was performed in insect (SF9 or High Five) cells. Does this affect complex stability? Also, the tag was retained for UL32 in Didychuk et al., 2021; could this provide stability of the pentamer in the original studies? *

      As discussed above, we have no evidence to suggest that expression in human vs. insect cell expression systems dramatically changes oligomerization behavior (Reviewer #3, comment 3). N-terminal purification tags were also retained in this study for structural work but were removed for SEC-MALS, which shows that UL32 is likely in a concentration-dependent equilibrium between (unstable) pentamers and monomers.

      Suppl. Fig. 3 is missing.

      We apologize for this oversight and have included Supplementary Fig. 3.

      *"UL52 has two regions remodeled" The use of the word 'remodeled' is not appropriate in this context as it implies a single protein can form two shapes under different conditions rather than distinct structures between two disparate proteins; UL52 compared to ORF68. This should be rephrased. *

      This was also noted by reviewer 2, and we have removed the term "remodel" throughout the manuscript text (i.e., Lines 134, 138, 140, 337, 341) and from Supplementary Figures 1, 3, and 5.

      *What is the density in the central core of UL52 (Fig. 2a; Suppl. Fig. 2e)? Was any form of focused classification performed to establish the identity of the density within the central pseudocavity? *

      As noted in the manuscript, this density could be attributed to co-purified protein or nucleic acid, or to part of the unresolved, negatively charged loop (residues 82-181) interacting with the positively charged central channel. We have done additional analysis of the central channel density (3D classification with a focus mask) and do not resolve any distinct densities, suggesting that the density is very dynamic.

      *Does UL52 bind to dsDNA? To support the hypothesis that the herpesvirus packaging accessory factor has conserved functions across the three subfamilies dsDNA binding experiments should be performed. *

      We have not done this experiment. We think that demonstrating this finding for two of the three herpesvirus subfamilies is sufficient.

      There is no discussion about how these data relate to the previous functional model for ORF68 presented in Didychuk et al., 2021. Do the new data alter the previous functional models?

      The precise mechanistic contribution of the packaging accessory factor remains unknown, and our data do not delineate between the proposed potential roles described in Didychuk et al. 2021. Importantly, our structural information, demonstration of pentameric ring formation, and significance of the positively charged central channel show that the core function of this factor is likely conserved across the virus family. This was not known before our work.

      *There are some interesting grammatical phrases; please address throughout the manuscript. One example - "...a notable shared aspiration..." Proteins do not have aspirations. Please use a more formal scientific statement. *

      We have updated the language on Line 327.

      *Fig. 4b - Statistical analyses missing. Please provide. *

      Fig. 6c - Statistical analyses are missing. Please provide. Protein folding/expression data missing; see Fig. 5C showing mutations that result in poor protein expression.

      Suppl. Fig. 7f - Statistical analyses absent.

      Statistical analysis of the viral complementation in Figs. 4b and 6c has been included. Note that the viral yields reported in Supplementary Fig. 7f were used to calculate complementation efficiency in Figs. 4b and 6c. Protein expression of mutants shown in Fig. 6c was previously included in Supplementary Fig. 7e and is referenced on Lines 288 and 293.

      *Suppl. Fig. 2 and 5 - FSC curves have oddities, especially in the corrected curves. The cryo-EM resolution estimates calculated by CryoSPARC for the UL52 '3-mer' and 4-mer, and UL32 tripentamer are likely overestimated. In the PDB validation files each of the deposited structures has a warning for the resolution estimate "The value from deposited half-maps intersecting FSC 0.143 CUT-OFF 4.31 differs from the reported value 3.32 by more than 10 %", suggesting that the resolution estimates are inaccurate. The authors should provide a resolution estimate using loose masks and generate FSC curves using another software program such as RELION's postprocess to provide resolution estimates. *

      Thank you for bringing this to our attention. The differences in the resolution estimates are a known issue and are highly influenced by the tightness of the mask. In the revised manuscript we have updated the FSC curves to not include auto-tightened masks and revised our resolution estimates. This slightly changed the resolution to 3.29 Å for both UL52 3-mer and 4-mer and to 3.09 Å for the UL32 consensus map. Please also see the local resolution estimation maps in Supplementary Figures 2e and 5e for an illustration of the range of resolutions in each map.
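      For readers unfamiliar with the check behind the validation warning, the arithmetic can be sketched as follows. This is a minimal illustration only, not the procedure used by CryoSPARC, RELION, or the PDB validation pipeline: the function names are hypothetical, the FSC curve in the usage note is invented for demonstration, and computing the discrepancy relative to the reported value is an assumption.

```python
def fsc_resolution(spatial_freqs, fsc_values, threshold=0.143):
    """Resolution (in Å) where an FSC curve first drops below `threshold`.

    spatial_freqs: increasing spatial frequencies in 1/Å.
    fsc_values:    FSC value at each frequency.
    The crossing is located by linear interpolation between adjacent points.
    Returns None if the curve never crosses the threshold.
    """
    points = list(zip(spatial_freqs, fsc_values))
    for (f0, c0), (f1, c1) in zip(points, points[1:]):
        if c0 >= threshold > c1:
            f_cross = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
            return 1.0 / f_cross
    return None


def relative_difference(reported, recomputed):
    """Fractional discrepancy, taken relative to the reported value (assumption)."""
    return abs(recomputed - reported) / reported


# Values quoted in the validation warning: 3.32 Å reported, 4.31 Å from half-maps.
discrepancy = relative_difference(3.32, 4.31)
exceeds_ten_percent = discrepancy > 0.10
```

      With the two values quoted in the warning, the discrepancy is about 30% of the reported value (about 23% if taken relative to the half-map value instead), so it exceeds the 10% tolerance under either convention. As a toy check of the interpolation, `fsc_resolution([0.1, 0.2, 0.3, 0.4], [1.0, 0.9, 0.243, 0.043])` places the crossing at 0.35 Å⁻¹, i.e., roughly 2.86 Å.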

      Suppl. Fig. 6f and 6g - Is there any visible density that might resemble the EGS crosslinking reagent?

      We do not expect to observe density for EGS due to the long flexible linker (~16 Å) between the two reactive groups.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary.

      The manuscript describes the cryo-EM structures of a conserved, necessary, herpesvirus genome packaging accessory factor for human cytomegalovirus (HCMV), UL52, and herpes simplex virus type-1 (HSV-1), UL32. Herpesvirus packaging accessory factors have unknown function but bind dsDNA. The UL52 and UL32 structures revealed a 5-fold symmetry similar to the previous X-ray crystallography structure for Kaposi's Sarcoma-associated herpesvirus (KSHV) ORF68 and the cryo-EM structure of Epstein-Barr virus (EBV) BFLF1. However, HCMV UL52 was reported to form two structures, a 3-mer and 4-mer whereas, HSV UL32 formed a supercomplex of trimeric pentamers (tripentamer) produced by dsDNA binding and crosslinking. Similar to previous studies with ORF68, mutagenesis of HSV-1 UL32 demonstrated the importance of zinc finger residues C297, C308, C544, and H581 for core fold stability and positively charged residues H563, R572 in the central channel in the pentamer for HSV-1 recovery in virus complementation assays. In addition, mutagenesis of K532 and C535 at the tripentamer interface helix reduced virus complementation by 50%. These findings have significant overlap and similarities to previously published experiments and confirm the properties of ORF68 and BFLF1, demonstrating the conserved nature of the required packaging accessory factor for herpesviruses.

      Major comments.

      The manuscript is generally well written with beautifully presented cryo-EM figures. Unfortunately, the new data seem to muddy the water rather than provide clarification about the role or function of the herpesvirus packaging accessory factor. Furthermore, it is not clear whether the structures presented in the manuscript reflect those produced during HCMV or HSV-1 infection. HCMV UL52 was presented to form two distinct structures, a 3-mer and a 4-mer (Fig. 2a). However, the authors acknowledge that the 3-mer is actually a 4-mer when the threshold for the cryo-EM map is lowered. The density is also visible in the PDB validation report for the 3-mer; EMD-74418. Given that ORF68, BFLF1, and UL32 (Didychuk et al., 2021) form complete pentamer rings, with BFLF1 forming stacked rings, it would seem odd for a protein with conserved function to deviate from a pentamer configuration, suggesting that the structures reported do not reflect the natively produced and functional protein. Unlike ORF68 (Didychuk et al., 2021) and UL32 (Suppl. Fig. 4), dsDNA binding experiments were not performed with UL52. Could the partial pentamers simply be poorly formed due to expression in insect cells (mammalian cells were used for protein purification in Didychuk et al., 2021), absence of dsDNA, or inappropriate buffer conditions? Moreover, were the EM grid and vitrification parameters optimized? Grid geometries and chemistries can have profound effects of protein stability especially in the context of the air-water interface, leading to degradation of protein complexes (Glaeser, 2018; D'Imprima et al., 2019). Does UL52 form complexes with dsDNA? Data are shown for the HSV-1 packaging accessory factor. Perhaps dsDNA would stabilize the UL52 pentamer.

      In Didychuk et al., 2021, HSV UL32 is shown to form pentameric rings; negative stained 2D class averages were generated from tagged protein (twin strep tag), produced in mammalian cells (HEK293T), and not purified using size exclusion chromatography. In the present study HSV UL32 was not observed to form pentameric complexes "We first attempted to visualize the pentameric species by negative stain electron microscopy but were unable to identify particles of the expected dimensions." However, it is not clear why this was the case. If the pentameric structures were readily produced in previous experiments, why was cross-linking needed in the current study? As such, the tripentamer complexes seem artifactual in nature. Although the data presented in Fig. 4b suggest that interface residues, K532 and C535, might play a role in the formation of the tripentamer and have a minor role in HSV-1 replication, these experiments are incomplete. Single mutations are needed for each residue to assess their individual contribution to tripentamer formation, evidence for a loss of tripentamer formation is needed, and evidence for protein expression is needed. In the previous negative stain electron micrographs reported by Didychuk et al., 2021, were the higher order tripentamer complexes seen?

      Formation of disulphide bonds between cysteine residues in vitro is not indicative of complexes forming in vivo during replication. What evidence is there for disulphide bond formation between packaging accessory factor pentamers for KSHV, EBV, and LCMV? In the present study, the disulphide bond could form due to proximity as a result of the cross-linking and the presence of molecular oxygen rather than a bona fide enzyme catalysed reaction during herpesvirus replication to generate packaging accessory factor tripentamers.

      The DNA densities in Suppl. Fig. 6e to 6g are curious. As noted by the authors, the 30mer dsDNAs do not traverse through the central cavity of the pentamer. They appear to make contact with neighboring pentamers, again suggesting that these complexes are artefacts from cross-linking. This should be discussed more thoroughly.

      Previously proposed functional roles for ORF68 include a scaffold for terminase assembly, association of the terminase with the portal, generation of initial free ends, or coordination with other replication machinery (Didychuk et al., 2021). Presuming that the new structures for HCMV UL52 and HSV-1 UL32 occur naturally, how do they fit with the previously proposed functional roles of the herpesvirus packaging accessory factor? A more in-depth discussion of this would be valuable.

      Minor comments.

      A lack of page numbers and line numbers made reviewing this manuscript more challenging than necessary.

      As noted in the 'General comments' section above, ORF68 (3.37Å) and BFLF1 (3.60Å) both form pentamers (Didychuk et al., 2021) and were produced in mammalian systems HEK293T cells. Protein purification in the present study was performed in insect (SF9 or High Five) cells. Does this affect complex stability? Also, the tag was retained for UL32 in Didychuk et al., 2021; could this provide stability of the pentamer in the original studies?

      Suppl. Fig. 3 is missing.

      "UL52 has two regions remodeled" The use of the word 'remodeled' is not appropriate in this context as it implies a single protein can form two shapes under different conditions rather than distinct structures between two disparate proteins; UL52 compared to ORF68. This should be rephrased.

      What is the density in the central core of UL52 (Fig. 2a; Suppl. Fig. 2e)? Was any form of focused classification performed to establish the identity of the density within the central pseudocavity?

      Does UL52 bind to dsDNA? To support the hypothesis that the herpesvirus packaging accessory factor has conserved functions across the three subfamilies dsDNA binding experiments should be performed. There is no discussion about how these data relate to the previous functional model for ORF68 presented in Didychuk et al., 2021. Do the new data alter the previous functional models?

      There are some interesting grammatical phrases; please address throughout the manuscript. One example - "...a notable shared aspiration..." Proteins do not have aspirations. Please use a more formal scientific statement.

      Fig. 4b - Statistical analyses missing. Please provide.

      Fig. 6c - Statistical analyses are missing. Please provide. Protein folding/expression data missing; see Fig. 5C showing mutations that result in poor protein expression.

      Suppl. Fig. 2 and 5 - FSC curves have oddities, especially in the corrected curves. The cryo-EM resolution estimates calculated by CryoSPARC for the UL52 '3-mer' and 4-mer, and UL32 tripentamer are likely overestimated. In the PDB validation files each of the deposited structures has a warning for the resolution estimate "The value from deposited half-maps intersecting FSC 0.143 CUT-OFF 4.31 differs from the reported value 3.32 by more than 10 %", suggesting that the resolution estimates are inaccurate. The authors should provide a resolution estimate using loose masks and generate FSC curves using another software program such as RELION's postprocess to provide resolution estimates.

      Suppl. Fig. 6f and 6g - Is there any visible density that might resemble the EGS crosslinking reagent?

      Suppl. Fig. 7f - Statistical analyses absent.

      References.

      Didychuk AL, Gates SN, Gardner MR, Strong LM, Martin A, Glaunsinger BA. A pentameric protein ring with novel architecture is required for herpesviral packaging. Elife. 2021 Feb 8;10:e62261. doi: 10.7554/eLife.62261. PMID: 33554858; PMCID: PMC7889075.

      D'Imprima E, Floris D, Joppe M, Sánchez R, Grininger M, Kühlbrandt W. Protein denaturation at the air-water interface and how to prevent it. Elife. 2019 Apr 1;8:e42747. doi: 10.7554/eLife.42747. PMID: 30932812; PMCID: PMC6443348.

      Gardner MR, Glaunsinger BA. Kaposi's Sarcoma-Associated Herpesvirus ORF68 Is a DNA Binding Protein Required for Viral Genome Cleavage and Packaging. J Virol. 2018 Jul 31;92(16):e00840-18. doi: 10.1128/JVI.00840-18. PMID: 29875246; PMCID: PMC6069193.

      Glaeser RM. PROTEINS, INTERFACES, AND CRYO-EM GRIDS. Curr Opin Colloid Interface Sci. 2018 Mar;34:1-8. doi: 10.1016/j.cocis.2017.12.009. Epub 2017 Dec 22. PMID: 29867291; PMCID: PMC5983355.

      Significance

      General assessment: The strengths of this manuscript are the structural information provided by the cryo-EM maps for the HCMV UL52 and HSV-1 UL32 and the mutagenesis studies that corroborate previous studies for the packaging accessory factor for gammaherpesviruses KSHV and EBV. However, there are limitations. These are centered on whether the structures are representative of UL52 and UL32 complexes produced during replication rather than overexpression in insect cells and stabilization using chemical cross-linking.

      There is a lack of novelty in the context of the herpesvirus packaging factor. The pentameric architecture, DNA binding, zinc fingers (4), and charged residues required for DNA binding were conclusively demonstrated in previous studies (Gardner and Glaunsinger, 2018; Didychuk et al., 2021). Thus, the novelty comes from the different pentameric structures; UL52 4-mer and UL32 tripentamer. However, if these are artefactual structures due to the expression system (mammalian versus insect) used, air-liquid interface induced protein instability, or cross-linking, the novelty is lost. That's not to say the data are not informative for the herpesvirus community.

      Advance: The advance in this manuscript is the new structural information for UL52 and UL32. Even if the higher order complexes are potential artefacts, high resolution structure information for the subunit is especially informative. The mutagenesis data for UL32 are also informative in that they provide important information about a conserved and necessary protein that is needed for herpesvirus replication and has the potential to be used as a novel druggable target.

      Audience: The manuscript will appeal to specialized and broad audiences and could influence research into antiviral therapies for herpesviruses. My field of expertise is herpesvirology, structural biology, and cryogenic electron microscopy modalities.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Bailey et al investigate DNA packaging accessory factors of various herpesviruses. The central findings are the cryo-EM structures of the accessory factors from HSV1 and HCMV. Combined with the corresponding author's previous structure of the KSHV accessory factor, these new findings now provide a window into packaging in all three families of human herpesviruses. They reveal that although the overall structure of a ring of pentameric symmetry is conserved, the overall oligomeric stabilities are not conserved across all herpesviruses. Moreover, the authors report the important finding that basic residues in the ring pore are required for viral replication. Overall, this study represents a strong extension of the authors' study of packaging accessory proteins, with solid data and very few concerns to be addressed.

      Major issues.

      1) There is a high probability that the tripentamer is an artifact of the cross-linking. Because of this, it'd be great to know more about the cross-linking reaction, ideally mass spec identification and quantification of cross-links. This would also address the authors' speculation of contacts that stabilize the tripentamer. Moreover, the authors' claim ("However, mutation of K532A/C535A reduced infectious virion production by half (Fig. 4b), suggesting that the tripentamer interface may play a role in the viral life cycle.") seems to be an overreach. Perhaps this is semantics, but the data just show that these residues play a role in viral replication (albeit not a huge role based on the modest effect).

      2) The density for the potential DNA does not look very convincing, although it still remains the strongest hypothesis. The authors should try to strengthen their argument. Would this putative DNA contact residues that they show are necessary for viral replication? Showing sequence conservation on the structure could help their argument for the shared function of DNA-binding.

      3) My last major issue is stylistic and concerns the descriptions of cryoEM structures. I found that the paper was a bit of a challenge to read when the authors would introduce each structure. It was a bit of a slog to get through. Descriptions of the structures veered off into overly detailed comparisons that required constant comparison with the figure and didn't really advance my understanding past "the outer surfaces of the three orthologs are different." This masked the more interesting aspects of the authors' findings. Perhaps this could be summarized in supplementary figures or a table. Because this is a stylistic suggestion, the authors should feel free to ignore this request.

      Minor comments

      1) The descriptions of structure determination in the text were often unclear. For example, "In the 3-mer map, a poorly-resolved fourth protomer is visible at low contour levels, suggesting that an additional protomer is present but highly flexible in this class (Supplementary Fig. 3a)." Alternatively, it could be that the classification algorithm wasn't able to fully separate particles that were 3-mers from the 4-mers. When describing the structure determination of the HSV1 accessory factor, the authors describe no particles other than the tripentamer. Were there other particles observed? It'd be a bit surprising if all of the protein adopted the tripentamer state. Was symmetry applied, particularly for the tripentamer that appears to have C3 symmetry? This is in the Materials and Methods, but it is not clear why it isn't mentioned when describing the structure determination and results.

      2) Throughout the paper, the authors use the word "remodel" to describe structural differences between orthologs. However, this word usually carries the implication of conformational rearrangement within a protein, and not across orthologs. Please consider a different description.

      3) Figure 2F is confusing and difficult to interpret. It seems that the main point is that these interfaces are conserved, which might be more easily displayed as a standard sequence conservation score mapped onto the structure. I'm also not sure that this figure is necessary as a main figure and could be supplemental.

      4) The authors write "UL32 bound to the shortest probe tested (10 bp, Supplementary Fig. 4a)." This implies that ONLY the shortest probe is bound and that others are not bound. Consider rephrasing.

      5) Frustum is misspellt. ;)

      6) In the discussion, the authors speculate that the variability of the outer surface is due to "virus- or host-specific interactions". I'm confused by "host-specific interactions", because the host is the same for all three viruses. Perhaps the authors mean that the different accessory factors could interact with different host factors? If so, are the authors making a Red Queen argument? If so, it'd be pretty cool to do dN/dS analysis to test that hypothesis.

      Significance

      This paper represents an advance in the field of genome packaging. The herpesvirus packaging mechanism is still mysterious, and the role of this accessory factor is one of the biggest gaps in knowledge. Although this study doesn't uncover the role, this provides new details into the evolution of this factor across the herpesvirus lineages. To me, this window into evolution of this factor is the biggest advance of the work, and tbh I felt that the authors could lean into this a bit more in the discussion section. Are there any differences in the packaging mechanisms of the different herpes families that can be related to their different behavior? Any other molecular evolution analyses (e.g. dN/dS ratio analysis) that could inform their study?

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Bailey et al. present the results of their structural analysis of packaging factors encoded by the β-herpesvirus human cytomegalovirus (UL52) and the α-herpesvirus herpes simplex virus type 1 (UL32). The authors have previously published structures for orthologous proteins in the γ-herpesviruses Kaposi sarcoma associated herpes virus (ORF68) and Epstein-Barr virus (BFLF-1), showing both to form pentameric rings having a positively charged central channel. Here HCMV UL52 is found to form 3-mer and 4-mer assemblies that resemble incomplete pentameric rings. The complexes are formed by a screw displacement, however, having both rotation about and translation along the central axis. The structure shows fold conservation with the previously described structures, including preservation of the positively charged central channel.

      Attempts to image HSV-1 UL32 were initially unsuccessful, despite light-scattering analysis indicating the presence of pentamers. DNA binding is shown by EMSA, but these complexes were also not stable for cryo-EM analysis. Chemical crosslinking of the DNA-bound complex was therefore employed, resulting in production of higher-order assemblies, including one comprising three pentamers that was successfully resolved by cryo-EM. Interestingly, focussed classification analysis highlighted the presence of rod-shaped density passing through the central channels of two pentameric rings in this complex. Mutation of the interface that gives rise to the formation of tripentamers reduced progeny virion production by half, leading the authors to suggest that this complex may be a biologically important assembly.

      The importance of zinc-fingers identified in these structures was probed showing that mutation abolishes protein production. Similarly, mutation of the positively charged residues lining the central channel of HSV-1 UL32 greatly reduced or completely ablated progeny virion production in an assay where either WT or mutant UL32 was transfected into cells to complement UL32 knockout virus.

      Overall, I found the manuscript very easy to read and the analysis appears to be expertly performed. I have no substantive criticisms of the work and think it would be suitable for publication in its current form, or subject to some small edits.

      Minor comments

      The authors suggest that the weak 4th protomer in the HCMV UL52 3-mer map is a consequence of flexibility. This may be the case, but it may also be the case that the class is polluted with 4-mer particles leading to reduced occupancy. Erasing the weak density and running a multi-model 3D classification providing the erased 3-mer and a 4-mer starting map may separate these.

      I found the supplemental figure showing the DNA in the tripentamer map too small; this is an interesting finding and should be shown more clearly.

      Significance

      Herpesviruses are important pathogens of humans and are biologically complex systems. The structural analysis of this essential packaging co-factor is an important contribution to the field. It builds on the previous work by this group concerning the packaging factors of the gamma-herpesviruses KSHV and EBV. I consider this paper to be high-quality and worthy of publication in a very good journal with a microbiology/virology, biochemistry or molecular biology focussed readership. The process of genome packaging in herpesviruses is not as well characterised as in bacteriophages (and even in that case it is not well understood). This work provides important knowledge that will support future studies on this critical process in herpesvirus replication.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Responses to the reviewers

      First of all, we would like to thank all the reviewers for their valuable and constructive comments on our manuscript. We have considered each comment and revised the manuscript accordingly. We respond to each comment below in blue font.

      To Reviewer #1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *Iwase et al have used multiomics and spatial transcriptomics to comprehensively map neural crest cell contributions to the mouse heart and great arteries. This careful and detailed analysis reveals changes in the transcriptional profile of neural crest cells as they give rise to different regions and cell types in the heart and great vessels. The study significantly builds on a number of recent scRNA-seq analyses of neural crest cell development and includes development of a new informatic tool for regulatory network investigation. Among the new findings documented are downregulation of Hox gene expression in intracardiac crest cells and regulation of Sox9 by Meis transcription factors. Addressing the following points would improve clarity and accessibility.*

      Thank you for your encouraging feedback and comments. We have responded to your comments below.

      *In Figure 1C it is difficult to visualize all the colors given the mixed contribution of NCC and nonNCC cells to mesenchyme. Please also show YFP transcript distribution in NCC versus nonNCC plots. In addition, it would be helpful to show plots for both NCC and nonNCC for Gata4 and Tbx20.*

      To improve visualization, we separated EYFP-positive NCCs and EYFP-negative non-NCCs into distinct plots (Figure 1e), rather than displaying EYFP transcript distribution within a single combined plot. In addition, we have included separate plots for both NCCs and non-NCCs showing the expression of Gata4 and Tbx20 (Figure 1g, h).

       Furthermore, in the revised manuscript, we subdivided the original clusters c5 and c13 into two subclusters each, resulting in a total of 23 clusters in the UMAP shown in Figure 1. This refinement was introduced to facilitate clearer interpretation and subsequent analyses.
      

      *The authors identify a cardiomyocyte cell cluster in their integrated NCC scRNA-seq plots. Are these cells labelled by Wnt1-Cre in the authors' own dataset? Is the trajectory analysis informative as to the steps preceding acquisition of cardiomyocyte fate?*

      A total of 20 EYFP-positive NCCs in our own dataset were assigned to the cardiomyocyte cluster in the integrated UMAP. Of these, 6 cells were located within the cardiomyocyte cluster (c21), while the remaining cells were classified as pharyngeal mesenchyme but positioned in close proximity to c21 in the original UMAP (Figure 1e). Although this observation is consistent with previous reports showing that NCCs give rise to cardiomyocytes, the small number of cells precluded meaningful characterization or trajectory analysis of cardiomyocyte fate acquisition. Accordingly, we have addressed this point only briefly in the revised manuscript as follows:

      Only a few NCCs were detected within cardiomyocyte clusters, which were predominantly composed of non-NCCs, consistent with previous reports demonstrating NCC differentiation into cardiomyocytes15,16. The overall number of cardiomyocytes was low, likely reflecting the restricted sampling of the cardiac outflow tract (Figure 1a).

      (P6, L20–25)

      *Linked with this point, is it possible that there are nonNCC cells in the integrated plots? Of note, many of the NCC genes overlap with genes that have also been shown to be expressed in mesodermal cardiac progenitors (including Osr1, Pparg, Dlk1, Tcf21, Ebf2, Tbx20, Sox9). For example, is it possible to distinguish NCC derived smooth muscle within the heart from cells originating from the second heart field that may express smooth muscle genes? Cluster 27 for example appears broadly expressed in the region of ventricular outlets in Figure 3. Comparison with YFP transcript distribution may be helpful here.*

      In principle, non-NCCs were excluded from the integrated plots shown in Figure 3. However, we cannot completely rule out the possibility that a small number of non-NCC cells were inadvertently included, for example due to false-positive signals during cell isolation. In our dataset, NCCs and non-NCCs were stringently distinguished based on FACS profiles, detection of EYFP transcript reads in the RNA-seq data, and rigorous exclusion of doublets and low-quality cells.

       Regarding the distinction between NCC- and non-NCC-derived SMCs, a key challenge lies in defining comparable cell populations. We first validated the annotations of SMC clusters (C4, C23, and C27) using immunostaining for Myh11, Sost, and Reln, confirming consistency with their transcriptomic identities. We then re-clustered the SMC populations and projected non-NCC SMCs (clusters 4 and 20 in Figure 1d) onto this UMAP. These non-NCC SMCs were mapped to clusters corresponding to great artery and coronary artery SMCs (C27 and C23, respectively).
      
       However, we lack confidence that these projected populations are directly comparable. For example, non-NCC SMCs mapped to C23 or C27 may not necessarily represent bona fide coronary or great artery SMCs from equivalent anatomical regions, and could include other SMC subtypes such as venous SMCs or pericytes. Given the known regional heterogeneity of SMCs and the absence of strict spatial matching criteria, such comparisons would be difficult to interpret. This limitation is further compounded by the relatively small number of cells available.
      
       For these reasons, we focused on spatial validation of cluster annotations by immunostaining in this study, and have reserved detailed comparisons between NCC- and non-NCC-derived SMCs for future work. We believe this does not detract from the overall consistency or value of the present study.
      

      *Can the authors add any validation of key expression patterns, for example using fluorescent in situ hybridization?*

      Figures 2n-y present Xenium-based multiplexed fluorescent in situ hybridization data that validate the spatial expression patterns of marker genes characterizing NCC derivatives in pharyngeal mesenchyme, intracardiac mesenchyme, and SMC populations. In addition, we have incorporated new Xenium images highlighting key gene expression patterns in the aorticopulmonary (AP) septum at E12.5 (Figure S4), supporting the annotation of cluster C16 in the integrated UMAP as corresponding primarily to the AP septum. We have also added immunostaining data for Myh11, Sost, and Reln to further validate the annotations of SMC clusters (see the response above). Together, these data provide independent spatial confirmation of the transcriptional signatures identified in our single-cell analyses. Based on these data, we revised the relevant section of the Results as follows:

      The SMC clusters, which were continuous with the pharyngeal mesenchyme via transitional populations in the UMAP, were identified by high expression of the mature SMC marker Myh11 (Figure 3i). Differential gene expression analysis further distinguished individual clusters (Figure S4a-g). Among these, C27 displayed a transcriptomic profile characteristic of the great artery SMCs, including high expression of Sost (Figure S3j). C4 was enriched for Tfap2b and Ptger4 (Figure S3j), markers of the ductus arteriosus SMCs21,22, supporting its annotation. C0 and C7 likely represent transitional states between pharyngeal mesenchyme and differentiated lineages, potentially bifurcating toward great artery SMCs or cardiac cushion mesenchyme (Figure S3a and Table S6). C23 was characterized by high expression of Gja4, a marker of coronary artery SMCs, along with pericyte markers Kcnj8 and Rgs5 (Figure 3j and Figure S3k), corresponding to the cluster similarly annotated by Chen et al14. In addition, C23 was also distinguished from C4 and C27 by its expression of Reln (Figure S4d).

      Immunostaining supported these cluster annotations. Sost expression was observed in great artery SMCs but not in ductus arteriosus and coronary artery SMCs, whereas Myh11 expression was higher in ductus arteriosus and coronary artery SMCs than in aortic SMCs (Figure S4f-m). Furthermore, Reln expression was restricted to coronary artery SMCs (Figure S4n-s).

      (P8, L30 – P9, L14)

      *Please elaborate on the decoded Hox code patterns that appear to be indicative of arch origins. Do the results allow determination of whether the trajectories to different cardiac fates inferred in Figure 3D differ in different arches?*

      In response to the reviewer’s suggestion, we have further elaborated on the decoded Hox code patterns indicative of pharyngeal arch origin and examined whether trajectories toward distinct cardiac fates differ between arch-derived NCC populations (see new supplemental figure).

      To further delineate Hox code patterns associated with pharyngeal arch origin, we stratified the integrated UMAP by distinct Hox expression profiles (Figure S9). Cells expressing any Hox2 paralog, but lacking Hox3–5 paralogs, were defined as PA2-derived preotic NCCs, whereas cells expressing any of Hox3–5 paralogs were classified as PA3/4/6-derived postotic NCCs. Preotic, postotic, and Hox-negative populations were then projected onto the integrated UMAP across developmental stages (E10.5–E14.5). Trajectory inference indicated that transitions toward intracardiac mesenchyme occur earlier in preotic cells (E10.5) than in postotic cells (E11.5), consistent with their known sequential migration into the cardiac cushion8. From E12.5 to E14.5, postotic cells showed a progressive emergence of the aorticopulmonary septum–associated cluster C16 from transitional states. Notably, the proportion of Hox-negative cells increased within intracardiac mesenchyme, except in C16 where Hox expression was retained, supporting the notion that Hox genes are broadly downregulated in cushion-associated intracardiac NCCs (Figure 4k, S9).

      (P11, L24 – P12, L4)
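The stratification rule quoted above can be sketched as a small per-cell classifier. This is an illustrative sketch only, not the authors' code: the function name `classify_cell`, the detection threshold of zero, and the explicit mouse paralog gene lists are assumptions. The quoted text also does not say how cells expressing only other Hox genes are handled; here they fall into the remaining "Hox-negative" category for simplicity.

```python
# Mouse Hox paralog groups used by the rule (gene sets are an assumption
# about what "Hox2" and "Hox3-5" paralogs cover in practice).
HOX2 = {"Hoxa2", "Hoxb2"}
HOX3_5 = {
    "Hoxa3", "Hoxb3", "Hoxd3",
    "Hoxa4", "Hoxb4", "Hoxc4", "Hoxd4",
    "Hoxa5", "Hoxb5", "Hoxc5",
}


def classify_cell(expression, threshold=0.0):
    """Assign one cell to preotic, postotic, or Hox-negative.

    `expression` maps gene names to expression values for a single cell;
    a gene counts as expressed when its value exceeds `threshold`.
    """
    expressed = {gene for gene, value in expression.items() if value > threshold}
    # Any Hox3-5 paralog takes priority: PA3/4/6-derived postotic NCCs.
    if expressed & HOX3_5:
        return "postotic"
    # A Hox2 paralog without Hox3-5: PA2-derived preotic NCCs.
    if expressed & HOX2:
        return "preotic"
    return "Hox-negative"


cells = {
    "cell_a": {"Hoxa2": 2.1, "Sox9": 4.0},
    "cell_b": {"Hoxa2": 1.3, "Hoxb3": 0.8},
    "cell_c": {"Tbx20": 3.2},
}
labels = {name: classify_cell(expr) for name, expr in cells.items()}
print(labels)
```

Checking the Hox3-5 set first encodes the "but lacking Hox3–5 paralogs" condition for the preotic class without a separate negative test.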

      *The authors need to explain why they place an arrow from mesenchymal cluster 18 to 23 in Figure 3D while the trajectory analysis in 3C predicts the opposite direction.*

      RNA velocity analysis of scRNA-seq data is fundamentally based on splicing dynamics. The original framework assumes that transcriptional induction and repression persist long enough for cells to reach active (transcribing) or inactive (silenced) steady-state equilibrium. However, this assumption is often violated during cell differentiation, where transient cell populations frequently exhibit rapidly changing mRNA levels that do not reach steady-state equilibrium. To address this limitation, the scVelo method was developed (Bergen et al., 2020, Nature Biotechnology), and we applied this approach to the integrated NCC datasets in the present study. This analysis successfully inferred directional flows from the pharyngeal mesenchyme toward SMCs and intracardiac mesenchyme through transitional states (Figure 3c,d). However, as the reviewer correctly pointed out, the analysis predicted a directional flow from C23 to C18, apparently opposite to the biological directionality supported by previous findings (see P9, L15–19).

       We consider this discrepancy to reflect intrinsic limitations of RNA velocity analysis. As discussed by Bergen et al., in systems containing multiple lineages and cellular processes, differences in gene regulatory networks among heterogeneous cell states can generate multiple trajectories in phase space owing to distinct splicing kinetics. In addition, incompletely captured splicing kinetics may represent only a limited portion of the overall dynamics, particularly near terminal differentiation states. In such cases, phase portraits of unspliced versus spliced transcripts may appear nearly linear rather than curved, potentially leading to erroneous assignment of positive or negative RNA velocities. Consistent with this limitation, we obtained opposite directionalities between C18 and C23 depending on whether steady-state or dynamical models were applied and according to different parameter settings. Through these repeated computational re-evaluations of lineage directionality, we concluded that RNA velocity analysis is suitable for capturing the global landscape of differentiation flow, but that accurate inference of local lineage directionality may require careful model selection and parameter optimization to ensure consistency with established biological evidence.
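The sensitivity described above can be illustrated with a toy version of the steady-state formulation, in which the velocity of a gene in a cell is the unspliced count minus a fitted ratio times the spliced count. This is a deliberately simplified sketch on made-up numbers, not scVelo's implementation: the function names are hypothetical, and the ratio is fitted by least squares through the origin over all cells rather than over extreme quantiles.

```python
def fit_gamma(unspliced, spliced):
    """Least-squares slope through the origin for the unspliced vs spliced portrait."""
    numerator = sum(u * s for u, s in zip(unspliced, spliced))
    denominator = sum(s * s for s in spliced)
    return numerator / denominator


def steady_state_velocity(unspliced, spliced, gamma):
    """Per-cell velocity under the steady-state model: v = u - gamma * s."""
    return [u - gamma * s for u, s in zip(unspliced, spliced)]


# Toy gene with a perfectly linear phase portrait (no curvature).
spliced = [1.0, 2.0, 3.0, 4.0]
unspliced = [0.5 * s for s in spliced]

gamma = fit_gamma(unspliced, spliced)

# A 10 % change in the fitted ratio flips the sign of every velocity.
v_low = steady_state_velocity(unspliced, spliced, 0.9 * gamma)
v_high = steady_state_velocity(unspliced, spliced, 1.1 * gamma)

print(gamma, v_low, v_high)
```

When the portrait is nearly linear, every cell lies close to the fitted line, so small changes in the model or its parameters are enough to reverse the inferred direction.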
      
       To avoid arbitrariness and potential confusion, we removed the arrow between C18 and C23 from the revised Figure 3d. Instead, we now describe the observed continuity between these populations in the Discussion section as follows:
      

      Notably, the UMAP revealed a continuum between C23 and C18 within the intracardiac mesenchyme population. Given previous findings that the proximal coronary artery SMCs originate from preotic NCCs8 and that pericytes give rise to coronary artery SMCs23, this connection likely represents a differentiation trajectory from intracardiac mesenchyme to coronary artery SMCs via a pericyte-like intermediate stage.

      (P9, L15–19)

      The continuity between intracardiac mesenchyme and coronary artery SMCs through a pericyte-like intermediate state is consistent with previous developmental studies showing that proximal coronary artery SMCs originate from preotic NCCs and may arise through pericyte intermediates8,21.

      (P18, L24-27)

      *The authors nicely show downregulation of Hox gene expression in NCC cells entering the heart. Can they add discussion of any insights into this from prior studies of loss or gain of Hox gene function?*

      We have added the following discussion on the roles of anterior Hox genes in cardiovascular development, together with appropriate references on loss or gain of Hox gene function:

      The regional identities of pharyngeal NCCs that contribute to cardiac development are established by Hox genes and their associated regulatory networks. Genetic studies have demonstrated essential roles for the anterior Hox genes in patterning the pharyngeal arch artery system and semilunar valve structures. Loss of Hoxa1 and Hoxb1 results in severe defects in pharyngeal arch artery development42, whereas ectopic or sustained expression of Hoxb1 in NCCs disrupts cardiovascular morphogenesis and causes malformations of the great arteries and semilunar valves43. Likewise, Hoxa3 contributes to proper patterning of the pharyngeal arch region and its NCC derivatives44,45. These findings underscore the importance of precise spatial and temporal regulation of Hox genes during cardiovascular development. Among pharyngeal NCCs contributing to cardiac development, cushion-independent NCC derivatives (great artery SMCs and the aorticopulmonary septum) retain their origin-specific Hox codes. In contrast, cushion-associated NCC derivatives (coronary artery SMCs and valvular/subvalvular interstitial cells) downregulate Hox expression and transition toward region-specific GRNs involving TFs such as Tbx20 and Gata4, whose expression is known to be induced by BMP signaling in cardiomyocytes46,47. Bmp2 and Bmp4 are expressed in the regions of the pericardial reflection traversed by NCCs en route to the cardiac cushion48. Together, these observations suggest that appropriate repression of Hox programs, coupled with activation of cardiac-specific regulatory networks, is required for normal differentiation of cushion-associated NCC derivatives.

      (P18, L30 – P19, L16)

      • Roux, M. et al. Hoxa1 and Hoxb1 are required for pharyngeal arch artery development. Mech. Dev. 143, 1–8 (2017).
      • Zaffran, S., Odelin, G., Stefanovic, S., Lescroart, F. & Etchevers, H. C. Ectopic expression of Hoxb1 induces cardiac and craniofacial malformations. genesis 56, (2018).
      • Chisaka, O. & Capecchi, M. R. Regionally restricted developmental defects resulting from targeted disruption of the mouse homeobox gene hox-1.5. Nature 350, 473–479 (1991).
      • Kameda, Y., Watari-Goshima, N., Nishimaki, T. & Chisaka, O. Disruption of the Hoxa3 homeobox gene results in anomalies of the carotid artery system and the arterial baroreceptors. Cell Tissue Res. 311, 343–352 (2003).

        In addition, we previously generated conditional Hoxa2 overexpression mice and demonstrated that ectopic Hoxa2 expression in Hox-negative PA1 cranial neural crest derivatives induced PA2-like structures, indicating a partial homeotic transformation (Kitazawa et al., Developmental Biology, 2015; 10.1016/j.ydbio.2015.04.007). Because cardiovascular phenotypes were not examined in that study, we have now resumed breeding of these mice for detailed cardiovascular phenotypic analyses.

        In parallel, we have also established knockout mice for a downstream target of Hoxa2, which are expected to complement the Hoxa2 gain-of-function model and provide further insight into the regulatory mechanisms underlying cardiac NCC differentiation and patterning. Although we have obtained preliminary observations from these models, a comprehensive analysis is still ongoing, and we therefore prefer to reserve these results for a future study with more detailed investigation.

      *Figure 3Y could be simplified to more clearly distinguish the two types of Meis binding sites. For example, it may be helpful to reorder the mesenchymal cell types based on Hox expression.*

      To improve clarity and better distinguish the two types of Meis binding sites, we have reordered the heatmap of motif enrichment based on hierarchical clustering with the updated JASPAR2024 database. In parallel, we have revised the heatmap of transcription factor gene expression to provide a more consistent and interpretable presentation. These diagrams are now presented as Figures 4k and 4l of the revised manuscript.

      *The authors provide nice in vitro and in vivo evidence for an upstream role of Meis transcription factors in regulating Sox9 expression. Can the authors identify from the enhancer sequence (or their transcriptomic dataset) any of the non-Hox transcription factors that Meis may be working with here? Please discuss the significance of Sox9 expression in epicardium driven by the same enhancer. Might this regulation also operate in second heart field progenitor cells where both genes are expressed? It is not evident in Figure 7 that Sox9-EGFP is also expressed in epicardium.*

      The distal Sox9 enhancer containing a Meis2 binding site that we identified (chr11-112850240-112851186) also contains several consensus motifs, including predicted Hand2- and Nfatc1-binding sites. However, to our knowledge, these transcription factors have not been reported as non-Hox partners of Meis proteins. Their recruitment to this enhancer, as well as potential cooperative interactions with Meis transcription factors, was not examined in the present study and remains a subject for future investigation.

       As the reviewer pointed out, Sox9 was expressed not only in intracardiac NCCs but also in the epicardium (revised Figure S13). Consistent with this expression pattern, the distal *Sox9* enhancer was accessible in both intracardiac mesenchyme and epicardial cells (revised Figure S13). Notably, *Wt1*low/*Sox9*high mesenchymal cells, likely derived from the epicardium via epithelial-mesenchymal transition, also exhibited chromatin accessibility at this enhancer comparable to that observed in NCC-derived mesenchymal cells. These findings suggest that the same regulatory element may function across multiple cardiac lineages.
      
       We have addressed these points in the revised manuscript as follows, including additional supporting data in the supplementary figures.
      

      *Enhancer activity in the epicardium corresponds to Sox9 expression and an open chromatin peak at the putative distal enhancer region in clusters 22 and 5 in Figure 1d, which represent Wt1high epicardial cells and intracardiac mesenchyme likely including Wt1low epicardial EMT derivatives, respectively (Figure S13). *

      (P16, L28–32)


      • Could this approach yield similar data for Osr1? Please clarify if there is any experimental evidence supporting the predicted negative regulation of Sox9 by Osr1 in the heart illustrated in Figure 8. *

      There are currently no experimental data demonstrating Sox9 repression by Osr1 in the heart. However, such an effect has been reported in tongue and limb mesenchyme (Liu et al., PNAS, 2013), as noted in P16, L5–7. Although direct experimental validation, such as Osr1 overexpression in cardiopharyngeal NCCs, would provide stronger evidence, preparation of this specific NCC lineage is difficult. Given this limitation, we instead performed in silico gene perturbation analysis using CellOracle, which predicted antagonistic roles for Sox9 and Osr1 during lineage bifurcation from pharyngeal NCCs. Because this regulatory relationship has not yet been experimentally validated in the cardiac context, we revised the illustration accordingly by adding a question mark to indicate the hypothetical nature of this antagonism (Figure 9 in the revised version).

      * Concerning the links between valve mesenchyme and skeletogenic programs it would be relevant to cite the earlier work of Lincoln and Yutzey (reviewed in PMID: 16643886): *


      We cited the suggested work in the relevant portion of the Discussion section as follows:

      By analogy, Sox9high/Scxhigh NCCs at the base of semilunar valves may form a structural attachment unit linking cushion tissues to valvular leaflets57.

      (P20, L30–32)

      • Lincoln, J., Lange, A. W. & Yutzey, K. E. Hearts and bones: Shared regulatory mechanisms in heart valve, cartilage, tendon, and bone development. Dev. Biol. 294, 292–302 (2006).* *
      • In order to increase accessibility of the dataset the authors are encouraged to include a browser link. *

      We agree with the reviewer that improving dataset accessibility is important for facilitating exploration of NCC diversity. We have already deposited our original FASTQ sequence files and count matrices in the DDBJ and GEO repositories. In addition, we will upload our gene expression datasets projected onto the UMAP to the UCSC Cell Browser, enabling readers to more easily visualize and interrogate the dataset.

      Minor points:

      *13. The authors could rephrase the title since the term topographical genetic switch is unclear. *


      We thank the reviewer for this suggestion. To improve clarity, we revised the title as follows:

      Hox–Meis-relayed spatial gene regulatory transition underlies cardiopharyngeal neural crest diversification revealed by multimodal analysis

      * 14. In the introduction, with reference to the De Bono study, please note that Tbx1 was shown to regulate pharyngeal NCC differentiation stage transitions non-cell autonomously. *

      According to the reviewer’s suggestion, we rephrased the relevant section of the introduction as follows:

      De Bono et al. elaborated the transition of pharyngeal NCCs through multiple differentiating stages toward SMC fates, identifying Tbx2 and Tbx3 as key TFs in this process13. They also showed that Tbx1, the gene for 22q11.2 deletion syndrome, regulates pharyngeal NCC differentiation stage transitions non-cell autonomously13.

      (P5, L1–4)

      Reviewer #1 (Significance (Required)):

      *Iwase et al have used multiomics and spatial transcriptomics to comprehensively map neural crest cell contributions to the mouse heart and great arteries. This careful and detailed analysis reveals changes in the transcriptional profile of neural crest cells as they give rise to different regions and cell types in the heart and great vessels. The study significantly builds on a number of recent scRNA-seq analyses of neural crest cell development and includes development of a new informatic tool for regulatory network investigation. Among the new findings documented are downregulation of Hox gene expression in intracardiac crest cells and regulation of Sox9 by Meis transcription factors. *

      Again, thank you for giving us the opportunity to strengthen our manuscript with your valuable comments and queries. We have worked hard to incorporate your feedback and hope that these revisions persuade and satisfy you.

      To Reviewer #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary *In this manuscript, Iwase et al. cleverly make use of different modalities, spatial transcriptomics and single-cell omics datasets, in conjunction with a well-established Wnt1-Cre;R26R-EYFP line to trace neural crest cells (NCCs) contributing to the cardiovascular system during embryonic development in the mouse. By doing so, the authors identified a bifurcation between cardiac NCCs contributing to the OFT cushions and forming the aorticopulmonary (AP) septation complex. Thus, the authors split "intracardiac NCCs" into two different NCC programs/compartments, even though both reside in the broad OFT region. The NCCs that enter and associate with the OFT cushions undergo a Hox off transition (Hox-positive to Hox-negative once intracardiac), with a corresponding shift in Meis binding and GRN wiring. The authors propose these cells pass through a Meis2-Sox9-Scx "skeletogenic progenitor-like" intermediate and contribute to semilunar valves and coronary artery smooth muscle. By contrast, the NCCs assigned to aorticopulmonary septum (APS) formation and great vessel smooth muscle retain a distinct Hox code. *

      Thank you for your encouraging feedback and comments. We have responded to your comments below.

      Major Comments *1. The manuscript would benefit from clearer delineation between the different NCC contributions, particularly for non-specialist readers. The distinction between (i) CNCCs in OFT cushions and (ii) CNCCs forming the aorticopulmonary (AP) septation complex is not adequately explained. While both populations contribute to OFT septation, according to the authors, they represent distinct compartments with different developmental trajectories. The authors could clarify this using anatomically labelled hearts at the stages they conduct their analysis, along with additional text and schematics explaining what is meant by each compartment. This would greatly enhance the accessibility of the manuscript. *

      To clarify the distinction between the two components of intracardiac NCCs, cushion-associated mesenchyme and cushion-independent aorticopulmonary (AP) septum, we substantially revised the description of cluster characterization (P9, L24–P10, L6). We added new figures (Figure S5) showing their spatial relationships and distinct gene expression signatures, including E12.5 Xenium data demonstrating Vegfc expression in cushion-associated NCCs and Tcf24 expression in AP septum NCCs.

       In addition, we revised the schematic diagram of intracardiac NCC distribution in Figure 4m and added Figure S9, which spatially delineates three major NCC trajectories with distinct Hox codes: (1) migration of preotic NCCs into the outflow tract cushions, (2) migration of postotic NCCs into the outflow tract cushions, and (3) protrusion of the AP septum from the dorsal wall of the aortic sac. We also incorporated additional explanatory text and cited relevant review articles on cardiac outflow tract development. We hope that these revisions substantially improve the clarity and accessibility of the manuscript, particularly for non-specialist readers.
      
      • The Methods state that "Decomposition of cell clusters of scRNA-seq was performed by RCTD to map them onto the Xenium dataset," but this description is insufficient. The authors should clarify whether RCTD was applied separately for each developmental stage (i.e., E11.5 scRNA-seq reference for E11.5 Xenium, E12.5 reference for E12.5 Xenium), or whether a pooled reference was used across stages. This clarification is important because RCTD performance depends critically on the correspondence between reference and target datasets. Using a multi-stage integrated reference to deconvolve stage-specific spatial data could introduce artifacts, as cell states and cluster compositions vary considerably across developmental timepoints. *

      The pooled scRNA-seq data of NCCs or non-NCCs at E11.5 and E12.5 were used as the RCTD reference for spatial allocation in the Xenium dataset. We agree with the reviewer that a multi-stage integrated reference may introduce artifacts. In fact, the E11.5 and E12.5 datasets were similarly distributed in UMAP space and exhibited similar transcriptomic signatures, whereas those at E14.5 and E17.5 showed different characteristics in the integrated UMAP in Figure 1c. Therefore, we used only the E11.5 and E12.5 datasets for RCTD decomposition. We added the sentence below to the Methods section.

      The pooled scRNA-seq data of NCCs or non-NCCs at E11.5 and E12.5 were used for RCTD.

      (P33, L6)


      • Figure 2j-m needs annotations and schematics. It is currently very difficult to identify the different compartments. See Figure C in Chen et al. for an example of this approach. In addition, what are the fine clusters from 1 to 20? Which ones are NCC-derived? *

      We added anatomical annotations to the revised Figure 2a, b, j–m to facilitate identification of the different compartments. The fine clusters labeled 0–20 in the previous Figure 2j–m corresponded to the multiome clusters (NCC and non-NCC) shown in Figure 1d. In the revised manuscript, the same dataset was re-clustered into 23 clusters (0–22), which were subsequently used for decomposition analysis to predict cell compartments with maximum likelihood. We also revised the color scheme of the segmented cells in Figure 2j–m to improve visual distinction between compartments and facilitate interpretation of the spatial distribution patterns.

       Putative NCCs were identified through the following procedure:
      
      1. EYFP expression was estimated across 39 cell types in the Xenium dataset by integration with the single-cell multiome dataset (including both NCCs and non-NCCs) using Tangram.
      2. Xenium clusters enriched for EYFP expression, defined as clusters whose mean estimated EYFP level exceeded the threshold corresponding to the 65th percentile across all spots, and consistent with known neural crest derivatives were extracted as putative NCC populations.
      3. For each spot within these EYFP-enriched Xenium clusters, RCTD was used to estimate the corresponding multiome cluster identity.

      Thus, the diagrams in Figure 2j–m indicate the most likely multiome subpopulation assignment for each putative NCC spot, rather than categorizing the multiome clusters themselves as NCC- or non-NCC-derived. This description has been included in the Methods section (P33, L6–14).
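
      The selection logic in steps 2 and 3 above can be sketched in a few lines of Python. This is only an illustrative outline, not the code used in the study: the function name, the array inputs (per-spot estimated EYFP values, per-spot Xenium cluster labels, and a spot-by-cluster matrix of RCTD weights), and the use of NumPy are all assumptions, and the manual check that enriched clusters are consistent with known neural crest derivatives is not modeled.

```python
import numpy as np

def putative_ncc_assignments(eyfp, xenium_cluster, rctd_weights, percentile=65.0):
    """Hypothetical helper: pick EYFP-enriched Xenium clusters and assign
    each of their spots to the most likely multiome cluster."""
    eyfp = np.asarray(eyfp, dtype=float)
    xenium_cluster = np.asarray(xenium_cluster)
    rctd_weights = np.asarray(rctd_weights, dtype=float)

    # Threshold: the given percentile of estimated EYFP across all spots.
    threshold = np.percentile(eyfp, percentile)

    # Keep Xenium clusters whose mean estimated EYFP exceeds the threshold.
    enriched = [c for c in np.unique(xenium_cluster)
                if eyfp[xenium_cluster == c].mean() > threshold]

    # For spots in enriched clusters, take the multiome cluster with the
    # largest weight (assumed here to stand in for the RCTD estimate).
    spot_idx = np.flatnonzero(np.isin(xenium_cluster, enriched))
    assigned = rctd_weights[spot_idx].argmax(axis=1)
    return enriched, spot_idx, assigned
```

      In this sketch the output is a per-spot assignment, which matches the point made above: the multiome clusters themselves are not labeled as NCC- or non-NCC-derived.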

      * The panels show pharyngeal markers, OFT/intracardiac markers, and SMC markers in Figures 2n-y, but could the authors show the proportion of NCC-derived (YFP+) cells for each cluster? Could the authors also map only the YFP+ cells on the Xenium data? It would be useful to see the proportion of YFP-positive (NCC-derived) cells for each delineated compartment. YFP-positive cells appear to exist at the boundary between LV and RV in the septum, this observation would benefit from proper quantification. *

      Our Xenium analysis could not detect EYFP signals; therefore, as noted above, we estimated EYFP expression by integrating the scRNA-seq and Xenium datasets using Tangram. To clarify the relative EYFP enrichment of each Xenium cluster, we summarized the mean estimated EYFP expression as bar plots (Revised Figure S2q, r). Consistent with their established neural crest origin, Xenium clusters annotated as neuron (38), ganglion (32), and Schwann cell (17) showed high estimated EYFP expression. In addition, Xenium clusters 15 (SMC) and 25 (cushion mesenchyme) were also enriched for EYFP expression.

       Unfortunately, currently available algorithms for Xenium data analysis do not reliably allow visualization or extraction of gene expression profiles exclusively from putative EYFP-positive cells. Instead, we improved the visualization of the spatial distribution of putative EYFP-positive cells by replacing centroid-based signal display with segmentation-based rendering (Revised Figure 2f, g).
      
       As pointed out by the reviewer, putative YFP-positive cells appear to be present at the boundary between the LV and RV within the interventricular septum. However, the estimated EYFP signals in this region were substantially lower than those observed in well-established neural crest derivatives, suggesting that most of these signals likely represent background noise, although a minor population of sparsely distributed neural crest-derived cells cannot be excluded. More precise characterization of potential neural crest derivatives in the ventricular region will require future investigation.
      

      * Figure 3 is confusing because it integrates data from multiple overlapping stages (E8.5 to P7). While the authors identify distinct compartments, pharyngeal mesenchyme, intracardiac mesenchyme, and SMCs, it is unclear why stages beyond E14.5 and E17.5 (corresponding to the initial single-cell omics and Visium analysis) were included. Although leveraging additional datasets is a clever approach, the integration of data from such disparate developmental timepoints confounds interpretation. For example, Cluster 6 appears to include cells from both P7 and E12.5 stages. Given that the Visium data represent hearts at E14.5 and E17.5, it is problematic to map clusters derived from other stages onto these spatial datasets. *

      We integrated data from multiple overlapping stages (E8.5 to P7) in Figure 3 to capture both cell identities that persist across time and stage-specific features, particularly within intracardiac mesenchyme and SMC populations. This approach enabled us to assess core lineage relationships, including trajectories linking pharyngeal mesenchyme to intracardiac mesenchyme and SMCs, each comprising multiple distinct subpopulations. These findings support the biological relevance of the integration as a framework for understanding lineage relationships across developmental time. However, we agree with the reviewer that it is problematic to map clusters derived from other stages onto these spatial datasets. For this reason, we restricted the spatial mapping analysis to cells from E14.5 and E17.5 within the integrated dataset, thereby ensuring consistency with the developmental stages represented in the Visium data. To address this concern, we have clarified this point in the Methods section by adding the underlined words in the following sentence.

      Spatial mapping of scRNA-seq data onto Visium sections was performed using only E14.5 and E17.5 datasets and the RCTD algorithm, as described above.

      (P34, L12–13)
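
      As a minimal illustration of this restriction, the following Python sketch selects only the cells whose stage label matches the Visium sections before any spatial mapping is run. The function name and the list-of-labels input are hypothetical, and the mapping step itself (RCTD) is not shown.

```python
def select_reference_cells(cell_stages, allowed_stages=("E14.5", "E17.5")):
    """Hypothetical helper: return indices of cells whose stage label is in
    the allowed set, so only stage-matched cells enter the mapping reference."""
    allowed = set(allowed_stages)
    return [i for i, stage in enumerate(cell_stages) if stage in allowed]

# Example: cells from an integrated dataset spanning several stages.
stages = ["E8.5", "E14.5", "P7", "E17.5", "E12.5"]
kept = select_reference_cells(stages)
```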


      • Throughout the manuscript, the authors describe "lineage relationships" between cell populations, but these are in fact developmental trajectories inferred computationally (via UMAP connectivity and RNA velocity), not true lineage relationships. This distinction is critical and should be explicitly stated. *


      We agree with the reviewer that the present “lineage relationships” described in the original manuscript were primarily inferred from computational analyses, including UMAP connectivity and RNA velocity, rather than being directly demonstrated by lineage-tracing experiments. We also acknowledge that some RNA velocity results were not fully consistent with known in vivo developmental trajectories.

       We strengthened the biological validation of the inferred trajectories by incorporating extensive spatial verification of gene expression using immunohistochemistry and Xenium in situ hybridization analyses to confirm the identity and localization of each cell population. We also refined the computational analyses to better resolve regional differences in NCC dynamics and added comprehensive schematic illustrations based on established models of heart development from previous studies, with appropriate citations throughout the manuscript.
      
       In response to the reviewer’s comment, we have also carefully rephrased the relevant descriptions in the revised manuscript to clarify that the observed relationships represent computationally inferred developmental trajectories rather than definitive lineage relationships, as follows.
      

      (original) RNA velocity analysis in conjunction with developmental context, revealed lineage relationships among these groups (Figure 3c, d).

      (revised) RNA velocity analysis in conjunction with developmental context, inferred global lineage relationships among these groups (Figure 3c, d), consistent with developmental trajectories in vivo.

      (P8, L21–23)

      (original) …, we present a comprehensive map of cardiopharyngeal NCC lineages …

      (revised) …, we present a comprehensive map of cardiopharyngeal NCC populations …

      (P18, L1)

      (original) Overall, this study proposes a new framework for understanding cardiac NCCs heterogeneity based on the association with the cardiac cushion and the accompanying transition in Hox gene expression and regulatory programs. Our findings provide a basis for systematically dissecting the developmental diversity of cardiac NCCs.

      (revised) Overall, this study proposes a new framework for understanding cardiac NCCs heterogeneity based on developmental route, Hox-code retention, and region-specific regulatory programs. Importantly, the developmental relationships and differentiation pathways described here are inferred from integrated computational analyses, including transcriptomic similarity, UMAP connectivity, and RNA velocity, rather than direct lineage-tracing experiments. Within this framework, our findings suggest distinct differentiation trajectories leading to great artery, ductus arteriosus, and coronary artery SMCs, as well as the aorticopulmonary septum and valvular/subvalvular mesenchyme.

      (P20, L33 – P21, L7)

      We also rephrased additional relevant sections throughout the manuscript in accordance with the reviewer’s comment (please see below).

      * For example, the statement "we propose that intracardiac NCCs within C10 and C22 differentiate via C2 into valvular (C21) and subvalvular (C18) interstitial cells" should be framed as a computational inference, not an established lineage relationship. Without clonal lineage tracing data, these claims cannot be verified. *


      We have changed the word “propose” to “infer” (P10, L24).*

      Similarly, the claim that "NCCs contributing to the AP septum are distinct from other intracardiac NCCs in that they do not populate the cardiac cushions but remain continuous with NCCs populating the distal outflow tract cushion" lacks direct lineage evidence. What experimental data support this assertion? *

      We agree with the reviewer that the original statement lacked direct lineage evidence, although the anatomical distinction between the outflow tract cushions and the AP septum—a protruding structure arising from the dorsal wall of the aortic sac—is well established. In the revised manuscript, we therefore removed this speculative statement and rewrote the section to more accurately describe the developmental process with appropriate references. The spatial and temporal features of AP septum formation are now also illustrated in Figure S9a (UMAP plots and schematic illustration) and described as follows:

      *From E12.5 to E14.5, postotic cells showed a progressive emergence of the aorticopulmonary septum–associated cluster C16 from transitional states. * (P11, L32 – P12, L1)

      In addition, we further validated the identity of the C16 cluster as AP septum–associated NCCs by incorporating additional marker analyses together with spatial verification using Xenium in situ analysis. In addition to Penk and Sfrp2, which were previously reported by Chen et al., C16 also highly expressed Postn, similar to other intracardiac clusters. Furthermore, C16 was characterized by relatively high expression of Tcf24 and low expression of Vegfc. Based on the reviewer’s comment and these additional experimental data, we revised the relevant Results section as follows:

      (original) Within the intracardiac mesenchyme group, C16 exhibited high expression of Penk and Sfrp2 (Figure S3i and Table S6), corresponding to the cluster annotated as the aorticopulmonary septum in the previous study by Chen et al14. This annotation was further supported by enriched expression of Hox4 and Hox5 paralogs, consistent with its origin between PA4 and PA6 (Figure 3t, u). The aorticopulmonary septum originates as a protrusion from the dorsal wall of the aortic sac and is primarily derived from NCCs6,18–20. This septal structure fuses with the distal outflow tract cushions to divide the common arterial trunk into the aortic and pulmonary channels. Notably, NCCs contributing to this septum are distinct from other intracardiac NCCs in that it does not populate the cardiac cushions but remain continuous with NCCs populating the distal outflow tract cushion, suggesting that C16 represents this distinct NCC-derived population.

      (revised) C16 was distinguished by high expression of Penk and Sfrp2 (Figure S3i and Table S6), corresponding to the cluster annotated as the aorticopulmonary septum in the previous study by Chen et al14. This cluster also exhibited robust expression of mesenchymal markers, including Postn, similar to other intracardiac clusters (Figure S3i). In addition, C16 showed relatively high expression of Tcf24 and low expression of Vegfc compared with the other intracardiac clusters (Figure S5a-e). These gene expression features of the aorticopulmonary septum were further validated by Xenium in situ hybridization (Figure S5f-j).

      Unlike other intracardiac NCCs that populate the distal outflow tract cushions, the aorticopulmonary septum originates as a protrusion from the dorsal wall of the aortic sac and is primarily derived from NCCs residing in PA4 and PA66,22–24. This septal structure subsequently fuses with the distal outflow tract cushions to partition the common arterial trunk into the aortic and pulmonary channels. Consistent with this developmental origin, C16 was enriched for the expression of Hox4 and Hox5 paralogs (Figure 3t, u), indicating that NCCs in this population retain their Hox code, in contrast to other intracardiac NCCs, in which most Hox genes were downregulated (see later details).*

      (P9, L24 – P10, L6)

      * In addition, the authors state that what De Bono et al. identified as "outflow smooth muscle" corresponds in their dataset to early intracardiac mesenchymal clusters C2, C10, and C22, present as early as E10.5, when mesenchymal NCC derivatives express immature SMC markers, and that these cells later differentiate into coronary artery SMCs around E14.5 (C23) as well as other non-muscle components. This claim is not verified, whether these cells are indeed the ones differentiating into coronary artery SMCs is based solely on computational inference from C2, C10, C22 to C23. *


      To validate that C23 corresponds to coronary artery SMCs, we performed additional immunostaining analyses for Myh11, Sost, and Reln, which distinguish great artery SMCs, ductus arteriosus SMCs, and coronary artery SMCs, respectively, consistent with their transcriptomic identities. Regarding differentiation of NCC-derived mesenchymal cells into coronary artery SMCs, we previously demonstrated using chick–quail chimera experiments and specific Cre-reporter mouse lines that proximal coronary artery SMCs originate from preotic NCCs rather than postotic NCCs (Arima Y et al. Nature Communications 3:1267, 2012). In addition, coronary artery SMCs have been reported to differentiate through pericyte intermediates (Volz KS et al., eLife 4:1–22, 2015). Consistent with these findings, our present lineage-tracing analyses using Sox9-CreERT2; Ai14 and Scx-CreERT2; Ai14 mice demonstrated that the progeny of Sox9high and Scxhigh intracardiac cushion mesenchymal cells contributed to coronary artery SMCs as well as the surrounding mesenchyme.

       We agree with the reviewer that the relationship between the early intracardiac mesenchymal clusters (C2, C10, and C22) and C23 is primarily inferred from computational trajectory analyses and is not demonstrated by direct clonal lineage tracing. Accordingly, we revised the Discussion to avoid overstatement and to clarify that these lineage relationships are inferred based on computational analyses together with prior experimental findings and the additional validation data described above. The revised text is as follows:
      

      (original) Our integrated map incorporates previously published lineage analyses of cardiac NCCs at early and late stages13,14, providing continuity through complementary single-cell and spatial transcriptomic data, although our interpretation of certain clusters differs from those of prior studies. For example, the cell population identified by De Bono et al. as outflow smooth muscle13corresponds in our dataset to early intracardiac mesenchymal clusters C2, C10, and C22, present as early as E10.5, when mesenchymal NCC derivatives express immature SMC markers. These cells later differentiate into coronary artery SMCs around E14.5 (C23) as well as other non-muscle components. Despite such differences in interpretation, the integrated map robustly captures lineage relationships, supported by accumulated developmental and anatomical evidence regarding cardiac outflow tract formation, particularly in relation to the outflow tract cushion.

      (revised) Our integrated map incorporates previously published lineage analyses of cardiac NCCs at early and late stages13,14, providing continuity through complementary single-cell and spatial transcriptomic data. The present study further extends these datasets by resolving the heterogeneity of intracardiac mesenchymal populations and their lineage relationships. For example, the cell population identified by De Bono et al. as outflow smooth muscle13 corresponds in our dataset to early intracardiac mesenchymal clusters expressing immature SMC markers, which subsequently diverged into multiple derivatives including coronary artery SMCs. In addition, we identified distinct SMC populations corresponding to great artery SMCs, ductus arteriosus SMCs, and coronary artery SMCs, each characterized by unique molecular signatures such as Sost, Tfap2b/Ptger4, and Reln/Gja4, respectively. The continuity between intracardiac mesenchyme and coronary artery SMCs through a pericyte-like intermediate state is consistent with previous developmental studies showing that proximal coronary artery SMCs originate from preotic NCCs and may arise through pericyte intermediates8,21. Together, these findings provide a refined framework for understanding the diversification of cardiac NCC derivatives during outflow tract remodeling.

      (P18, L14–29)

      * Claims regarding marker expression in specific compartments (for example Hapln1 and Postn in cushions) require additional spatial validation at higher resolution than what is currently provided by the Visium data. Moreover, it is unclear whether these data are single-cell resolution; the authors need to clarify this. HCR staining would be ideal to confirm these expression patterns. Currently, all conclusions are based solely on gene expression without orthogonal spatial confirmation. At minimum, the authors should provide references from the literature supporting these expression patterns. *

      As the reviewer suggested, it is important to validate the spatial context of the gene expression patterns observed in the scRNA-seq data. We therefore examined the spatial expression patterns using the Xenium in situ hybridization system. In the scRNA-seq data, Postn was predominantly expressed across the cardiac mesenchyme subpopulations, whereas Tcf24 was specifically expressed in the AP septum (C16) and not in the cushion clusters (C2, C10, C18, C21, and C22). In contrast, Vegfc was expressed in these clusters but not in the AP septum. To confirm these opposing expression patterns, we added Figure S5, which shows Tcf24 and Vegfc expression as revealed by Xenium.

      Following the reviewer’s suggestion, we added the following text to the revised manuscript:

      C16 was distinguished by high expression of Penk and Sfrp2 (Figure S3i and Table S6), corresponding to the cluster annotated as the aorticopulmonary septum in the previous study by Chen et al14. This cluster also exhibited robust expression of mesenchymal markers, including Postn, similar to other intracardiac clusters (Figure S3i). In addition, C16 showed relatively high expression of Tcf24 and low expression of Vegfc compared with the other intracardiac clusters (Figure S5a-e). These gene expression features of the aorticopulmonary septum were further validated by Xenium in situ hybridization (Figure S5f-j).

      (P9, L24–31)

      * Could the authors confirm the absence of the Sox9high/Scxhigh population in AP septum descendants? *

      Sox9high/Scxhigh NCCs are enriched not only in intracardiac NCC clusters C2, C10, and C14, but also in the AP septum-associated cluster C16, as stated in the manuscript as follows:

      In the integrated UMAP, Sox9high/Scxhigh NCCs were enriched in C2, C10, C14 and C16.

      (P17, L7–8)

      However, based on our previous finding that proximal coronary artery SMCs originate from preotic rather than postotic NCCs (Arima Y et al. Nature Communications 3:1267, 2012), we infer that the intermediate population contributing to coronary artery SMCs is more likely derived from intracardiac NCC clusters C2 and C10 than from the pharyngeal arch 4/6-derived AP septum-associated cluster C16. To clarify this interpretation, we have added the following statement to the final paragraph of the Results section:

      Together with our previous report that proximal coronary artery SMCs originate from preotic rather than postotic NCCs8, these results suggest that the intermediate population contributing to coronary artery SMCs likely represents a subset of Hox-downregulated intracardiac NCCs corresponding to clusters C2 and C10.

      (P17, L29–33)

      Minor Comments

      *Could the authors better justify their choice of stages (E11.5 to E17.5) for the single-cell multiomic assay? Given that OFT cushions are already populated by NCCs by E10.5 and that AP septum formation is already underway at this stage (see Development (2007) 134(8): 1593-1604), the rationale for beginning at E11.5 should be explicitly stated. *

      We agree that NCCs have already populated the OFT cushions and that AP septum formation is underway by E10.5. Our selection of stages from E11.5 to E17.5 was intended primarily to enable synchronous comparisons between pharyngeal and intracardiac NCC populations across developmental stages, rather than to capture the earliest spatiotemporal events of cardiopharyngeal NCC lineage establishment. We have clarified this rationale in the revised manuscript by revising the statement as follows:


      (original) To elucidate the spatiotemporal dynamics of cardiopharyngeal NCC lineages, we performed single-cell multiome analysis on pharyngeal and cardiac tissues from E11.5 and E12.5 and ...


      (revised) To compare pharyngeal and intracardiac NCCs across developmental stages and characterize their temporal changes, we performed single-cell multiome analysis on pharyngeal and cardiac tissues from E11.5 and E12.5 and ...

      (P6, L3-5)


      *Spelling errors in Figure 2e: "ambious" should be "ambiguous"; "atrium venticle" should be "atrium/ventricle"; "ventricle" is misspelled in several locations. The clustering resolution is very high, yielding many clusters that are difficult to distinguish based on the colour code alone. What does "ventricle_CL" refer to? What is the "marginal layer"? A clearer legend or supplementary table defining each cluster would be helpful. *

      To improve the clarity of the high-resolution clustering, we added Xenium cluster numbers to Figures 2c and 2d, enabling clusters to be distinguished not only by color but also by their anatomical locations. We also revised the cluster annotations in Figure 2e and corrected all spelling errors, including “ambiguous.” In addition, we replaced “ventricle_CL” with “Cardiac muscle 1” and the formerly misspelled “atrium ventricle CM” with “Cardiac muscle 2.” The annotation “marginal layer” was also revised to “Spinal cord, marginal layer” for clarity.

      *Figure 2j: The image is very dim. *


      We have improved the image quality and added regional annotations to enhance visibility in the revised figure.

      * Figure 3b: What do the numbers correspond to? Additionally, "mesenchyme" is misspelled. *

      We have revised Figure 3b to clarify different characteristics of mesenchymal subpopulations. We have also corrected the misspelling of “mesenchyme.”

      **Referees cross-commenting**

      *I also agree with the other reviewers' comments. Many thanks. *

      Reviewer #2 (Significance (Required)):

      *The overall approach is sound and the datasets generated are valuable resources for the field. The manuscript presents interesting findings regarding NCC heterogeneity in cardiac development.

      As I understand the authors' model: OFT cushion-associated NCCs enter and associate with the OFT cushions, undergoing a Hox-positive to Hox-negative transition, accompanied by a shift in Meis binding and GRN rewiring. These cells pass through a Meis2-Sox9-Scx "skeletogenic progenitor-like" intermediate state and contribute to semilunar valves and coronary artery smooth muscle. Aorticopulmonary septum NCCs, by contrast, retain distinct Hox codes (Hox4/5 enrichment) rather than becoming Hox-negative, and contribute to the AP septum and great vessel smooth muscle.

      Strengths: The integration of multiple omics modalities with lineage tracing is a powerful approach, and the identification of a Hox-dependent bifurcation in NCC fate is a novel conceptual advance.

      Limitations: The reliance on computational trajectory inference without orthogonal lineage validation, combined with the integration of datasets spanning very different developmental stages, limits the strength of some conclusions. The analysis also required more precise anatomical annotations to facilitate accessibility to the readers - to visualise better the distinguishable contribution of the cardiac NCCs to the OFT.

      Advance: The study extends knowledge in the field by providing novel mechanistic insight into neural crest diversification in the context of cardiovascular development. The nature of the advance is primarily mechanistic, identifying a Hox-Meis regulatory switch that distinguishes cushion-associated from cushion-independent NCC lineages.

      Audience: This work will be of interest to a specialised audience interested in neural crest cells and developmental biologists using omics approaches to address cell fate diversification in complex tissues.

      Reviewer Expertise: Developmental biology, lineage analysis, mouse genetics. I do not have the expertise to assess the computational methods used in this paper. *

      Again, thank you for giving us the opportunity to strengthen our manuscript with your valuable comments and queries. We have worked hard to incorporate your feedback and hope that these revisions persuade and satisfy you.

      To Reviewer #3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: *Iwase et al. present a comprehensive multiomic analysis of cardiac neural crest cell (NCC) differentiation during cardiopharyngeal development. Using Wnt1Cre;R26R-EYFP mice, the authors isolated NCCs and non-NCCs at E11.5, E12.5, E14.5 and E17.5 stages and performed single-cell RNA-seq, ATAC-seq, and spatial transcriptomics analyses. Spatial resolution of NCC-derived populations was achieved using Xenium (E11.5-E12.5) and Visium (E14.5-E17.5) platforms. Integration of single cell and spatial datasets identified distinct NCC-derived populations with defined spatial organization within the pharyngeal and intracardiac regions. The study concludes that Hox gene patterning underlies NCC subpopulation identity during cardiopharyngeal development and reveals a developmental transition from Hox-dependent to Hox-independent transcriptional regulation. Furthermore, the authors identify a Meis2-Sox9-Scx gene regulatory network making a skeletogenic progenitor-like intermediate that contributes to coronary smooth muscle and semilunar valve formation.

      While the dataset is comprehensive and technically strong, several key conclusions are not always convincingly supported by enough data. As a result, some claims appear speculative and would benefit from additional experimental validation to strengthen the proposed developmental models.

      I would strongly encourage authors to consider the following points to provide additional details that will strengthen their study: *

      Thank you for your encouraging feedback and comments. We have responded to your comments below.

      Major comments:

      *- The authors should provide detailed FACS gating strategies and sorting conditions used to selectively isolate EYFP-positive and EYFP-negative NCC populations, including representative plots and information on exclusion criteria (e.g., doublets, dead cells). *


      We added the sorting gates to the revised Figure S1 and described the detailed FACS gating strategy and sorting conditions in the revised Methods section as follows:


      (original) EYFP-positive and -negative single cells were sorted using a FACSAria II or FACSMelody (BD Biosciences), freshly processed or cryopreserved.


      (revised) Single-cell suspensions were stained with 7-AAD (BD Pharmingen) for 3 min at 4 °C, and EYFP-positive and -negative single cells were sorted using a FACSAria II or FACSMelody (BD Biosciences). The sorting strategy was as follows:

      *Step 1. All events were gated by forward scatter (FSC) and side scatter (SSC) including area (A), height (H), and width (W) to obtain FSC singlets and remove doublets. *

      *Step 2. FSC singlets were gated for the 7-AAD negative fraction to isolate viable cells. *

      *Step 3. Viable cells were gated to isolate EYFP-positive NCCs or EYFP-negative non-NCCs. *

      Sorted cells were freshly processed or cryopreserved for the following procedure.

      (P28, L21–30)

      * - Although the authors isolated nuclei for both scRNA-seq and ATAC-seq, the number of cardiomyocytes within the EYFP-negative population is unexpectedly low. The authors should clarify potential technical or biological reasons for this underrepresentation (e.g., nuclei isolation efficiency, sorting strategy, filtering criteria, or developmental stage-specific composition). *


      The unexpectedly low proportion of cardiomyocytes within the EYFP-negative population likely reflects the restricted sampling region used in this study. Specifically, we dissected cardiopharyngeal tissue and the outflow tract region rather than the whole heart as shown in Figure 1a, which likely introduced a sampling bias that reduced the representation of cardiomyocytes in the dataset.

      Although the FACS gating strategy could potentially influence the recovery of specific cell types, we consider this possibility unlikely because cardiomyocyte populations were successfully detected in our dataset. In addition, during the quality-control process for scRNA-seq data, we applied a mitochondrial gene threshold of 25% to exclude low-quality cells. While mature cardiomyocytes typically exhibit high mitochondrial gene expression, embryonic cardiomyocytes at the analyzed developmental stages are immature and therefore were unlikely to be disproportionately excluded by this criterion. Consistent with this interpretation, distinct cardiomyocyte clusters remained detectable after filtering, indicating that cardiomyocytes were retained through the quality-control process.
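The mitochondrial filter described above can be illustrated with a minimal sketch. It assumes per-cell raw counts held in plain dictionaries and mouse mitochondrial genes listed in an explicit set; the function names, the toy counts, and the inclusive `<=` comparison are assumptions made for illustration and are not taken from the authors' pipeline. Only the 25% threshold comes from the text.

```python
def mito_fraction(counts, mito_genes):
    """Fraction of a cell's total counts that come from mitochondrial genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mito = sum(n for gene, n in counts.items() if gene in mito_genes)
    return mito / total


def filter_cells(cells, mito_genes, max_mito_fraction=0.25):
    """Keep cells whose mitochondrial fraction is at or below the threshold."""
    return {
        cell_id: counts
        for cell_id, counts in cells.items()
        if mito_fraction(counts, mito_genes) <= max_mito_fraction
    }


# Hypothetical toy counts (not real data).
MITO_GENES = {"mt-Co1", "mt-Nd1"}
cells = {
    "immature_cm": {"Tnnt2": 60, "Myh6": 25, "mt-Co1": 15},                   # 15% mitochondrial
    "mature_like_cm": {"Tnnt2": 40, "Myh6": 20, "mt-Co1": 30, "mt-Nd1": 10},  # 40% mitochondrial
    "mesenchyme": {"Postn": 90, "mt-Nd1": 10},                                # 10% mitochondrial
}

kept = filter_cells(cells, MITO_GENES)
print(sorted(kept))
```

In this toy case the cell with a mature-like mitochondrial load is removed while the immature cardiomyocyte passes, which mirrors the argument that embryonic cardiomyocytes are unlikely to be lost disproportionately at this threshold.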

      Based on these considerations, we conclude that the low abundance of EYFP-negative cardiomyocytes primarily reflects the limited anatomical region sampled. We have clarified this point in the revised Results section as follows:

      The overall proportion of cardiomyocytes was low, likely reflecting the restricted sampling of the cardiac outflow tract region (Figure 1a).

      (P6, L23–25)

      * - In figure 1, the authors present results from unsupervised clustering of 9,420 cells into 21 distinct clusters, many of which are broadly labeled as "mesenchymal cells". The authors should refine this nomenclature by providing more specific annotations or defining criteria, as this broad classification limits interpretability of the identified subpopulations. *


      In the revised manuscript, we further refined the clustering analysis by subdividing the original clusters C5 and C13 into two subclusters each, resulting in a total of 23 clusters in the UMAP shown in Figure 1d. This refinement improved the resolution and interpretability of the identified cell populations. In addition, we replaced the broad “mesenchymal cells” annotation with more specific classifications, including pharyngeal mesenchymal cells, intracardiac mesenchymal cells, and smooth muscle–like cells. To further clarify lineage relationships, we also provided UMAPs separately displaying NCC and non-NCC populations in Figure 1e, as suggested by the reviewer.

      *- To integrate spatially annotated Xenium datasets with scRNA-seq data, the authors used Tangram, enabling estimation of the spatial distribution of EYFP-positive NCCs within the pharyngeal region of E11.5 and E12.5 embryos. However, the E11.5 section shows a relatively low number of EYFP-positive cells (Figure 2f). The authors should clarify whether this reflects technical limitations (e.g., probe design, segmentation efficiency, or integration parameters) or biological factors and explain how this affects interpretation of the spatial analyses. *

      *- The authors used the RCTD tool to decompose the scRNA-seq dataset into NCC and non-NCC components and mapped these onto the Xenium dataset. However, panels j and l in Figure 2 show low signal in the E11.5 sections. The authors should clarify whether this reflects technical limitations of the RCTD deconvolution, differences in sampling, or biological factors, and discuss how this result impacts interpretation of the spatial mapping results at this stage. *


      In the original manuscript, estimated EYFP expression levels were visualized using a minimum cutoff of 0.1, with the remaining values mapped onto a 100-step color scale. However, this approach resulted in apparently weak signal intensity in the E11.5 sections because high-level noisy signals, including signals detected in the atrial lumen and outside the embryo, broadened the dynamic range of the visualization. In the revised manuscript, we applied an upper cutoff at the 90th percentile to reduce the influence of these noisy signals, resulting in improved visualization of EYFP-positive regions in Figure 2f, with signal intensity now comparable to that observed in Figure 2g. In addition, we further improved the spatial visualization of putative EYFP-positive cells by replacing centroid-based signal display with segmentation-based rendering in the revised Figure 2f, g. Importantly, these revisions affected only the visualization method and did not alter the underlying analyses or conclusions, as the integration and downstream analyses were performed using the original quantitative data.
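The effect of the upper cutoff on the color scale can be sketched as follows. This is an illustration under stated assumptions, not the authors' plotting code: the helper names `percentile` and `color_bins`, the linear-interpolation percentile, the choice to compute the percentile over displayed values only, and the toy values are all hypothetical. The 0.1 lower cutoff, the 100-step scale, and the 90th-percentile upper cutoff are the figures given in the text.

```python
def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list (q between 0 and 100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def color_bins(values, vmin=0.1, upper_q=None, n_steps=100):
    """Map each value to a color index in 0..n_steps-1.

    Values below vmin map to None (not drawn). If upper_q is given, values
    above that percentile of the displayed values are clipped to it.
    """
    shown = [v for v in values if v >= vmin]
    if not shown:
        return [None] * len(values)
    vmax = max(shown) if upper_q is None else percentile(shown, upper_q)
    span = vmax - vmin
    bins = []
    for v in values:
        if v < vmin:
            bins.append(None)
            continue
        frac = (min(v, vmax) - vmin) / span if span > 0 else 0.0
        bins.append(min(int(frac * n_steps), n_steps - 1))
    return bins


# Hypothetical toy data: one sub-threshold spot, 20 tissue spots, one noisy outlier.
tissue = [0.2 + 0.04 * i for i in range(20)]
values = [0.05] + tissue + [50.0]

before = color_bins(values)             # the outlier stretches the scale
after = color_bins(values, upper_q=90)  # the outlier is clipped at the 90th percentile
print(max(before[1:-1]), max(after[1:-1]))
```

Without the upper cutoff, every tissue spot in the toy data falls into the lowest two color steps; with it, the same spots spread across the scale. The underlying values are untouched in both cases, which is the sense in which only the visualization changes.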

       We also clarified the procedure used for NCC estimation and spatial mapping in the Methods section as follows:
      

      Putative NCCs were identified through the following procedure:

      Step 1. EYFP expression was estimated across 39 cell types in the Xenium dataset by integration with the single-cell multiome dataset (including both NCCs and non-NCCs) using Tangram.

      Step 2. Xenium clusters enriched for EYFP expression, defined as clusters whose mean estimated EYFP level exceeded the threshold corresponding to the 65th percentile across all spots, and consistent with known neural crest derivatives were extracted as putative NCC populations.

      Step 3. For each spot within these EYFP-enriched Xenium clusters, RCTD was used to estimate the corresponding multiome cluster identity.

      (P33, L7–16).
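Step 2 of the procedure above can be sketched in a few lines. The sketch assumes the Tangram-estimated EYFP level is already available per spot, and it models the "consistent with known neural crest derivatives" criterion as an explicit, manually curated set; the function names, cluster labels, and values are hypothetical, and Tangram and RCTD themselves are not invoked.

```python
def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list (q between 0 and 100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def enriched_clusters(spots, q=65, known_derivatives=None):
    """Return clusters whose mean estimated EYFP exceeds the q-th percentile over all spots.

    spots: list of (cluster_label, estimated_eyfp) pairs.
    known_derivatives: optional curated set; if given, other clusters are dropped.
    """
    threshold = percentile([eyfp for _, eyfp in spots], q)
    totals = {}
    for cluster, eyfp in spots:
        total, n = totals.get(cluster, (0.0, 0))
        totals[cluster] = (total + eyfp, n + 1)
    enriched = [c for c, (total, n) in totals.items() if total / n > threshold]
    if known_derivatives is not None:
        enriched = [c for c in enriched if c in known_derivatives]
    return sorted(enriched)


# Hypothetical toy spots (cluster label, estimated EYFP level).
toy = {
    "cardiac_muscle": [0.1, 0.2, 0.1, 0.0, 0.1, 0.2],
    "endothelium": [0.2, 0.1, 0.3, 0.1, 0.2, 0.1],
    "pharyngeal_mesenchyme": [0.8, 0.9, 0.7],
    "oft_cushion_mesenchyme": [0.6, 0.7, 0.5],
    "lumen_artifact": [0.9, 0.95],
}
spots = [(cluster, v) for cluster, vals in toy.items() for v in vals]

all_enriched = enriched_clusters(spots)
curated = enriched_clusters(
    spots,
    known_derivatives={"pharyngeal_mesenchyme", "oft_cushion_mesenchyme"},
)
print(all_enriched, curated)
```

The toy "lumen_artifact" cluster passes the percentile test but is removed by the curated set, which illustrates why the expression threshold alone would not be sufficient.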

      *- The authors integrated their data with publicly available scRNA-seq datasets of NCCs from E8.5 to P7 hearts and present results from unsupervised clustering of 67,208 cells into 28 distinct clusters. Figures 3a and 3b show that a cardiomyocyte cluster (C26) is included among the NCC derivatives. The authors should clarify whether this reflects a technical issue during FACS. *


      In our original datasets in Figure 1d, only a small number of NCCs were detected within the cardiomyocyte cluster (corresponding to C26 in Figure 3b), which was otherwise predominantly composed of non-NCCs. Cardiomyocytes assigned to C26 were also present in both publicly available scRNA-seq datasets included in the integrated analysis. Previous studies have reported that a limited subset of NCCs can differentiate into cardiomyocytes (Tomita, Y. et al., J. Cell Biol. 170:1135–1146, 2005; Tamura, Y. et al., Arterioscler. Thromb. Vasc. Biol. 31:582–589, 2011). Therefore, we consider that C26 likely represents a small population of NCC-derived cardiomyocytes rather than contamination caused by technical issues during FACS isolation, although the low cell number precluded further characterization.

       To clarify this point, we added the following statement to the Results section:
      

      *Only a few NCCs were detected within cardiomyocyte clusters, which were predominantly composed of non-NCCs, consistent with previous reports demonstrating NCC differentiation into cardiomyocytes15,16. *

      (P6, L20–23)

      * - The authors used RNA-velocity to infer relationship among the identified clusters. However, this analysis requires particular caution given that data were generated from multiple datasets obtained under different conditions. Several conclusions drawn from the RNA-velocity analysis are not convincing, as illustrated in Figures 3c and 3d, where the inferred velocity directions appear inconsistent with the proposed developmental model (e.g., trajectory from cluster 23 toward 18 or from 4 toward 6). The authors should clarify these discrepancies, justify the integration of heterogenous datasets and reassess the interpretation of the inferred lineage relationships. *

      By default, the integration workflow in Seurat, which is widely used for scRNA-seq analysis, employs canonical correlation analysis (CCA). CCA effectively corrects batch effects across datasets generated on different experimental platforms. However, it can overcorrect by forcing the alignment of cell populations that are not shared among datasets (Andreatta, Bioinformatics, 2021). To minimize overcorrection across datasets obtained under different experimental conditions, we applied the reciprocal principal component analysis (RPCA) method, which is recommended for comparative integration of heterogeneous scRNA-seq datasets (Luecken et al., Nature Methods, 2021). This choice is suitable for integrating datasets from independent studies, as in the present study.

      To infer relationships among the identified clusters, we then applied RNA velocity analysis, which is based on splicing dynamics. The original framework assumes that transcriptional induction and repression persist long enough for cells to reach active (transcribing) or inactive (silenced) steady-state equilibrium. However, this assumption is often violated during cell differentiation, where transient cell populations frequently exhibit rapidly changing mRNA levels that do not reach steady-state equilibrium. To address this limitation, the scVelo method was developed (Bergen et al., 2020, *Nature Biotechnology*), and we applied this approach to the integrated NCC datasets in the present study. This analysis successfully inferred directional flows from the pharyngeal mesenchyme toward SMCs and intracardiac mesenchyme through transitional states (Figure 3c,d). However, as the reviewer correctly pointed out, the analysis predicted a directional flow from C23 to C18, apparently opposite to the biological directionality supported by previous findings (see P9, L15–19).
      
       We consider this discrepancy to reflect intrinsic limitations of RNA velocity analysis. As discussed by Bergen et al., in systems containing multiple lineages and cellular processes, differences in gene regulatory networks among heterogeneous cell states can generate multiple trajectories in phase space owing to distinct splicing kinetics. In addition, incompletely captured splicing kinetics may represent only a limited portion of the overall dynamics, particularly near terminal differentiation states. In such cases, phase portraits of unspliced versus spliced transcripts may appear nearly linear rather than curved, potentially leading to erroneous assignment of positive or negative RNA velocities. Consistent with this limitation, we obtained opposite directionalities between C18 and C23 depending on whether steady-state or dynamical models were applied and according to different parameter settings. Through these repeated computational re-evaluations of lineage directionality, we concluded that RNA velocity analysis is suitable for capturing the global landscape of differentiation flow, but that accurate inference of local lineage directionality may require careful model selection and parameter optimization to ensure consistency with established biological evidence.
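The sensitivity of the velocity sign to model fitting when the phase portrait is nearly linear can be illustrated with a deliberately simplified sketch. It uses the steady-state relation in which the velocity of a gene in a cell is u − γ·s (unspliced minus the degradation-to-splicing ratio times spliced), with γ fitted here by ordinary least squares through the origin. This is a simplification for illustration: it is not scVelo's implementation, the function names and toy counts are hypothetical, and real tools fit γ differently (for example, on extreme quantiles) and offer a separate dynamical model.

```python
def fit_gamma(unspliced, spliced):
    """Least-squares slope through the origin for u ~ gamma * s."""
    num = sum(u * s for u, s in zip(unspliced, spliced))
    den = sum(s * s for s in spliced)
    return num / den


def velocities(unspliced, spliced, gamma):
    """Steady-state velocity per cell: v = u - gamma * s."""
    return [u - gamma * s for u, s in zip(unspliced, spliced)]


# Hypothetical near-linear phase portrait for one gene across four cells.
spliced = [1.0, 2.0, 3.0, 4.0]
unspliced = [0.502, 0.998, 1.503, 1.997]

gamma_hat = fit_gamma(unspliced, spliced)            # close to 0.5
v_low = velocities(unspliced, spliced, gamma=0.49)   # slightly smaller slope
v_high = velocities(unspliced, spliced, gamma=0.51)  # slightly larger slope
print(round(gamma_hat, 4), v_low, v_high)
```

A change of about 2% in the fitted slope flips the sign of every velocity in this toy gene, which is one way to see how different models or parameter settings could yield opposite directionalities between two closely related clusters.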
      
       To avoid arbitrariness and potential confusion, we removed the arrow between C18 and C23 from the revised Figure 3d. Instead, we now describe the observed continuity between these populations in the Results and Discussion sections as follows:
      

      Notably, the UMAP revealed a continuum between C23 and C18 within the intracardiac mesenchyme population. Given previous findings that the proximal coronary artery SMCs originate from preotic NCCs8 and that pericytes give rise to coronary artery SMCs23, this connection likely represents a differentiation trajectory from intracardiac mesenchyme to coronary artery SMCs via a pericyte-like intermediate stage.

      (P9, L15–19)

      The continuity between intracardiac mesenchyme and coronary artery SMCs through a pericyte-like intermediate state is consistent with previous developmental studies showing that proximal coronary artery SMCs originate from preotic NCCs and may arise through pericyte intermediates8,21.

      (P18, L24-27)

      *- The authors should provide more detail on how they identified bifurcation points and more clearly explain the transition from intracardiac mesenchyme to smooth muscle cells (SMC). Additionally, they should clarify what distinguishes the three clusters (C4, C23, C27) in terms of transcription programs, marker expression, or functional states, to better support their proposed differentiation trajectories. *

      To clarify the distinctions among the three SMC clusters (C4, C23, C27), we added a heatmap showing differentially expressed genes, violin plots for the mature SMC marker Myh11, and feature plots with immunostaining images for Myh11, Sost, and Reln expression (Figure S4). These additional analyses further validate the molecular and spatial characteristics of the three SMC clusters. Based on these data, we revised the relevant section of the Results as follows:

      The SMC clusters, which were continuous with the pharyngeal mesenchyme via transitional populations in the UMAP, were identified by high expression of the mature SMC marker Myh11 (Figure 3i). Differential gene expression analysis further distinguished individual clusters (Figure S4a-g). Among these, C27 displayed a transcriptomic profile characteristic of the great artery SMCs, including high expression of Sost (Figure S3j). C4 was enriched for Tfap2b and Ptger4 (Figure S3j), markers of the ductus arteriosus SMCs21,22, supporting its annotation. C0 and C7 likely represent transitional states between pharyngeal mesenchyme and differentiated lineages, potentially bifurcating toward great artery SMCs or cardiac cushion mesenchyme (Figure S3a and Table S6). C23 was characterized by high expression of Gja4, a marker of coronary artery SMCs, along with pericyte markers Kcnj8 and Rgs5 (Figure 3j and Figure S3k), corresponding to the cluster similarly annotated by Chen et al14. In addition, C23 was also distinguished from C4 and C27 by its expression of Reln (Figure S4d).

      Immunostaining supported these cluster annotations. Sost expression was observed in great artery SMCs but not in ductus arteriosus and coronary artery SMCs, whereas Myh11 expression was higher in ductus arteriosus and coronary artery SMCs than in aortic SMCs (Figure S4f-m). Furthermore, Reln expression was restricted to coronary artery SMCs (Figure S4n-s).

      (P8, L30 – P9, L14)

      To further clarify the identification of bifurcation points and the transition from intracardiac mesenchyme to SMCs, we additionally stratified the integrated UMAP according to distinct Hox expression profiles and inferred lineage trajectories corresponding to different neural crest and pharyngeal arch origins (Figure S9). Based on the inferred differentiation trajectory from C18 to C23 (P9, L15–19), together with the identification of C2 and C10 as Sox9high and Scxhigh intracardiac cushion mesenchymal populations contributing to coronary artery SMCs (P17, L27–33), we incorporated these lineage relationships into the schematic model presented in Figure 9.

      Minor comments:

      *- The authors convincingly demonstrate a switch in Meis-binding motifs across NCC populations, supporting a model in which cardiac cushion-associated NCCs transition from Hox-dependent to Hox-independent transcriptional regulation via alternative cofactor interactions and DNA-binding preferences. However, the authors should provide evidence on whether GATA motifs are enriched within Meis peaks, as this could further clarify potential cooperative interactions during this transition. *

      Although GATA-binding motifs were enriched within Meis-associated open chromatin regions in intracardiac NCCs compared with many other motifs, a substantial proportion of GATA motifs were located in peaks distinct from those containing Meis motifs. This observation raises the possibility that GATA and Meis transcription factors may cooperate through interactions across separate regulatory elements to modulate enhancer activity. However, we did not directly investigate this possibility in the present study. Instead, we found that the Meis-associated peaks identified in intracardiac NCCs, including the distal Sox9 enhancer containing a Meis2-binding site (chr11:112850240–112851186), more prominently contained several other consensus motifs, including predicted Hand2- and Nfatc1-binding sites. To our knowledge, however, these transcription factors have not previously been described as non-Hox cofactors of Meis proteins. Their potential recruitment to this enhancer, as well as possible cooperative interactions with Meis transcription factors during intracardiac NCC differentiation, was not examined in the current study and remains an important subject for future investigation.

      * - In Figure 5 panels g, j and k are difficult to interpret. The authors should provide clearer annotations, labeling, or additional explanations to improve readability and facilitate understanding of the data. *


      We added annotations corresponding to Figure 4a to the UMAPs in Figure 6h and 6i, and included color bars in Figure 6i and 6k. To further improve readability and facilitate understanding of the data, we added an explanation of the perturbation scores to the legends for Figure 6i and k.

      (original)

      *(g) Pseudotime trajectory analysis of integrated NCC clusters inferred using CellOracle. *

      *(h-k) Sox9 (h, i) and Osr1 (j, k) knockout simulation presented as altered differentiation vector flows (h, j) and perturbation scores which was inner product of perturb simulation (i, k). *


      (revised)

      *(g) Pseudotime developmental flow of integrated NCC clusters from the neural tube, inferred using CellOracle and projected onto the UMAP space shown in Figure 4a. *

      *(h-k) Sox9 (h, i) and Osr1 (j, k) knockout simulations presented as altered differentiation vector flows (h, j) and perturbation scores, defined as the inner product between the simulated perturbation vectors and the original developmental flow (i, k). Green and magenta color bars indicate normal developmental flow and reverse flow induced by perturbation of the indicated genes, respectively. *

      (P41, L3–L9)
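The perturbation score defined in the revised legend, an inner product between the simulated perturbation vector and the original developmental flow vector, can be sketched as follows. The sketch assumes two-dimensional vectors at matched positions in the UMAP space; the function names and toy vectors are hypothetical, and this is not CellOracle's API.

```python
def perturbation_scores(dev_flow, perturb_flow):
    """Inner product of developmental-flow and perturbation vectors at each position."""
    return [dx * px + dy * py for (dx, dy), (px, py) in zip(dev_flow, perturb_flow)]


def describe(score, tol=1e-12):
    """Positive: perturbation follows the normal flow; negative: it opposes it."""
    if score > tol:
        return "along normal flow"
    if score < -tol:
        return "reverse flow"
    return "neutral"


# Hypothetical toy vectors at three positions.
dev_flow = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
perturb_flow = [(0.5, 0.0), (0.0, -2.0), (1.0, -1.0)]

scores = perturbation_scores(dev_flow, perturb_flow)
labels = [describe(s) for s in scores]
print(scores, labels)
```

In terms of the legend, positive scores correspond to the green end of the color bar and negative scores to the magenta end; an orthogonal perturbation contributes a score of zero.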

      *- In Figure 6, results support the role of hexameric Meis-binding motif-containing region as a distal enhancer of Sox9. The authors should provide additional results from a ChIP-qPCR experiment to further validate this model. *


      We attempted ChIP-seq experiments on the O7-1 neural crest cell line using two different anti-Meis antibodies. However, we were unable to detect specific binding of Meis proteins to this enhancer region, although the luciferase assays clearly demonstrated enhancer activity that was significantly attenuated by deletion of the Meis-binding motif. This discrepancy may reflect differences between endogenous chromatin and plasmid-based reporter contexts, including epigenetic modifications and chromatin accessibility. We are now investigating experimental conditions that would allow direct verification of endogenous Meis binding to this region.

      * - Panel l in Figure S3 requires better annotation. *


      We added annotations including the aorta, pulmonary valve, left coronary artery, and its septal branch.

      * - Correct the typo errors in Figure 5a. *


      The typographical errors “consercvation” and “Visuzalization” were corrected to “conservation” and “Visualization”, respectively.

      * - The authors should refer to previous studies showing the role of Hoxa1 and Hoxb1 in the development of great arteries or semilunar valves. *

      We have added the following discussion on the roles of anterior Hox genes in cardiovascular development, together with appropriate references:

      The regional identities of pharyngeal NCCs that contribute to cardiac development are established by Hox genes and their associated regulatory networks. Genetic studies have demonstrated essential roles for the anterior Hox genes in patterning the pharyngeal arch artery system and semilunar valve structures. Loss of Hoxa1 and Hoxb1 results in severe defects in pharyngeal arch artery development42, whereas ectopic or sustained expression of Hoxb1 in NCCs disrupts cardiovascular morphogenesis and causes malformations of the great arteries and semilunar valves43. Likewise, Hoxa3 contributes to proper patterning of the pharyngeal arch region and its NCC derivatives44,45. These findings underscore the importance of precise spatial and temporal regulation of Hox genes during cardiovascular development. Among pharyngeal NCCs contributing to cardiac development, cushion-independent NCC derivatives (great artery SMCs and the aorticopulmonary septum) retain their origin-specific Hox codes. In contrast, cushion-associated NCC derivatives (coronary artery SMCs and valvular/subvalvular interstitial cells) downregulate Hox expression and transition toward region-specific GRNs involving TFs such as Tbx20 and Gata4, whose expression is known to be induced by BMP signaling in cardiomyocytes46,47. Bmp2 and Bmp4 are expressed in the regions of the pericardial reflection traversed by NCCs en route to the cardiac cushion48. Together, these observations suggest that appropriate repression of Hox programs, coupled with activation of cardiac-specific regulatory networks, is required for normal differentiation of cushion-associated NCC derivatives.

      (P18, L30 – P19, L16)

      • Roux, M. et al. Hoxa1 and Hoxb1 are required for pharyngeal arch artery development. Mech. Dev. 143, 1–8 (2017).
      • Zaffran, S., Odelin, G., Stefanovic, S., Lescroart, F. & Etchevers, H. C. Ectopic expression of Hoxb1 induces cardiac and craniofacial malformations. genesis 56, (2018).
      • Chisaka, O. & Capecchi, M. R. Regionally restricted developmental defects resulting from targeted disruption of the mouse homeobox gene hox-1.5. Nature 350, 473–479 (1991).
      • Kameda, Y., Watari-Goshima, N., Nishimaki, T. & Chisaka, O. Disruption of the Hoxa3 homeobox gene results in anomalies of the carotid artery system and the arterial baroreceptors. Cell Tissue Res. 311, 343–352 (2003).

      **Referees cross-commenting**

      *Having read the comments of the other reviewers, I totally agree with them. All our comments converge and should allow the authors to improve their manuscript. *

      Reviewer #3 (Significance (Required)):

      *The study provides high-resolution spatial and temporal mapping of NCC-derived populations and proposes mechanistic insights into Hox-dependent versus Hox-independent transcriptional regulation, as well as a Meis2-Sox9-Scx gene regulatory network contributing to smooth muscle and semilunar valve formation.

      Strengths and limitations: The datasets are rich and well-integrated, offering valuable resources for the field. However, several key conclusions rely on correlative analyses and heterogeneous datasets, making some claims speculative. Technical details, such as FACS gating, low representation of cardiomyocytes, and interpretation of RNA velocity, require further clarification, which currently limits the strength of the mechanistic inferences.

      Advance: This work advances the understanding of NCC lineage diversification and gene regulatory dynamics in cardiopharyngeal development, particularly highlighting potential transcriptional switches and intermediate progenitor states that guide structural formation in the heart.

      Audience: The study will be of interest to researchers in developmental biology, cardiovascular biology, and single-cell multi-omics, particularly those studying neural crest cell differentiation and cardiac morphogenesis.*

      Again, thank you for giving us the opportunity to strengthen our manuscript with your valuable comments and queries. We have worked hard to incorporate your feedback and hope that these revisions persuade and satisfy you.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Iwase et al. present a comprehensive multiomics analysis of cardiac neural crest cell (NCC) differentiation during cardiopharyngeal development. Using Wnt1Cre;R26R-EYFP mice, the authors isolated NCCs and non-NCCs at E11.5, E12.5, E14.5 and E17.5 and performed single-cell RNA-seq, ATAC-seq, and spatial transcriptomics analyses. Spatial resolution of NCC-derived populations was achieved using the Xenium (E11.5-E12.5) and Visium (E14.5-E17.5) platforms. Integration of single-cell and spatial datasets identified distinct NCC-derived populations with defined spatial organization within the pharyngeal and intracardiac regions. The study concludes that Hox gene patterning underlies NCC subpopulation identity during cardiopharyngeal development and reveals a developmental transition from Hox-dependent to Hox-independent transcriptional regulation. Furthermore, the authors identify a Meis2-Sox9-Scx gene regulatory network making a skeletogenic progenitor-like intermediate that contributes to coronary smooth muscle and semilunar valve formation.

      While the dataset is comprehensive and technically strong, several key conclusions are not always convincingly supported by enough data. As a result, some claims appear speculative and would benefit from additional experimental validation to strengthen the proposed developmental models.

      I would strongly encourage authors to consider the following points to provide additional details that will strengthen their study:

      Major comments:

      • The authors should provide detailed FACS gating strategies and sorting conditions used to selectively isolate EYFP-positive and EYFP-negative populations, including representative plots and information on exclusion criteria (e.g., doublets, dead cells).
      • Although the authors isolated nuclei for both scRNA-seq and ATAC-seq, the number of cardiomyocytes within the EYFP-negative population is unexpectedly low. The authors should clarify potential technical or biological reasons for this underrepresentation (e.g., nuclei isolation efficiency, sorting strategy, filtering criteria, or developmental stage-specific composition).
      • In Figure 1, the authors present results from unsupervised clustering of 9,420 cells into 21 distinct clusters, many of which are broadly labeled as "mesenchymal cells". The authors should refine this nomenclature by providing more specific annotations or defining criteria, as this broad classification limits interpretability of the identified subpopulations.
      • To integrate spatially annotated Xenium datasets with scRNA-seq data, the authors used Tangram, enabling estimation of the spatial distribution of EYFP-positive NCCs within the pharyngeal region of E11.5 and E12.5 embryos. However, the E11.5 section shows a relatively low number of EYFP-positive cells (Figure 2f). The authors should clarify whether this reflects technical limitations (e.g., probe design, segmentation efficiency, or integration parameters) or biological factors and explain how this affects interpretation of the spatial analyses.
      • The authors used the RCTD tool to decompose the scRNA-seq dataset into NCC and non-NCC components and mapped these onto the Xenium dataset. However, panels j and l in Figure 2 show low signal in the E11.5 sections. The authors should clarify whether this reflects technical limitations of the RCTD deconvolution, differences in sampling, or biological factors, and discuss how this result impacts the interpretation of the spatial mapping at this stage.
      • The authors integrated their data with publicly available scRNA-seq datasets of NCCs from E8.5 to P7 hearts and present results from unsupervised clustering of 67,208 cells into 28 distinct clusters. Figures 3a and 3b show that a cardiomyocyte cluster (C26) is included among the NCC derivatives. The authors should clarify whether this reflects a technical issue during FACS.
      • The authors used RNA velocity to infer relationships among the identified clusters. However, this analysis requires particular caution given that the data were generated from multiple datasets obtained under different conditions. Several conclusions drawn from the RNA velocity analysis are not convincing, as illustrated in Figures 3c and 3d, where the inferred velocity directions appear inconsistent with the proposed developmental model (e.g., trajectory from cluster 23 toward 18 or from 4 toward 6). The authors should clarify these discrepancies, justify the integration of heterogeneous datasets and reassess the interpretation of the inferred lineage relationships.
      • The authors should provide more detail on how they identified bifurcation points and more clearly explain the transition from intracardiac mesenchyme to smooth muscle cells (SMC). Additionally, they should clarify what distinguishes the three clusters (C4, C23, C27) in terms of transcription programs, marker expression, or functional states, to better support their proposed differentiation trajectories.

      Minor comments:

      • The authors convincingly demonstrate a switch in Meis-binding motifs across NCC populations, supporting a model in which cardiac cushion-associated NCCs transition from Hox-dependent to Hox-independent transcriptional regulation via alternative cofactor interactions and DNA-binding preferences. However, the authors should provide evidence on whether GATA motifs are enriched within Meis peaks, as this could further clarify potential cooperative interactions during this transition.
      • In Figure 5, panels g, j and k are difficult to interpret. The authors should provide clearer annotations, labeling, or additional explanations to improve readability and facilitate understanding of the data.
      • In Figure 6, the results support the role of the hexameric Meis-binding motif-containing region as a distal enhancer of Sox9. The authors should provide additional results from a ChIP-qPCR experiment to further validate this model.
      • Panel l in Figure S3 requires better annotation.
      • Correct the typographical errors in Figure 5a.
      • The authors should refer to previous studies showing the role of Hoxa1 and Hoxb1 in the development of great arteries or semilunar valves.

      Referees cross-commenting

      Having read the comments of the other reviewers, I totally agree with them. All our comments converge and should allow the authors to improve their manuscript.

      Significance

      The study provides high-resolution spatial and temporal mapping of NCC-derived populations and proposes mechanistic insights into Hox-dependent versus Hox-independent transcriptional regulation, as well as a Meis2-Sox9-Scx gene regulatory network contributing to smooth muscle and semilunar valve formation.

      Strengths and limitations: The datasets are rich and well-integrated, offering valuable resources for the field. However, several key conclusions rely on correlative analyses and heterogeneous datasets, making some claims speculative. Technical details, such as FACS gating, low representation of cardiomyocytes, and interpretation of RNA velocity, require further clarification, which currently limits the strength of the mechanistic inferences.

      Advance: This work advances the understanding of NCC lineage diversification and gene regulatory dynamics in cardiopharyngeal development, particularly highlighting potential transcriptional switches and intermediate progenitor states that guide structural formation in the heart.

      Audience: The study will be of interest to researchers in developmental biology, cardiovascular biology, and single-cell multi-omics, particularly those studying neural crest cell differentiation and cardiac morphogenesis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, Iwase et al. cleverly make use of different modalities, spatial transcriptomics and single-cell omics datasets, in conjunction with a well-established Wnt1-Cre;R26R-EYFP line to trace neural crest cells (NCCs) contributing to the cardiovascular system during embryonic development in the mouse. By doing so, the authors identified a bifurcation between cardiac NCCs contributing to the OFT cushions and those forming the aorticopulmonary (AP) septation complex. Thus, the authors split "intracardiac NCCs" into two different NCC programs/compartments, even though both reside in the broad OFT region. The NCCs that enter and associate with the OFT cushions undergo a Hox-off transition (Hox-positive to Hox-negative once intracardiac), with a corresponding shift in Meis binding and GRN wiring. The authors propose these cells pass through a Meis2-Sox9-Scx "skeletogenic progenitor-like" intermediate and contribute to semilunar valves and coronary artery smooth muscle. By contrast, the NCCs assigned to aorticopulmonary septum (APS) formation and great vessel smooth muscle retain distinct Hox codes.

      Major Comments

      1. The manuscript would benefit from clearer delineation between the different NCC contributions, particularly for non-specialist readers. The distinction between (i) CNCCs in OFT cushions and (ii) CNCCs forming the aorticopulmonary (AP) septation complex is not adequately explained. While both populations contribute to OFT septation, according to the authors, they represent distinct compartments with different developmental trajectories. The authors could clarify this using anatomically labelled hearts at the stages they conduct their analysis, along with additional text and schematics explaining what is meant by each compartment. This would greatly enhance the accessibility of the manuscript.
      2. The Methods state that "Decomposition of cell clusters of scRNA-seq was performed by RCTD to map them onto the Xenium dataset," but this description is insufficient. The authors should clarify whether RCTD was applied separately for each developmental stage (i.e., E11.5 scRNA-seq reference for E11.5 Xenium, E12.5 reference for E12.5 Xenium), or whether a pooled reference was used across stages. This clarification is important because RCTD performance depends critically on the correspondence between reference and target datasets. Using a multi-stage integrated reference to deconvolve stage-specific spatial data could introduce artifacts, as cell states and cluster compositions vary considerably across developmental timepoints.
      3. Figure 2j-m needs annotations and schematics. It is currently very difficult to identify the different compartments. See Figure C in Chen et al. for an example of this approach. In addition, what are the fine clusters from 1 to 20? Which ones are NCC-derived?
      4. The panels show pharyngeal markers, OFT/intracardiac markers, and SMC markers in Figures 2n-y, but could the authors show the proportion of NCC-derived (YFP+) cells for each cluster? Could the authors also map only the YFP+ cells on the Xenium data? It would be useful to see the proportion of YFP-positive (NCC-derived) cells for each delineated compartment. YFP-positive cells appear to exist at the boundary between LV and RV in the septum; this observation would benefit from proper quantification.
      5. Figure 3 is confusing because it integrates data from multiple overlapping stages (E8.5 to P7). While the authors identify distinct compartments, pharyngeal mesenchyme, intracardiac mesenchyme, and SMCs, it is unclear why stages beyond E14.5 and E17.5 (corresponding to the initial single-cell omics and Visium analysis) were included. Although leveraging additional datasets is a clever approach, the integration of data from such disparate developmental timepoints confounds interpretation. For example, Cluster 6 appears to include cells from both P7 and E12.5 stages. Given that the Visium data represent hearts at E14.5 and E17.5, it is problematic to map clusters derived from other stages onto these spatial datasets.
      6. Throughout the manuscript, the authors describe "lineage relationships" between cell populations, but these are in fact developmental trajectories inferred computationally (via UMAP connectivity and RNA velocity), not true lineage relationships. This distinction is critical and should be explicitly stated.

      For example, the statement "we propose that intracardiac NCCs within C10 and C22 differentiate via C2 into valvular (C21) and subvalvular (C18) interstitial cells" should be framed as a computational inference, not an established lineage relationship. Without clonal lineage tracing data, these claims cannot be verified.

      Similarly, the claim that "NCCs contributing to the AP septum are distinct from other intracardiac NCCs in that they do not populate the cardiac cushions but remain continuous with NCCs populating the distal outflow tract cushion" lacks direct lineage evidence. What experimental data support this assertion?

      In addition, the authors state that what De Bono et al. identified as "outflow smooth muscle" corresponds in their dataset to early intracardiac mesenchymal clusters C2, C10, and C22, present as early as E10.5, when mesenchymal NCC derivatives express immature SMC markers, and that these cells later differentiate into coronary artery SMCs around E14.5 (C23) as well as other non-muscle components. This claim is not verified: whether these cells are indeed the ones differentiating into coronary artery SMCs is based solely on computational inference from C2, C10, C22 to C23.

      7. Claims regarding marker expression in specific compartments (for example Hapln1 and Postn in cushions) require additional spatial validation at higher resolution than what is currently provided by the Visium data. Moreover, it is unclear whether these data are single-cell resolution; the authors need to clarify this. HCR staining would be ideal to confirm these expression patterns. Currently, all conclusions are based solely on gene expression without orthogonal spatial confirmation. At minimum, the authors should provide references from the literature supporting these expression patterns.
      8. Could the authors confirm the absence of the Sox9high/Scxhigh population in AP septum descendants?

      Minor Comments

      1. Could the authors better justify their choice of stages (E11.5 to E17.5) for the single-cell multiomic assay? Given that OFT cushions are already populated by NCCs by E10.5 and that AP septum formation is already underway at this stage (see Development (2007) 134(8): 1593-1604), the rationale for beginning at E11.5 should be explicitly stated.
      2. Spelling errors in Figure 2e: "ambious" should be "ambiguous"; "atrium venticle" should be "atrium/ventricle"; "ventricle" is misspelled in several locations.
      3. The clustering resolution is very high, yielding many clusters that are difficult to distinguish based on the colour code alone. What does "ventricle_CL" refer to? What is the "marginal layer"? A clearer legend or supplementary table defining each cluster would be helpful.
      4. Figure 2j: The image is very dim.
      5. Figure 3b: What do the numbers correspond to? Additionally, "mesenchyme" is misspelled.

      Referees cross-commenting

      I also agree with the other reviewers' comments. Many thanks.

      Significance

      The overall approach is sound and the datasets generated are valuable resources for the field. The manuscript presents interesting findings regarding NCC heterogeneity in cardiac development.

      As I understand the authors' model: OFT cushion-associated NCCs enter and associate with the OFT cushions, undergoing a Hox-positive to Hox-negative transition, accompanied by a shift in Meis binding and GRN rewiring. These cells pass through a Meis2-Sox9-Scx "skeletogenic progenitor-like" intermediate state and contribute to semilunar valves and coronary artery smooth muscle. Aorticopulmonary septum NCCs, by contrast, retain distinct Hox codes (Hox4/5 enrichment) rather than becoming Hox-negative, and contribute to the AP septum and great vessel smooth muscle.

      Strengths: The integration of multiple omics modalities with lineage tracing is a powerful approach, and the identification of a Hox-dependent bifurcation in NCC fate is a novel conceptual advance.

      Limitations: The reliance on computational trajectory inference without orthogonal lineage validation, combined with the integration of datasets spanning very different developmental stages, limits the strength of some conclusions. The analysis also requires more precise anatomical annotations to make it accessible to readers and to better visualise the distinct contributions of the cardiac NCCs to the OFT.

      Advance

      The study extends knowledge in the field by providing novel mechanistic insight into neural crest diversification in the context of cardiovascular development. The nature of the advance is primarily mechanistic, identifying a Hox-Meis regulatory switch that distinguishes cushion-associated from cushion-independent NCC lineages.

      Audience

      This work will be of interest to a specialised audience interested in neural crest cells and developmental biologists using omics approaches to address cell fate diversification in complex tissues.

      Reviewer Expertise

      Developmental biology, lineage analysis, mouse genetics. I do not have the expertise to assess the computational methods used in this paper.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Iwase et al have used multiomics and spatial transcriptomics to comprehensively map neural crest cell contributions to the mouse heart and great arteries. This careful and detailed analysis reveals changes in the transcriptional profile of neural crest cells as they give rise to different regions and cell types in the heart and great vessels. The study significantly builds on a number of recent scRNA-seq analyses of neural crest cell development and includes development of a new informatic tool for regulatory network investigation. Among the new findings documented are downregulation of Hox gene expression in intracardiac crest cells and regulation of Sox9 by Meis transcription factors. Addressing the following points would improve clarity and accessibility.

      1. In Figure 1C it is difficult to visualize all the colors given the mixed contribution of NCC and nonNCC cells to mesenchyme. Please also show YFP transcript distribution in NCC versus nonNCC plots. In addition, it would be helpful to show plots for both NCC and nonNCC for Gata4 and Tbx20.
      2. The authors identify a cardiomyocyte cell cluster in their integrated NCC scRNA-seq plots. Are these cells labelled by Wnt1-Cre in the authors' own dataset? Is the trajectory analysis informative as to the steps preceding acquisition of cardiomyocyte fate?
      3. Linked with this point, is it possible that there are nonNCC cells in the integrated plots? Of note, many of the NCC genes overlap with genes that have also been shown to be expressed in mesodermal cardiac progenitors (including Osr1, Pparg, Dlk1, Tcf21, Ebf2, Tbx20, Sox9). For example, is it possible to distinguish NCC derived smooth muscle within the heart from cells originating from the second heart field that may express smooth muscle genes? Cluster 27 for example appears broadly expressed in the region of ventricular outlets in Figure 3. Comparison with YFP transcript distribution may be helpful here.
      4. Can the authors add any validation of key expression patterns, for example using fluorescent in situ hybridization?
      5. Please elaborate on the decoded Hox code patterns that appear to be indicative of arch origins. Do the results allow determination of whether the trajectories to different cardiac fates inferred in Figure 3D differ in different arches?
      6. The authors need to explain why they place an arrow from mesenchymal cluster 18 to 23 in Figure 3D while the trajectory analysis in 3C predicts the opposite direction.
      7. The authors nicely show downregulation of Hox gene expression in NCC cells entering the heart. Can they add discussion of any insights into this from prior studies of loss or gain of Hox gene function?
      8. Figure 3Y could be simplified to more clearly distinguish the two types of Meis binding sites. For example, it may be helpful to reorder the mesenchymal cell types based on Hox expression.
      9. The authors provide nice in vitro and in vivo evidence for an upstream role of Meis transcription factors in regulating Sox9 expression. Can the authors identify from the enhancer sequence (or their transcriptomic dataset) any of the non-Hox transcription factors that Meis may be working with here? Please discuss the significance of Sox9 expression in epicardium driven by the same enhancer. Might this regulation also operate in second heart field progenitor cells where both genes are expressed? It is not evident in Figure 7 that Sox9-EGFP is also expressed in epicardium.
      10. Could this approach yield similar data for Osr1? Please clarify if there is any experimental evidence supporting the predicted negative regulation of Sox9 by Osr1 in the heart illustrated in Figure 8.
      11. Concerning the links between valve mesenchyme and skeletogenic programs, it would be relevant to cite the earlier work of Lincoln and Yutzey (reviewed in PMID: 16643886).
      12. In order to increase accessibility of the dataset, the authors are encouraged to include a browser link.

      Minor points:

      1. The authors could rephrase the title since the term topographical genetic switch is unclear.
      2. In the introduction, with reference to the De Bono study, please note that Tbx1 was shown to regulate pharyngeal NCC differentiation stage transitions non-cell autonomously.

      Significance

      Iwase et al have used multiomics and spatial transcriptomics to comprehensively map neural crest cell contributions to the mouse heart and great arteries. This careful and detailed analysis reveals changes in the transcriptional profile of neural crest cells as they give rise to different regions and cell types in the heart and great vessels. The study significantly builds on a number of recent scRNA-seq analyses of neural crest cell development and includes development of a new informatic tool for regulatory network investigation. Among the new findings documented are downregulation of Hox gene expression in intracardiac crest cells and regulation of Sox9 by Meis transcription factors.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all reviewers for the valuable feedback and critical insight on our study. We acknowledge the concern that the manuscript, in its initial form, appeared descriptive and did not provide the mechanistic insight inferred from the current data. In the revised manuscript, we will (i) more clearly delineate what mechanistic inferences can be drawn from the existing data, (ii) expand our discussion of the caspase-independent mechanisms, and (iii) incorporate additional experiments/analyses aimed at identifying downstream effectors that mediate the observed phenotypes. In this revision plan, we have included six new figures addressing some of the major issues raised by reviewers.

      1. Specifically, to address questions about mechanistic insight, we generated stable ACSL1:HaloTag expressing hESCs (currently presented as Figure 1A for reviewers). ACSL1 is a critical enzyme that catalyzes the first step of fatty acid oxidation at the outer mitochondrial membrane. Our previous analysis and work from the Opferman lab demonstrated that ACSL1 contains a BH3-like domain. Thus, we examined the effects of MCL-1 inhibition on the mitochondrial localization of this enzyme. Our findings indicate that MCL-1 inhibition displaces ACSL1 from the mitochondria (Figures 1B-C for reviewers). Our interpretation of the effects of MCL-1 inhibition is two-fold: 1) as we show in our data, MCL-1 inhibition causes disruption of the mitochondrial cristae, altering the microenvironment for fatty acid oxidation, and 2) as seen in cancer cells, the MCL-1 inhibitor may also displace ACSL1 from the mitochondria. In the new version of the manuscript, we will focus on these two mechanisms as mechanistic outcomes of MCL-1 inhibition.
      2. We have included data from cells treated with perhexiline (CPT1/2 inhibitor) and etomoxir (CPT1a inhibitor) (Figure 2 for reviewers). This experiment determines whether direct perturbation of the FAO pathway mimics the effects of the MCL-1i.
      3. We have assayed the effects of MCL-1 inhibition on oxygen consumption rates in NPCs. Currently presented as Figure 3 for reviewers.
      4. We will perform MCL-1:MICOS proximity ligation assays and/or immunoprecipitation assays to determine whether MCL-1 inhibitors disrupt the association of MCL-1 with MICOS. Preliminary data suggesting an association (albeit very weak) are shown in Figure 4 for reviewers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This study claims that beyond its canonical anti-apoptotic function, MCL-1 has essential non-apoptotic roles in human neurodevelopment. Pharmacologic inhibition of MCL-1 in human neural stem cells disrupts mitochondrial inner membrane architecture by destabilizing cristae and the OPA1-MICOS complex, leading to swollen mitochondria with disorganized cristae. These structural defects impair fatty acid oxidation and lipid droplet homeostasis, linking cristae integrity to metabolic competence. Independently of apoptosis or proliferation, MCL-1 inhibition selectively depletes intermediate neural progenitors, indicating a direct role in lineage progression. Overall, the work positions MCL-1 as a key regulator of mitochondrial structure-metabolism coupling that instructs neural progenitor identity and human neurogenesis.

      Overall: The study does a good job of using (in most assays) caspase inhibition (e.g., QVD treatment) to block apoptotic responses induced by MCL-1 inhibition. As a result, many of the phenotypes caused by inhibition are likely to be independent of caspase activation. This manuscript would therefore be of interest to researchers who study the BCL-2 family and cell death signaling, mitochondrial bioenergetics and dynamics, neurodevelopment, and cellular metabolism. However, as currently presented the manuscript is only descriptive and lacks mechanistic insight.

      We thank Reviewer 1 for the insightful evaluation of our work. We are encouraged that the reviewer finds the study relevant to investigators in the fields of BCL-2 family biology, mitochondrial dynamics and bioenergetics, neurodevelopment, and cellular metabolism. We also thank the reviewer for pointing out the need to increase the mechanistic insight of our findings. As mentioned above, in the revised manuscript, we are proposing to address this.

      Major Concerns:

      1) The authors only use a single MCL-1 inhibitor and never use other non-targeting BH3-mimetics (such as venetoclax) as negative controls. This seems like a missed opportunity to demonstrate that the phenotypes observed are MCL-1 dependent.

      This is an excellent point. We will include venetoclax (ABT-199) to examine its effect on intermediate progenitors (TBR2+) and early-born neurons (βIII-tubulin+).

      2) There is no mechanism proposed in this study other than reliance upon QVD as not affecting the phenotypes. As submitted, the manuscript only can speculate that these phenotypes are due to non-apoptotic roles of MCL-1 inhibition. The authors have missed an opportunity to explore MCL-1's non-apoptotic functions directly.

      Mechanistically, we propose MCL-1 is acting in 2 ways: 1) as we show in our data, MCL-1 inhibition causes disruption of the mitochondrial cristae, altering the microenvironment for fatty acid oxidation, and 2) as seen in cancer cells, MCL-1 inhibitors may also displace ACSL1 from the mitochondria.

      In the past few weeks, since receiving the initial reviews, we have focused on testing the second possibility, since the accumulation of lipids was also seen in cancer cells (see PMID: 38503284). We have successfully generated stable ACSL1:HaloTag expressing hESCs (Figure 1A for reviewers). Our findings, included here, show that ACSL1 is displaced from the mitochondria by MCL-1 inhibition in NPCs (Figures 1B-C for reviewers).

      Other concerns exist that weaken the impact of the study.

      1. Figure 1 should include the fact that QVD inhibition (shown in Sup Fig 2) does not obviate the phenotype induced by pharmacological inhibition of MCL-1 on mitochondrial morphology.

      We would like to clarify that QVD does prevent the phenotypes induced by MCL-1 inhibition on mitochondrial morphology. In Fig 1B, we report an increase in volume and surface area at 24h and 48h along with a decrease in mitochondrial content at 48h when NPCs were treated with MCL-1i only. However, NPCs co-treated with QVD in Supp Fig 2B did not exhibit any significant morphological phenotypes on average or at min/max values. Reviewer 1 may be referring to Fig 1B's corresponding min/max values presented in Supp Fig 2A, where we reported an increase in max volume.

      | Figure # | Volume | Surface Area |
      |---|---|---|
      | Fig 1B (MCL-1i only, avg values) | increase (avg vol) | increase (avg) |
      | Supp Fig 2B (MCL-1i+QVD) | no change | no change |
      | Supp Fig 2A (MCL-1i only, max/min values) | increase (max vol) | no change (max) |

      For clarity, we will move Supplementary Fig 2A into Supplementary Fig 1.

      Figure 2 would benefit from evidence that caspase inhibition does not repress the phenotype on mitochondrial cristae morphology (volume and area). Furthermore, the FIB-SEM data are very hard to appreciate as the size precludes visualization of individual mitochondria.

      While we included the visualization of the segmented mitochondria and cristae (Figure 2C), as well as snapshots through the z-stack for segmented cristae only (Figure 2E) and segmented mitochondria separately (Supp Figure 3A) in the original manuscript, we are now also attaching the FIB-SEM 3D reconstruction videos (New Supplementary Videos 1-2 for reviewers) (1. Mito and cristae, 2. Cristae only, 3. Mito only) for ease of visualization.

      Figure 3 reports that MIC60 and OPA1 appear to be downregulated in response to MCL-1 inhibition, but these appear to be more significant only when QVD is added. Why would the phenotype be obscured in the non-QVD setting (Fig. 2B&C)? How does MCL-1 inhibition lead to changes in MIC60/MICOS/OPA1? This seems quite preliminary at this point.

      In Figures 3B and 3C, we report decreased protein levels of short-form OPA1 and MIC10 only, not MIC60. We argue that our data with QVD show that the cell death function of MCL-1 (i.e., inhibiting cell death effectors from initiating the caspase cascade) is not the main trigger of the phenotypes we report (cristae dysregulation and fatty acid oxidation disruption). However, cells without functional cristae and/or with defects in FAO may not be able to survive long-term. Thus, QVD treatment preserves cells that might otherwise not survive the dismantling of such an essential structure. To confirm this, we have performed immunofluorescence of cleaved caspase 3 (Figure 5 for reviewers). These results show that MCL-1 inhibition at the time points of our study does not result in increased activation of Caspase-3. We reported similar results of MCL-1 inhibition in oligodendrocyte precursor cells (Gil and Hanna et al., Glia, 2025, PMID: 41420072).

      The loss of MIC60 and OPA1 should repress electron transport chain function. Are such impacts observed in the cultured cells? This could be shown by assessing oxygen consumption, etc. Such data would enhance the authors' conclusion that MCL-1 inhibition leads to defects in mitochondrial physiology.

      We completely agree with this comment by Reviewer 1. In our revision, we will include an assessment of mitochondrial oxygen consumption rate, using the Seahorse analyzer (mitochondrial stress test), of NPCs treated with MCL-1i. Preliminary data (n=3) are currently presented as Figure 3 for reviewers. Interestingly, these data show a more nuanced cellular response. Consistent with our conclusion that MCL-1 inhibition does not cause apoptotic cell death, MCL-1i did not affect mitochondrial respiration at baseline. The specific deficits appear in spare respiratory capacity and maximal respiration, meaning cells can sustain routine mitochondrial function but lose the ability to respond to increased energetic demand. This suggests MCL-1 loss creates a mitochondrial reserve deficiency rather than a generalized bioenergetic failure. The results with caspase inhibitors show a near-zero OCR across both 24h and 48h timepoints, and significant reductions in maximal respiration, spare respiratory capacity, and non-mitochondrial OCR. Remarkably, these conditions are not detrimental to newborn neurons, as shown in Figure 7. This is very interesting because it suggests that, under severe bioenergetic failure, neural stem cells (PAX6+) can differentiate into newborn neurons in a TBR2-independent manner. More relevant to this study, our results unequivocally demonstrate that TBR2-positive cells depend on the non-apoptotic function of MCL-1.
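
      For readers less familiar with the mitochondrial stress test, the parameters named above are conventionally derived from plateau OCR readings as sketched below. This is a minimal illustration that assumes the usual definitions (non-mitochondrial OCR is the reading after rotenone/antimycin A, basal and maximal respiration are corrected for it, and spare capacity is maximal minus basal). The function name and all numbers are hypothetical and are not data from this study.

```python
def stress_test_parameters(baseline_ocr, fccp_ocr, rot_aa_ocr):
    """Derive summary parameters from three plateau OCR readings (pmol O2/min).

    baseline_ocr: reading before any injection
    fccp_ocr:     reading after the uncoupler (FCCP)
    rot_aa_ocr:   reading after rotenone/antimycin A
    """
    non_mito = rot_aa_ocr                # respiration that is not mitochondrial
    basal = baseline_ocr - non_mito      # routine mitochondrial respiration
    maximal = fccp_ocr - non_mito        # uncoupled (maximal) respiration
    return {
        "non_mitochondrial": non_mito,
        "basal": basal,
        "maximal": maximal,
        "spare_capacity": maximal - basal,
    }


# Hypothetical readings chosen only to illustrate the pattern described above:
# unchanged basal respiration with reduced maximal and spare capacity.
control = stress_test_parameters(100.0, 220.0, 20.0)
treated = stress_test_parameters(100.0, 140.0, 20.0)
print(control)
print(treated)
```

      In this toy case basal respiration is identical in both conditions while spare capacity falls, which is the "reserve deficiency" pattern the reply describes.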

      In Figure 4, the differences between transcripts (qPCR data) and protein (immunoblot) data are often confusing and not well explained. Why do the authors propose that mRNA expression is decreasing whereas the protein expression is increasing? Example CPT1. Furthermore, it is unclear what these data mean functionally? Is this reflective of enhanced lipid oxidation or simply a response to inhibition of fatty acid oxidation? Clarification of the impact of these findings is necessary.

      We agree with Reviewer 1 that the results could be hard to interpret. However, the effects of MCL-1 inhibitors on the transcription of fatty acid oxidation genes have been widely cited by the work of Opferman and Walensky (PMID: 36198266). We speculate that the effects on transcription are triggered by mitochondrial signaling. The mechanistic insight into this phenomenon would be an interesting next step.

      In the case of CPT1, we addressed this comment and found that the difference is due to differential expression of isoforms. The RT-qPCR shown in Figure 4 is on CPT1c, whereas the western blot is on CPT1a. Unfortunately, after trying several products, we determined that there are no good antibodies for CPT1c. Thus, since we can't compare gene and protein expression, we will include CPT1a RT-qPCR data to complement the western blot.

      The increase in lipid droplet number induced by MCL-1 inhibition has been previously documented, but it is unclear whether this increase is related to an inability to oxidize lipid (defective fatty acid oxidation) that leads to increases in the cellular abundance or whether this indicates that MCL-1 inhibition leads to enhanced storage. Do other inhibitors of fatty acid oxidation lead to similar increases in lipid droplet size and abundance? Does QVD inhibition affect this phenotype?

      This is a great point raised by Reviewer 1, and one we have also wondered about. We conducted an experiment using C16 BODIPY to address this point (Figure 6 for Reviewers). We observed no changes in C16 lipid droplet accumulation in count, volume, or surface area when cells were treated with MCL-1 inhibitor for 24 hours total, with or without a starvation period in the last 6 hours of treatment. However, we observed significant pan-lipid droplet accumulation in the same conditions. This contrast suggests that FAO of exogenous LC-fatty acids is not reliant on MCL-1. This finding does not discount the requirement of MCL-1 for other FAO processes, especially given the major limitation of how much C16 BODIPY (fluorescent palmitate) can be administered to the cells (10µM), which was 10-fold less than what we exogenously supplied to the cells for the pan-BODIPY experiment (100µM, see Figure 5). It is entirely possible that this small dose was not enough to detect any lipid droplet accumulation.

      We have now also included experiments using etomoxir and perhexiline to assess their effects on TBR2/PAX6 (Figure 2 for reviewers). The results indicate that inhibiting the FAO pathway does not fully mimic the effects of MCL-1i on TBR2. However, we show that MCL-1i displaces ACSL1 from the mitochondria, a step that is upstream of CPT1/2. We suggest a model in which the coordinated non-apoptotic function of MCL-1 at the outer mitochondrial membrane promotes ACSL1 activity and, in the inner mitochondrial membrane, regulates mitochondrial cristae morphology. While our data point to this model, we are limited by the tools to investigate it further, but it will be a great direction for future experiments.

      For Figure 6, while these data may be very meaningful, as presented they are very hard to appreciate. Insets that show the neuronal populations would help to convey the point that the differentiation is impacted. Also, are there other methods that could confirm these observations (qPCR to show changes in differentiation).

      We agree with Reviewer 1. In the new version of the manuscript, we will include panels that zoom into the cell populations we quantified. The current panels will go to a new Supplemental figure. We will also add the TUBB3 to the qPCR panel in the new version.

      Figure 7 is also very hard to appreciate. What is the reader to see? Can these be quantified? It seems that QVD may be rescuing in this figure, does this suggest that MCL-1 inhibition might be inducing death. All of this needs to be quantified.

      We will provide quantification of βIII-tubulin branching, and it will be included next to the images provided.

      BCL-XL has also been implicated in affecting mitochondrial electron transport chain function (See PMID: 19255249, 21926988, 21987637). Can BCL-XL inhibitors affect any of the phenotypes associated here?

      We will include experiments to test the effect of BCL-2 and BCL-XL inhibitors on TBR2 cells to address this comment.

      Please carefully avoid using the term "MCL-1 loss" when talking about pharmacological inhibition. Only genetic ablation (e.g. knockout, silencing, etc.) should be termed loss.

      We have now removed the reference to MCL-1 loss in line 199.

      Reviewer #1 (Significance (Required)):

      The study advances in human cells the impacts of MCL-1 inhibition. They replicate many impacts previously observed in mouse systems and refine analyses to impacts on MICOS complex, lipid droplet storage, and neuronal differentiation. While these findings are important and would be well received by a wide audience, the study fails to provide almost any mechanistic insight into how these phenotypes are being induced. The only common theme is that blocking caspase activation in many assays fails to block the phenotype.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript by Hanna et al. investigates non-apoptotic roles of MCL-1 in human neural stem cells and connects MCL-1 inhibition to mitochondrial cristae formation and beta-oxidation. Connecting these roles to brain development, the authors also show a reduction in the number of progenitor cells upon MCL-1 inhibition, independently of caspase activity. Throughout their work, the authors make use of an impressive array of imaging techniques. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored.

      We thank Reviewer 2 for the thoughtful and positive assessment of our manuscript. We appreciate the reviewer's recognition that our study reveals non-apoptotic roles of MCL-1 in human neural stem cells. We are also grateful for the acknowledgment of the imaging approaches employed, which allowed us to connect MCL-1 function to cristae architecture with multiple complementary techniques. We acknowledge the reviewer's point that the mechanistic basis by which MCL-1 influences cristae structure remains insufficiently defined. In the revised manuscript, we will clarify the limitations of the current data, expand our discussion of potential mechanisms, and incorporate additional analyses to identify downstream effectors that mediate these structural and metabolic changes.

      Major comments:

      - In Fig. 1B, the very same representative images are shown for both conditions (DMSO and S63845) at 48 hours.

      We deeply appreciate Reviewer 2 for catching this unintentional duplication that occurred during figure preparation. We have now corrected this issue.

      - For Western Blot analysis, it looks like the authors only quantified the band density of their proteins of interest without considering varying levels of control protein (Actin) levels. Normalizing the protein levels to actin would account for any differences in loaded protein amounts (although a Ponceau staining might be preferable still to exclude this). This is especially relevant for Fig. 4E, where actin levels visibly differ between the conditions.

      All WB quantifications were normalized to Actin (this detail is now added to the y-axis of all band density graphs and figure legends). In addition, we will transform the data to a logarithmic scale to "normalize" for gel-to-gel variability.
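
      The normalization described here can be sketched as follows. This is only an illustration of one common way to do it, assuming each target band is divided by the Actin band from the same lane and the result is then expressed as a log2 fold change relative to the control lane. The function names and densitometry values are hypothetical, and the reply does not specify the logarithm base or reference lane used.

```python
import math


def normalize_to_loading_control(target, actin):
    """Divide each target band density by the Actin density from the same lane."""
    return [t / a for t, a in zip(target, actin)]


def log2_fold_change(normalized, control_index=0):
    """Express each lane as a log2 fold change relative to the control lane."""
    ref = normalized[control_index]
    return [math.log2(v / ref) for v in normalized]


# Hypothetical band densities (arbitrary units), lane 0 is the control.
target = [1200.0, 900.0, 600.0]
actin = [1000.0, 1500.0, 1000.0]

ratios = normalize_to_loading_control(target, actin)
log_fc = log2_fold_change(ratios)
print(ratios)
print(log_fc)
```

      Lanes 1 and 2 have different raw target densities but the same Actin-normalized ratio, which is the kind of loading difference Reviewer 2 is concerned about in Fig. 4E.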

      - The authors offer evidence that MCL-1 inhibition impedes proteolytic cleavage of OPA1-L into the OPA-1-S isoforms, yet do not explore the mechanism behind this. Since OPA1 is cleaved by both OMA1 and YME1L, determination of the levels of these proteases could help shed some light on the mechanism leading to cristae reorganization.

      We will follow up on Reviewer 2's comment with a WB analysis of OMA1 and YME1L in cells treated with an MCL-1 inhibitor.

      - Generally speaking, while the authors show all those effects (cristae defects, FAO dysfunction) upon MCL-1 inhibition, it would be interesting to see whether any of those effects can be rescued by blocking FA import, e.g. through carnitine palmitoyltransferase 1a (CPT1a) inhibition with etomoxir, to understand if they are downstream of altered FA supply. This could affect cristae morphology through altered cardiolipin biogenesis.

      This is an excellent point, which was also raised by Reviewer 1. We have now included experiments using etomoxir and perhexiline to assess their effects on TBR2/PAX6 (Figure 2 for Reviewers). As mentioned above, the results indicate that inhibiting the FAO pathway does not fully mimic the effects of MCL-1i on TBR2. However, we show that MCL-1i displaces ACSL1 from the mitochondria, a step that is upstream of CPT1 and 2. We suggest a model in which the coordinated non-apoptotic function of MCL-1 at the outer mitochondrial membrane promotes ACSL1 activity and, in the inner mitochondrial membrane, regulates mitochondrial cristae morphology. While our data point to this model, we are limited by the tools to investigate it further, but it will be a great direction for future experiments. The suggestion of Reviewer 2 that the effects on FAO could impact cardiolipin biogenesis is a very exciting possibility. However, it is difficult to test with the tools available.

      - In line 262 the authors discuss that mitochondria lose metabolic function upon MCL-1 inhibition. This claim would require additional experiments. While the authors look at lipid droplet accumulation and FAO enzymes, there are many more aspects to mitochondrial metabolic function that should be investigated. While measuring the oxygen consumption rate via Seahorse might require additional resources (optional), measurements of ATP production, ROS generation or determination of the mitochondrial membrane potential should be feasible.

      We fully agree with Reviewer 2's comment, which was also raised by Reviewer 1. In our revision, we will include an assessment of the mitochondrial oxygen consumption rate of NPCs treated with MCL-1i, measured using the Seahorse analyzer (mitochondrial stress test). These data are presented as Figure 3 for reviewers. Interestingly, these data show a more nuanced cellular response. While MCL-1i does not globally collapse mitochondrial respiration at baseline, the specific deficits appear in spare respiratory capacity and maximal respiration, meaning cells can sustain routine mitochondrial function but lose the ability to respond to increased energetic demand. This suggests MCL-1 loss creates a mitochondrial reserve deficiency rather than a generalized bioenergetic failure. The results with caspase inhibitors show a near-zero OCR across both 24h and 48h timepoints, and significant reductions in maximal respiration, spare respiratory capacity, and non-mitochondrial OCR. These conditions are detrimental for TBR2-positive NPCs (Figure 6), but not for newborn neurons (Figure 7).

      - While the authors "propose a model in which MCL-1 associates with MICOS", they do not offer direct scientific evidence to support this hypothesis. Co-immunoprecipitation experiments or e.g. proximity ligation assays would better support the proposed model.

      We agree with this statement. Preliminarily, we have performed proximity ligation assays and immunoprecipitation analyses to test for this interaction (see below and Figure 4 for reviewers), and the results indicate an interaction, albeit a very weak one. In the revised version of the manuscript, we will attempt to repeat these experiments with MCL-1i.

      - While Fig. 7 shows representative images, quantification e.g. for the truncation of neuronal processes is missing.

      We will provide quantification of βIII-tubulin branching, which will be included alongside the images provided.

      - In lines 219f. the authors state that they "observed a significant downregulation of PAX6 and EOMES at 24 hours that was not rescued by QVD co-treatment". While there is still a trend towards a downregulation, there is no statistical significance anymore. In fact, PAX6 levels almost mirror those of SOX2 which is not described as "downregulated" by the authors. In order to be more consistent, I would suggest rephrasing this part, or at least reword it to be less absolute.

      In the new version, we will clarify that while QVD rescued TBR2 and PAX6 transcript levels at 24h, it did not rescue them at 48h. We will also mention the downregulation of SOX2 at 48h that persists with co-treatment.

      - Brinkmann et al. (2025) also investigated cristae structure upon MCL-1 deletion in vivo and found no effect when MCL-1 was replaced with other Bcl-2 family members. It would be interesting to combine MCL-1 inhibition with overexpression of MCL-1 versus BCL-XL to reconcile some of the discrepant findings.

      While this is a great suggestion for future studies, there are some complications. Specifically, it is likely that the inhibitor may also target the overexpressed MCL-1 and thus, a mutant form is needed.

      To address this, we generated a Flag-tagged MCL-1 construct with a mutated BH3 domain, previously described by Kotschy et al. Nature 2016. We validated the construct in HeLa cells, but unfortunately the mutant protein appears to be significantly less stable than the WT construct, complicating analysis of this experiment.

      Minor comments:

      - In Supp. Fig. 1C the MCL-1 protein is shown both to run above 37kDa (upper panel) and below 37 kDa (lower panel). Could the authors please comment on why this is the case?

      The observed variation is caused by drift in the gel during electrophoresis. In Fig 1C, the protein ladder is on the edge of the gel, whereas in Fig 1E, the protein ladder is in the middle of the gel, and the last sample is on the edge and also exhibits edge drift.

      - In line 64 of the introduction the authors mention clinical trials yet do not give a citation for these trials making it hard to judge whether the content of these trials is actually related to the brain.

      This information is anecdotal, based on an Amgen press release.

      - MCL-1 as well as ACSL-1 are sometimes written without the hyphen both in the text and figures.

      We will carefully check the manuscript before submission.

      - Lines 92-94 and 106-108 essentially highlight the same existing knowledge gap. Maybe the content of these two paragraphs could be combined in order to avoid repetition.

      We thank Reviewer 2 for this suggestion. We will do this in the new version of the manuscript.

      - In Fig. 1A, the authors provide a schematic for their experimental design. While the figure legend is very thorough, some of this information (like the days of collection) could also be included in the figure itself. The same is true for schematics in the following figures.

      We agree with this and will incorporate the suggestion in the new version.

      - Fig. 2A includes a typo (analyze) but would maybe also be more suitable for the supplement figures or could even be combined with Fig. 1A as not much new content is added.

      We already incorporated these changes in the new version of the manuscript.

      - Regarding statistical analysis, could the authors please comment on why they did not consider one-sample t-tests suitable for the cases where control values were set at 1 (e.g. Fig. 4B, C for the relative expression).

      This is a valid suggestion. We will rerun RT-qPCR data using a one-sample t-test.
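
      As a sketch of what this test computes, assuming relative expression values are compared against a fixed control value of 1: the t statistic is the difference between the sample mean and 1, divided by the standard error of the mean, with n - 1 degrees of freedom. The values below are hypothetical, not RT-qPCR data from this study. The p-value requires the t distribution and is not computed here; `scipy.stats.ttest_1samp` is one routine that returns it.

```python
import math
import statistics


def one_sample_t(values, mu0=1.0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical relative expression from three biological replicates,
# each already normalized so that the control equals 1.
rel_expr = [0.5, 0.6, 0.7]
t_stat, dof = one_sample_t(rel_expr)
print(t_stat, dof)
```

      The control group contributes no variance in this design because it is fixed at 1 by construction, which is why a one-sample test is the natural fit for the reviewer's question.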

      - In lines 247f. the authors state that "inhibition of MCL-1 leads to [...] and disassembly of the MICOS complex as well as OPA1". This sounds like OPA1 is still cleaved upon MCL-1 inhibition, which is not at all what the authors showed and further discuss. Rewording of the sentence would help in avoiding any misunderstandings.

      We agree with this comment and have now reworded the paragraph: "Inhibition of MCL-1 leads to structural collapse of the cristae likely due to the possible disassembly of the MICOS complex, as suggested by decreased MIC10 levels, and interruption of OPA1 cleavage, as suggested by decreased short-form OPA1, two scaffolds required for cristae maintenance."

      - In lines 210f. the authors state that "quantitative imaging increased the average and maximum volume of lipid droplets". While there is definitely a trend towards an increase for the maximum volume, the increase is in fact not statistically significant. This should be reflected in the wording.

      We have reworded this to "Quantitative imaging revealed a significant increase in average lipid droplet volume and a trending increase in maximum volume of lipid droplets."

      - In Fig. 6 the overlap between TBR2 and PAX6 is hard to judge when printed out. Including a zoom-in may make it easier to judge.

      We agree with Reviewer 2. In the new version of the manuscript, we will include panels that zoom into the cell populations we quantified. The current panels will go to a new Supplemental figure. We will also add the TUBB3 to the qPCR panel in the new version.

      - In Fig. 7 the color-coding is listed in the figure legend but is missing from the figure itself. If the authors could include this, as they did for the other figures, it would further improve this figure.

      We agree. We have specified the channel color in the new figure.

      - Line 238 should reference Fig. 7A, as Fig 7B does not exist.

      Thanks for catching this. It is already corrected.

      - In the figure legends the authors state that biological replicates were used. Were technical replicates also performed?

      Yes, technical replicates were performed for RT-qPCR.

      Reviewer #2 (Significance (Required)):

      The authors make use of a wide array of imaging techniques to further elucidate non-apoptotic roles of MCL-1. The study has the potential to offer new insights into mitochondrial biology on the level of basic research rather than translational. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored. Nevertheless, the study offers additional knowledge on the role of MCL-1 in human neural stem cells, whereas previous research mostly focused on cardiomyocytes or cancer cells.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study, Gama et al. describe a non-canonical role for the anti-apoptotic protein Myeloid Cell Leukemia-1 (MCL-1) in mitochondrial cristae organization and suggest a role of MCL-1 in regulating metabolism and neuronal differentiation. Using fluorescence microscopy imaging and electron microscopy, the authors show changes to mitochondrial morphology upon treatment with MCL-1 inhibitor S63845. MCL-1 inhibition results in altered protein and transcript levels of some key proteins involved in mitochondrial cristae organization and fatty acid metabolism. While some of the findings are interesting and indeed point towards a non-canonical role of MCL-1, several key conclusions of the authors are not sufficiently supported by the data shown in the manuscript.

      We thank Reviewer 3 for the careful evaluation of our manuscript. We appreciate the reviewer's recognition that our study identifies a potential non-canonical role for MCL-1 in mitochondrial cristae organization, metabolism, and neuronal differentiation. As with Reviewers 1 and 2, we are encouraged that the reviewer finds these observations interesting and suggestive of previously unappreciated functions for MCL-1. We agree that stronger evidence is required to firmly link MCL-1 inhibition to specific changes in MICOS organization and metabolic regulation. In the revised manuscript, we will (i) more clearly distinguish between observations and mechanistic inferences, (ii) temper conclusions where appropriate, and (iii) incorporate additional analyses and controls to better substantiate the proposed model.

      Major comments:

      1. The authors try to disentangle the apoptotic and non-apoptotic role of MCL-1 through addition of a caspase inhibitor. However, I am not convinced that phenotypes found under the addition of caspase inhibitor are necessarily caused by non-canonical functions independent of apoptosis. It could also be that the observed changes happen upstream of caspase activation. In addition, many of the described findings, such as CPT1 expression changes, only happen in the presence of the caspase inhibitor. If one follows the logic of the authors, changes associated with non-canonical MCL-1 functions should happen under MCL-1 inhibition and caspase inhibition, but not with MCL-1 inhibition only.

      The reviewer is right that we expected non-canonical functions to happen under MCL-1 inhibition and caspase inhibition. Our data with QVD show that the cell death function of MCL-1 (i.e., inhibiting cell death effectors from initiating the caspase cascade) is not the main trigger of the phenotypes we report (cristae dysregulation and fatty acid oxidation disruption). However, cells without functional cristae and/or with defects in FAO may not be able to survive long-term. Thus, QVD treatment preserves cells that might otherwise not survive the dismantling of such an essential structure. To confirm this, we performed immunofluorescence of cleaved caspase 3 (Figure 5 for reviewers). These results show that MCL-1 inhibition at the time points of our study does not result in increased Caspase-3 activation. We reported similar results of MCL-1 inhibition in oligodendrocyte precursor cells (Gil and Hanna et al., Glia, 2025, PMID: 41420072).

      The authors show no data on the viability of the cells in response to the MCL-1 inhibitor. To exclude secondary effects of the inhibitor, at least some of the results should be validated with an MCL-1 knock down.

      We will include this experiment in our revised manuscript. To check the effects of MCL-1 knockdown on TBR2 positive cells, we tested 5 different ASOs for MCL-1. Knockdown efficiency with ASOs was very low (on average

      In Figure 1, the authors show immunofluorescence data of mitochondria and nucleus staining and conclude that MCL-1 inhibition alters mitochondrial morphology. Based on the images shown in Fig. 1, I do not think that individual mitochondria can be segmented to measure their volume and length. In addition, some metrics such as mitochondrial content are not explained in the text or methods.

      We can achieve mitochondrial segmentation with a SoRa Spinning Disk Confocal Microscope, which has a lateral (XY) resolution of approximately 120 nm to 150 nm and an axial (Z) resolution of approximately 300 nm to 320 nm. All images are first denoised prior to sharpening using the Richardson-Lucy deconvolution algorithm. Additionally, the FIB-SEM data are consistent with the IF data (both show an increase in mitochondrial volume and surface area).

      We agree with Reviewer 3 that we need to explain some metrics in the revised version. We will specify the meaning of mitochondrial content (count of all mitochondria in FOV, not normalized to Hoechst).

      In Fig. 2 B-D, the authors show TEM and FIB-SEM imaging to demonstrate alterations in the cristae architecture upon treatment with MCL-1 inhibitor. However, based on the images shown, it looks as though cristae area and density are reduced under S63845 treatment in TEM images, while the FIB-SEM data come to the opposite conclusion. In addition, the quantification of cristae volume as a percentage is unclear to me.

      We apologize for the confusion. No conclusions about the cristae area and density were made using the TEM data, because TEM data represent a single snapshot section of a mitochondrion without a discernible orientation. Cristae from TEM were described as "aberrant" and preliminarily revealed changes in cristae and were followed up with FIB-SEM, 3D reconstruction of intact mitochondria, and quantification of volume.

      In the new version of the manuscript, we will specify that the cristae volume is normalized to the volume of its respective mitochondria (i.e., how much of the mitochondrial volume is attributed to cristae).
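
      The normalization described here amounts to a simple per-mitochondrion ratio. The sketch below assumes each segmented mitochondrion comes with its own cristae volume and total volume in the same units; the function name and the example values are hypothetical and are not measurements from the FIB-SEM data.

```python
def cristae_volume_percent(cristae_volume, mito_volume):
    """Percentage of one mitochondrion's volume that is occupied by cristae."""
    if mito_volume <= 0:
        raise ValueError("mitochondrial volume must be positive")
    return 100.0 * cristae_volume / mito_volume


# Hypothetical (cristae volume, mitochondrial volume) pairs in cubic micrometers.
mitochondria = [(0.12, 0.60), (0.05, 0.50)]
percents = [cristae_volume_percent(c, m) for c, m in mitochondria]
print(percents)
```

      Reporting the ratio per mitochondrion, rather than pooling volumes across the field, keeps a few large mitochondria from dominating the summary.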

      The change in CPT1/2 protein levels (Fig. 4) is interesting but does not directly prove that fatty acid oxidation is altered, as concluded by the authors. For this, the authors would need to directly measure fatty acid oxidation, for example using Seahorse or metabolic tracing experiments. Also, to prove that the MCL-1 inhibition affects neural differentiation through fatty acid oxidation, a rescue experiment should be performed through CPT1 overexpression.

      We agree that this is an important point. We have optimized the fatty acid oxidation test using Seahorse and will make sure to include it in the revised version of the manuscript.

      In Figure 6, the authors show decreased intermediate progenitor cells after MCL-1 inhibition by immunofluorescence staining. I am not convinced that this can be concluded from the data shown, since the concentration of intermediate progenitor cells is very close to the noise levels. Since the MCL-1 treated cells look much less sparse, I don't think the percentages can be compared (total counts are between 2-20). Although this data might give some indication that differentiation could be impaired, the measured effect could be very well due to lower viability of the cells. The authors need to control for this or come up with a different method for measuring differentiation.

      The number of TBR2-positive cells is low, but we disagree with the reviewer's assessment of noise levels. We focused on cells expressing only TBR2 and rigorously examined this population of cells. The percentages are compared to account for the lower density of the MCL-1i-treated cultures, as the IPC counts are normalized to the Hoechst total cell count within the FOV. Moreover, the immunofluorescence images are complemented with RT-qPCR, which shows significant downregulation of EOMES (gene encoding TBR2).

      Figure 7 is missing quantification.

      We will include this quantification in the revised version of the manuscript.

      Reviewer #3 (Significance (Required)):

      General assessment: The manuscript reports an interesting finding, which suggests a non-canonical role of MCL-1 in mitochondrial remodeling, regulation of fatty acid oxidation and neuronal fate. While this finding would be highly interesting and relevant, the presented data do not sufficiently support this conclusion. Further experiments would have to be performed to prove causality.

      Advance: Should the authors manage to prove their hypothesis by additional experiments, this would indeed advance the field on mitochondrial remodeling and its effect on neuronal differentiation by identifying a novel molecular player.

      Audience: mitochondrial biology, cell biology, developmental neuroscience

      Own expertise: mitochondrial biology, cell biology, advanced imaging techniques

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this study, Gama et al. describe a non-canonical role for the anti-apoptotic protein Myeloid Cell Leukemia-1 (MCL1) in mitochondrial cristae organization and suggest a role of MCL1 in regulating metabolism and neuronal differentiation. Using fluorescence microscopy imaging and electron microscopy, the authors show changes to mitochondrial morphology upon treatment with MCL1 inhibitor S63845. MCL1 inhibition results in altered protein and transcript levels of some key proteins involved in mitochondrial cristae organization and fatty acid metabolism. While some of the findings are interesting and indeed point towards a non-canonical role of MCL1, several key conclusions of the authors are not sufficiently supported by the data shown in the manuscript.

      Major comments:

      1. The authors try to disentangle the apoptotic and non-apoptotic role of MCL1 through addition of a caspase inhibitor. However, I am not convinced that phenotypes found under the addition of caspase inhibitor are necessarily caused by non-canonical functions independent of apoptosis. It could also be that the observed changes happen upstream of caspase activation. In addition, many of the described finding, such as CPT1 expression changes, only happen in the presence of the caspase inhibitor. If one follows the logic of the authors, changes associated by non-canonical MCL1 functions should happen under MCL1 inhibition and caspase inhibition, but not with MCL1 inhibition only.
      2. The authors show no data on the viability of the cells in response to the MCL1 inhibitor. To exclude secondary effects of the inhibitor, at least some of the results should be validated with an MCL1 knock down.
      3. In Figure 1, the authors show immunofluorescence data of mitochondria and nucleus staining and conclude that MCL1 inhibition alters mitochondrial morphology. Based on the images shown in Fig. 1, I do not think that individual mitochondria can be segmented to measure their volume and length. In addition, some metrics such as mitochondrial content are not explained in the text or methods.
      4. In Fig. 2 B-D, the authors show TEM and FIB-SEM imaging to demonstrate alterations in the cristae architecture upon treatment with MCL1 inhibitor. However, based on the images shown, it looks as though cristae area and density are reduced under S63845 treatment in TEM images, while the FIB-SEM data come to the opposite conclusion. In addition, the quantification of cristae volume as a percentage is unclear to me.
      5. The change in CPT1/2 protein levels (Fig. 4) is interesting but does not directly prove that fatty acid oxidation is altered, as concluded by the authors. For this, the authors would need to directly measure fatty acid oxidation, for example using Seahorse or metabolic tracing experiments. Also, to prove that the MCL1 inhibition affects neural differentiation through fatty acid oxidation, a rescue experiment should be performed through CPT1 overexpression.
      6. In Figure 6, the authors show decreased intermediate progenitor cells after MCL1 inhibition by immunofluorescence staining. I am not convinced that this can be concluded from the data shown, since the concentration of intermediate progenitor cells is very close to the noise levels. Since the MCL1 treated cells look much less sparse, I don't think the percentages can be compared (total counts are between 2-20). Although this data might give some indication that differentiation could be impaired, the measured effect could be very well due to lower viability of the cells. The authors need to control for this or come up with a different method for measuring differentiation.
      7. Figure 7 is missing quantification.

      Significance

      General assessment: The manuscript reports an interesting finding, which suggests a non-canonical role of MCL1 in mitochondrial remodeling, regulation of fatty acid oxidation and neuronal fate. While this finding would be highly interesting and relevant, the presented data do not sufficiently support this conclusion. Further experiments would have to be performed to prove causality.

      Advance: Should the authors manage to prove their hypothesis by additional experiments, this would indeed advance the field on mitochondrial remodeling and its effect on neuronal differentiation by identifying a novel molecular player.

      Audience: mitochondrial biology, cell biology, developmental neuroscience

      Own expertise: mitochondrial biology, cell biology, advanced imaging techniques

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript by Hanna et al. investigates non-apoptotic roles of MCL-1 in human neural stem cells and connects MCL-1 inhibition to mitochondrial cristae formation and beta-oxidation. Connecting these roles to brain development, the authors also show a reduction in the number of progenitor cells upon MCL-1 inhibition, independently of caspase activity. Throughout their work, the authors make use of an impressive array of imaging techniques. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored.

      Major comments:

      • In Fig. 1B, the very same representative images are shown for both conditions (DMSO and S63845) at 48 hours.
      • For Western Blot analysis, it looks like the authors only quantified the band density of their proteins of interest without considering varying levels of control protein (Actin) levels. Normalizing the protein levels to actin would account for any differences in loaded protein amounts (although a Ponceau staining might be preferable still to exclude this). This is especially relevant for Fig. 4E, where actin levels visibly differ between the conditions.
      • The authors offer evidence that MCL-1 inhibition impedes proteolytic cleavage of OPA1-L into the OPA-1-S isoforms, yet do not explore the mechanism behind this. Since OPA1 is cleaved by both OMA1 and YME1L, determination of the levels of these proteases could help shed some light on the mechanism leading to cristae reorganization.
      • Generally speaking, while the authors show all those effects (cristae defects, FAO dysfunction) upon MCL-1 inhibition, it would be interesting to see whether any of those effects can be rescued by blocking FA import, e.g. through carnitine palmitoyltransferase 1a (CPT1a) inhibition with etomoxir, to understand if they are downstream of altered FA supply. This could affect cristae morphology through altered cardiolipin biogenesis.
      • In line 262 the authors discuss that mitochondria lose metabolic function upon MCL-1 inhibition. This claim would require additional experiments. While the authors look at lipid droplet accumulation and FAO enzymes, there are many more aspects to mitochondrial metabolic function that should be investigated. While measuring the oxygen consumption rate via Seahorse might require additional resources (optional), measurements of ATP production, ROS generation or determination of the mitochondrial membrane potential should be feasible.
      • While the authors "propose a model in which MCL-1 associates with MICOS", they do not offer direct scientific evidence to support this hypothesis. Co-immunoprecipitation experiments or e.g. proximity ligation assays would better support the proposed model.
      • While Fig. 7 shows representative images, quantification e.g. for the truncation of neuronal processes is missing.
      • In lines 219f. the authors state that they "observed a significant downregulation of PAX6 and EOMES at 24 hours that was not rescued by QVD co-treatment". While there is still a trend towards a downregulation, there is no statistical significance anymore. In fact, PAX6 levels almost mirror those of SOX2 which is not described as "downregulated" by the authors. In order to be more consistent, I would suggest rephrasing this part, or at least reword it to be less absolute.
      • Brinkmann et al. (2025) also investigated cristae structure upon MCL-1 deletion in vivo and found no effect when MCL-1 was replaced with other Bcl-2 family members. It would be interesting to combine MCL-1 inhibition with overexpression of MCL-1 versus BCL-XL to reconcile some of the discrepant findings.

      Minor comments:

      • In Supp. Fig. 1C the MCL-1 protein is shown both to run above 37kDa (upper panel) and below 37 kDa (lower panel). Could the authors please comment on why this is the case?
      • In line 64 of the introduction the authors mention clinical trials yet do not give a citation for these trials making it hard to judge whether the content of these trials is actually related to the brain.
      • MCL-1 as well as ACSL-1 are sometimes written without the hyphen both in the text and figures.
      • Lines 92-94 and 106-108 essentially highlight the same existing knowledge gap. Maybe the content of these two paragraphs could be combined in order to avoid repetition.
      • In Fig. 1A, the authors provide a schematic for their experimental design. While the figure legend is very thorough, some of this information (like the days of collection) could also be included in the figure itself. The same is true for schematics in the following figures.
      • Fig. 2A includes a typo (analyze) but would maybe also be more suitable for the supplement figures or could even be combined with Fig. 1A as not much new content is added.
      • Regarding statistical analysis, could the authors please comment on why they did not consider one-sample t-tests suitable for the cases where control values were set at 1 (e.g. Fig. 4B, C for the relative expression).
      • In lines 247f. the authors state that "inhibition of MCL-1 leads to [...] and disassembly of the MICOS complex as well as OPA1". This sounds like OPA1 is still cleaved upon MCL-1 inhibition, which is not at all what the authors showed and further discuss. Rewording of the sentence would help in avoiding any misunderstandings.
      • In lines 210f. the authors state that "quantitative imaging increased the average and maximum volume of lipid droplets". While there is definitely a trend towards an increase for the maximum volume, the increase is in fact not statistically significant. This should be reflected in the wording.
      • In Fig. 6 the overlap between TBR2 and PAX6 is hard to judge when printed out. Including a zoom-in may make it easier to judge.
      • In Fig. 7 the color-coding is listed in the figure legend but is missing from the figure itself. If the authors could include this, as they did for the other figures, it would further improve this figure.
      • Line 238 should reference Fig. 7A, as Fig 7B does not exist.
      • In the figure legends the authors state that biological replicates were used. Were technical replicates also performed?

      Significance

      The authors make use of a wide array of imaging techniques to further elucidate non-apoptotic roles of MCL-1. The study has the potential to offer new insights into mitochondrial biology on the level of basic research rather than translational. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored. Nevertheless, the study offers additional knowledge on the role of MCL-1 in human neural stem cells, whereas previous research mostly focused on cardiomyocytes or cancer cells.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This study claims that beyond its canonical anti-apoptotic function, MCL-1 has essential non-apoptotic roles in human neurodevelopment. Pharmacologic inhibition of MCL-1 in human neural stem cells disrupts mitochondrial inner membrane architecture by destabilizing cristae and the OPA1-MICOS complex, leading to swollen mitochondria with disorganized cristae. These structural defects impair fatty acid oxidation and lipid droplet homeostasis, linking cristae integrity to metabolic competence. Independently of apoptosis or proliferation, MCL-1 inhibition selectively depletes intermediate neural progenitors, indicating a direct role in lineage progression. Overall, the work positions MCL-1 as a key regulator of mitochondrial structure-metabolism coupling that instructs neural progenitor identity and human neurogenesis.

      Overall: The study does a good job of using (in most assays) caspase inhibition (e.g., QVD treatment) to block apoptotic responses induced by MCL-1 inhibition. As a result, many of the phenotypes caused by inhibition are likely to be independent of caspase activation. As a result, this manuscript would be of interest to researchers that study the topics of the BCL-2 family and cell death signaling, mitochondrial bioenergetics and dynamics, neurodevelopment, and cellular metabolism. However, as currently presented the manuscript is only descriptive and lacks mechanistic insight.

      Major Concerns:

      1) The authors only use a single MCL-1 inhibitor and never use other non-targeting BH3-mimetics (such as venetoclax) as negative controls. This seems like a missed opportunity to demonstrate that the phenotypes observed are MCL-1 dependent.

      2) There is no mechanism proposed in this study other than reliance upon QVD as not affecting the phenotypes. As submitted, the manuscript only can speculate that these phenotypes are due to non-apoptotic roles of MCL-1 inhibition. The authors have missed an opportunity to explore MCL-1's non-apoptotic functions directly.

      Other concerns exist that weaken the impact of the study.

      1. Figure 1 should include the fact that QVD inhibition (shown in Sup Fig 2) does not obviate the phenotype induced by pharmacological inhibition of MCL-1 on mitochondrial morphology.
      2. Figure 2 would benefit from evidence that caspase inhibition does not repress the phenotype on mitochondrial cristae morphology (volume and area). Furthermore, the FIB-SEM data are very hard to appreciate as the size precludes visualization of individual mitochondria.
      3. Figure 3 reports that MIC60 and OPA1 appear to be downregulated in response to MCL-1 inhibition, but these appear to be more significant only when QVD is added. Why would the phenotype be obscured in the non-QVD setting (Fig. 2B&C)? How does MCL-1 inhibition lead to changes in MIC60/MICOS/OPA1? This seems quite preliminary at this point.
      4. The loss of MIC60 and OPA1 should repress electron transport chain function. Are such impacts observed in the cultured cells? This could be shown by assessing oxygen consumption, etc. Such data would enhance the authors' conclusion that MCL-1 inhibition leads to defects in mitochondrial physiology.
      5. In Figure 4, the differences between transcripts (qPCR data) and protein (immunoblot) data are often confusing and not well explained. Why do the authors propose that mRNA expression is decreasing whereas the protein expression is increasing (for example, CPT1)? Furthermore, it is unclear what these data mean functionally. Is this reflective of enhanced lipid oxidation or simply a response to inhibition of fatty acid oxidation? Clarification of the impact of these findings is necessary.
      6. The increase in lipid droplet number induced by MCL-1 inhibition has been previously documented, but it is unclear whether this increase is related to an inability to oxidize lipid (defective fatty acid oxidation) that leads to increases in the cellular abundance or whether this indicates that MCL-1 inhibition leads to enhanced storage. Do other inhibitors of fatty acid oxidation lead to similar increases in lipid droplet size and abundance? Does QVD inhibition affect this phenotype?
      7. For Figure 6, while these data may be very meaningful, as presented they are very hard to appreciate. Insets that show the neuronal populations would help to convey the point that the differentiation is impacted. Also, are there other methods that could confirm these observations (e.g. qPCR to show changes in differentiation)?
      8. Figure 7 is also very hard to appreciate. What is the reader to see? Can these be quantified? It seems that QVD may be rescuing in this figure; does this suggest that MCL-1 inhibition might be inducing death? All of this needs to be quantified.
      9. BCL-XL has also been implicated in affecting mitochondrial electron transport chain function (See PMID: 19255249, 21926988, 21987637). Can BCL-XL inhibitors affect any of the phenotypes associated here?
      10. Please carefully avoid using the term "MCL-1 loss" when talking about pharmacological inhibition. Only genetic ablation (e.g. knockout, silencing, etc.) should be termed loss.

      Significance

      The study advances in human cells the impacts of MCL-1 inhibition. They replicate many impacts previously observed in mouse systems and refine analyses to impacts on MICOS complex, lipid droplet storage, and neuronal differentiation. While these findings are important and would be well received by a wide audience, the study fails to provide almost any mechanistic insight into how these phenotypes are being induced. The only common theme is that blocking caspase activation in many assays fails to block the phenotype.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their time and constructive comments on our manuscript.

      Reviewer #1 recognizes the importance of the question we address (namely, the early consequences of Wilms' tumour inducing mutations on kidney development in two models for different Wilms' tumour initiating mutations) and provides useful suggestions for improvement of the manuscript.

      Reviewer #2 raises a concern regarding the novelty of the study. We appreciate these comments, which call for mainly textual changes in the revision of the manuscript to highlight the novel aspects of our study and findings and their significance.

      Reviewer #3 offers a generally positive assessment of the data, while suggesting that the work may be interpreted primarily from a developmental perspective rather than a Wilms' tumour-focused one. In the revision there is a need to better emphasize how these perspectives are closely interconnected in the context of Wilms' tumour biology.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript addresses an important gap in Wilms tumor (WT) biology: what are the earliest pathogenic events following WT driver mutation induction, and how do these early developmental trajectories differ across genotypes? The authors provide a carefully staged and comparative analysis of two WT-associated genetic contexts-conditional Wt1 loss (using lineage-specific Cre drivers targeting nephrogenic (Six2-Cre) versus stromal (Foxd1-Cre) compartments, as well as a temporally controlled Wt1CreERT2 model targeting both lineages upon tamoxifen induction) and inducible LIN28B overexpression, and relate the resulting developmental phenotypes to two CSC marker paradigms derived from patient-based studies. A major strength is the precise, time-resolved description of the earliest initiating phenotypes (E12.5 and E18.5, with additional postnatal analysis for LIN28B) and the direct side-by-side comparison of how each genotype perturbs nephrogenesis. The authors conclude that Wt1 loss (especially in the nephrogenic lineage) leads to a severe developmental block accompanied by a disturbance of lineage identity ("lineage confusion"), whereas LIN28B overexpression causes a disturbed transition between uninduced and induced nephron progenitor cell (NPC) states, producing blastemal-like regions that persist postnatally. Using immunostaining for NCAM1, SIX2, CITED1, and ALDH1A2, the authors map marker combinations during normal kidney development and across mutant contexts, and propose that tumor-initiating alterations, most clearly in the LIN28B model, and more suggestively in the Wt1CreERT2 (Wt1CE) context, promote the emergence of a CSC-like population inferred to co-express all four markers (NCAM1+SIX2+CITED1+ALDH1A2+), a state not observed in normal kidneys.

      We thank Reviewer #1 for this correct and complete summary of our manuscript. This reviewer recognizes the current gap in our understanding of the origins of Wilms' tumors and appreciates the approach we have chosen to start filling this gap using two different mouse models.

      Overall, this study provides a particularly clear direct comparison of the earliest tumor-initiating events triggered by distinct WT-relevant driver alterations. While the manuscript does not yet offer a detailed molecular mechanistic framework explaining why these two mutations produce such divergent developmental and marker-state outcomes (which would further strengthen the work), the careful comparison and the conclusions drawn from it are meaningful and make an important contribution to our understanding of the developmental processes that can lead to Wilms tumor initiation.

      We thank this reviewer for recognizing the importance of a direct comparison of the early consequences of two different Wilms' tumour mutations. We agree we do not yet provide a mechanistic framework for these differences. Although these studies are on-going, they are outside the scope of this manuscript.

      *Major comment: 1. A central and highly emphasized conclusion of this manuscript is that tumor-initiating alterations induce a CSC-like population co-expressing all four markers (NCAM1, SIX2, CITED1, and ALDH1A2), and that this state is not observed during normal kidney development. Because this "quadruple-positive" population is a key mechanistic take-home message and closely linked to the overall conceptual model, the manuscript would be substantially strengthened by a direct, same-cell demonstration of co-expression of all four markers, rather than inference from consecutive sections. The authors state that they were unable to do so due to a technical limitation, namely, antibody host-species constraints that prevent co-detection of CITED1 and ALDH1A2 within the same section. *

      We agree that not being able to show co-expression of all 4 CSC markers is a serious limitation for the interpretation of our data. The reviewer suggests the following alternatives:

      *Several feasible approaches could address this limitation for example: - Identify an alternative antibody reagent from a different host species. *

      The 'problematic' antibodies are the ones staining for ALDH1A2 and CITED1, which are both Rabbit IgG. Alternative antibodies for ALDH1A2 are all raised in rabbit, so these will not solve this problem. For CITED1 we have now identified a biotin-conjugated antibody which could be used in additional co-staining. We propose to test this antibody for the revision of this manuscript.

      *- RNAscope / smFISH for in situ single-cell co-detection. *

      We are aware of these techniques as alternatives to antibody staining. However, we have no experience with these techniques, nor do we have access to the required technologies. After discussions with collaborators with much experience in these techniques, we realized that the combination of potentially extensive optimization and costs does not make this a suitable alternative for the limited samples we have available.

      *- Single-cell RNA-seq (scRNA-seq) to test whether a bona fide quadruple-positive transcriptional state exists. *

      This could be an option but is itself a huge project and therefore outside the scope of this manuscript. We note that the known sparsity of single-cell data might still complicate the detection of each marker in individual cells, especially for lowly expressed TFs like Six2 and Cited1.

      *Overall, resolving this technical limitation would markedly increase confidence in one of the manuscript's most important claims and strengthen the proposed genotype-phenotype/CSC-marker framework.*

      As discussed above, we propose to try the biotin-conjugated CITED1 antibody.

      • *It is somewhat unexpected that the Six2-specific Wt1 deletion appears to produce a more severe phenotype than the tamoxifen-inducible Wt1CreERT2 approach, which is intended to target a broader Wt1-derived lineage (both nephrogenic and stromal). The Discussion offers several plausible, non-mutually exclusive explanations for this observation (e.g., timing, recombination efficiency/mosaicism, and the rescue contribution of "escaping" wild-type cells). It would be helpful to support at least one of these explanations experimentally. For example, the authors could quantify the extent of "escape" (percentage of non-recombined cells within the lineage) across embryos/timepoints to validate that mosaicism is indeed the cause of the milder phenotype.*

      We can address this experimentally by making use of the tdTomato Cre reporter that was included in our model, which allows us to follow the fate of mutated and non-mutated cells in the lineage. We propose to combine Six2 antibody staining with the tdTomato signal to quantify the percentage of cells that have maintained Six2 expression and are therefore likely escaping cells/nephrons.

      Minor comments 1. Please clarify whether the difference shown in Fig. 2C is statistically significant, and report n, error bars/variation, the statistical test used, and p-values (if applicable).

      These details will all be added.

      • *The authors note the presence of some SIX2+; tdTomato+ cells in Foxd1GC control kidneys. Given the expected stromal restriction of Foxd1 lineage labeling, please clarify the likely explanation and, if possible, indicate how frequent this is.*

      The reviewer here points to the important question regarding the origin and potential overlap between the stromal and nephrogenic lineages. This is a highly relevant question not only for the origin and biology of Wilms' tumours, but also for normal kidney development. Kobayashi et al. (2014) reported some contribution of the Foxd1 lineage to the Six2 lineage. Magella et al. (2018) also found some signs of this, as did a recent pre-print (Haghighitalab et al. 2026). There are even data suggesting that (part of) the renal stroma is derived from the paraxial instead of intermediate mesoderm in chicken (Guillaume et al. 2009), with some supportive data from mouse development as well (Levinson et al. 2005). The latter is especially interesting given the commonly found ectopic muscle differentiation in WT1-mutant Wilms' tumours (Miyagawa et al. 1998; Schumacher et al. 2003; Gadd et al. 2012). However, if a common, potentially Foxd1+/Six2+ double-positive progenitor exists, it will in the normal developing kidney be present before E11.5, and therefore the data in our current manuscript, or the unpublished scMultiome data, are not informative on this point. We propose to discuss this in detail in the Discussion of the manuscript, and to speculate on its relevance for Wilms' tumours.

      It is somewhat unexpected that Six2-specific Wt1 deletion appears to produce a more severe phenotype than the tamoxifen-inducible Wt1CreERT2 approach, which is intended to target a broader Wt1-derived lineage. The Discussion offers several plausible, non-mutually exclusive explanations (e.g., timing and/or recombination efficiency/mosaicism and the contribution of "escaping" cells). It would be helpful to support at least one of these explanations experimentally, for example by quantifying the extent of "escape" across embryos/timepoints and tamoxifen dosing.

      This was addressed above.

      *4. A careful proofreading pass is needed to ensure text-figure consistency, particularly for arrow annotations. For example, the Results text refers to "Fig. 1F, arrows," but arrows are not apparent in that panel. Likewise, the Results text mentions a "white filled arrow" in Fig. 2H, whereas the figure appears to show only open arrows. Please align the wording with the annotations actually shown in the figures. *

      We apologize for these errors and thank the reviewer for pointing them out. These, and all other textual and graphical errors, will be corrected in the new version of the manuscript.

      Reviewer #1 (Significance (Required)):

      Overall, this study provides a particularly clear direct comparison of the earliest tumor-initiating events triggered by distinct WT-relevant driver alterations. While the manuscript does not yet offer a detailed molecular mechanistic framework explaining why these two mutations produce such divergent developmental and marker-state outcomes (which would further strengthen the work), the careful comparison and the conclusions drawn from it are meaningful and make an important contribution to our understanding of the developmental processes that can lead to Wilms tumor initiation.

      We thank the reviewer for this comment, and would like to emphasize that this is precisely the scope we intended with the current manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *Wilms' tumor, a pediatric kidney cancer, is associated with gain or loss of activity of a number of genes including the loss of activity of the nucleic acid binding protein WT1 and gain of activity (enhanced expression at the mRNA level) of the RNA binding protein Lin28 which negatively impacts the maturation of the microRNA let-7, elevating levels of let-7 targets. Previous mouse studies have examined the impact of loss of Wt1 throughout within the nephron progenitor and interstitial cell compartments in capping mesenchyme that is thought to be the source of the tumor and of broad elevated expression in all kidney progenitors. *

      *In this manuscript, the authors have refined the loss of Wt1 to nephron or stromal progenitors and compared the phenotype to loss of Wt1 in both lineages examining cultured kidneys over a 72 hr period, in addition to uncultured kidneys examined at e18.5. A similar analysis was performed on Lin28 mutants. The analysis itself consisted of video imaging, limited immunostaining and histochemistry. *

      Reviewer #2 provides, in our opinion, a very limited overview of the contents of our manuscript. Our work presented here shows:

      • A detailed analysis of effects of Wt1 loss or activation of LIN28B in the following systems:
      • E12.5 embryonic kidneys
      • E18.5 embryonic kidneys
      • P19 postnatal kidneys (for the LIN28B model)
      • In vitro cultured kidneys.
      • Time-lapse analysis of in vitro cultured kidneys

      • In the case of the Wt1 knockout this was studied in nephrogenic, stromal, and the combination of nephrogenic and stromal lineages

      • Whereas our previous work (Berry et al. 2015) focused on different stages of nephron development, we now focus on the different lineages.
      • For the first time we study the different marker sets for Wilms' tumour cancer stem cells in their developmental context. Important take-home messages for this are:
      • The two published marker sets behave differently in the normal developing kidney, and no cell types or developmental stages exist in the normal developing kidney that express all four markers
      • In contrast, after either of the two Wilms' tumour mutations is induced, we have strong, though not yet conclusive, evidence that this event induces cells that are positive for all four CSC markers, suggesting these quadruple-positive cells could be the functional CSCs. This mutation-dependent appearance of the CSCs would be a completely different mechanism for the origin of CSCs than that believed to apply in, for instance, leukemia and colorectal cancer, where an existing cell type with stem- or progenitor-cell characteristics that already expresses the CSC markers picks up the tumour-initiating mutations and thus starts behaving as a CSC. The cascade our data suggest for the Wilms' tumour CSCs is much more complex.

      • To our knowledge this is the first direct and side-by-side comparison of the early effects of different Wilms' tumour mutations. This analysis clearly shows the differences in underlying biology for these two situations, and this can have important consequences for interpretation of patient data (which were historically almost always generated without knowing the initiating mutation) and opens the possibility of mutation-specific therapeutic possibilities and requirements. This is fundamentally different from the current patient stratification based on clinical outcome (favorable vs non-favorable histology) or very general molecular markers with clear biological consequences (like chr 1p status).

      • With respect to the mutation-dependent accumulation of CSC markers, although in both Wt1 and LIN28B models this seems to be happening, for the LIN28B model this seems to be the result of a simple developmental block, whereas for the Wt1 mutants this appears to be a lineage conversion phenotype. This is again something that has to our knowledge never been suggested for the origin of CSCs and even in the context of normal kidney development is almost unprecedented.
      • We optimize the use of the Wt1CreERT2 driver to target different lineages in the developing kidney using different timepoints for tamoxifen treatment. Not only does this have technical use, it also illustrates the complex role of Wt1 in the earliest stages of kidney development.

      Although the data presented are descriptive and do not yet provide a complete molecular mechanism, we believe they offer novel, unexpected and important insights that merit publication. We acknowledge that these aspects may not have been sufficiently clear in the original version of the manuscript, and were therefore not picked up by the reviewer. In response to Reviewer #2's comments, we propose a thorough rewrite of the Discussion of the manuscript to emphasize these aspects more.

      *While wholly qualitative and largely observational and descriptive, the limited data are of good quality and the conclusions drawn are reasonable. *

      We thank the reviewer for their compliments on the quality and conclusions of the data. While we acknowledge the reviewer's characterization of the study as qualitative and descriptive, we respectfully do not consider this to diminish its suitability for publication. We believe the dataset provides substantial and meaningful insights (definitely not limited), and we have clarified and expanded upon the novel aspects and significance of our findings as outlined above.

      *For the Wt1 study, most interesting would be in the loss of Wt1 from the NPC lineage. Clearly, there is already a significant phenotype at the time of study (E12.5) hence there is no strong insight into the earliest effects of Wt1 loss and how this might contribute to tumor formation. Quite what happens to these cells phenotypically is unclear given the limited set of markers used to look at the cells. Specific removal of Wt1 from the stromal lineage generates a milder phenotype, indicating a role for Wt1 there, but without a mechanistic analysis of the resultant products, the underlying mechanisms remain unclear. *

      As discussed in our response to reviewer #1, we agree on the lack of mechanism in the current study but emphasize here as well that although this is the topic of the on-going follow-up studies this is outside the scope of the current manuscript. We refer to the same response for our proposal for additional experiments for the revised version of the manuscript.

      *Wt1 removal from both lineages generated a phenotype less severe than removal from nephron progenitors (and previous data on "double lineage removal" with a Nestin1 cre), an indication that the genetic approach was not up to the task. *

      Respectfully, we would like to emphasize the practical challenges associated with the use of genetically modified mouse models for developmental (and other) studies. We doubt there are many Cre drivers that do exactly what they were intended to do, do only that, and at full 100% efficiency. Many Cre drivers are, when originally described, only described for the cell type they were intended for, and any other activities or limitations are missed or ignored. One could rightfully argue that this is bad science, but unfortunately this is often the reality and the starting point for many in vivo analyses. And these are only the complications regarding the behaviour of the Cre driver; this does not even touch on issues like the biological processing of tamoxifen, and the stability of already existing mRNA and protein of the gene of interest in the context of, in this case, a rapidly developing organ. Simply dismissing technical complications as 'not up to the task' is in our opinion not the way forward for studying the origin of diseases.

      What is important, and what we demonstrate, is to recognize the limitations of a system, test them and, where possible, take them into account in the interpretation of data. In this case, instead of hiding the incompleteness of the Cre activity, we actually demonstrate this using retained staining of Wt1 and discuss this in the context of the different phenotypes. We have carefully tried not to overinterpret our data, and note that this reviewer does not give any specifics where this could be affecting our manuscript.

      We would also like to stress that in the context of Wilms' tumour development the incomplete activity of this Cre driver could even increase the relevance of this model, since the early stages of Wilms' tumourigenesis in the (future) patient happen in a few mutant cells in the context of an otherwise normally developing kidney. The effect of the normal cells in our model that we speculate about could also be important in the patient; we just don't have the technical possibilities to test this yet.

      *In some sense, one could regard this work as a pilot study, looking to optimize expensive and time-consuming mouse experiments to maximize insight (ie choose optimum model, address most informative time points, decide on analytical approaches). As a stand-alone paper, the work may not significantly advance our understanding of the topic. *

      As argued above, in our opinion this does not do justice to the work we describe in our manuscript.

      For example, can simple loss of Wt1 tells us anything about Wt? Yes Wt1 is lost in a subset, but even in these there are additional genetic mutations.

      Of course even in WT1-mutant tumours there will be additional mutations found in the tumour. In fact, it has been known for a long time that WT1-mutant Wilms' tumours select for oncogenic mutations in β-catenin with a surprising preference for specific mutations affecting Ser45. However, it is clear that in these tumours the loss of WT1 is the first, rate-limiting step (Fukuzawa et al. 2004; Li et al. 2004; Zirn et al. 2006; Uschkereit et al. 2007). These β-catenin mutations are selected for in an already WT1-mutant context. If we want to understand the full biology of the WT1 mutant tumours including the β-catenin mutation, we will first need to understand the effect of only losing WT1 because that is what provides the selective pressure for the next step (oncogenic mutation in β-catenin). The work described here is an essential first step in that.

      • For Lin12, there is no significant advance beyond the studies of the Daly lab. *

      As argued above, this is not correct. The following aspects were not covered in the original paper describing this model:

      • The in vitro analysis of control and LIN28B embryonic kidneys, including the time-lapse analysis demonstrating how the phenotype develops over time
      • The expression of the Wilms' tumour cancer stem cell markers and how these change as a result of the LIN28B activation
      • The direct comparison to the Wt1 loss phenotypes, and the demonstration these different mutations lead to fundamentally different biological phenotypes despite both eventually being classified as Wilms' tumours.

        I have no useful suggestions for improvement which would require a completely different approach to the problem from the start.

      We respect this reviewer's opinion, but based on the above we do not agree and maintain a different interpretation.

      Reviewer #2 (Significance (Required)):

      *The authors set out the goal in the introduction - to obtain a better understanding of the origins of Wilm's tumor. There doesn't appear to be an insight of cancer relevant significance beyond earlier studies. *

      Our work studies the very first steps in the development of Wilms' tumours. It will never be possible to study this in the (future) patient as these happen around wk 8-10 of pregnancy. By instead analyzing this in mouse models we show fundamental biological differences between different Wilms' tumour inducing mutations, which is certainly relevant for patients and for the interpretation of patient data (or rather the difficulties with interpreting patient data if the initiating mutation or tumour class is not known). Moreover, the data provide new insights into the Wilms' tumour cancer stem cells, a preferred target for any therapy, and suggest the combination of all four known markers might be required to identify and study the true WT CSC. In our opinion such findings provide extremely relevant insights for the field.

      *To a readership now/too used to analysis at genome scales (genomic, transcription), this study might appear modest. *

      While we agree that genome-wide approaches can provide valuable insights, this doesn't mean that work that doesn't use them cannot provide important insights nor does it mean that every piece of work that does use them provides any new insights. We respectfully emphasize that the merit of a study should be assessed based on the data presented and their interpretation, rather than on the techniques that were used to obtain them.

      The target audience is unclear.

      Our target audience for this manuscript is everybody who is interested, for whatever reason, in the biology of Wilms' tumours.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *Wilms tumor arises from disrupted kidney development. Progenitor-like populations and cancer stem cell (CSC) fractions have been described in patient tumors, but how specific mutations alter embryonic programs to generate these states remains unresolved. *

      *Pop et al. model genotype-phenotype relationships during kidney embryogenesis. Using Six2- and Foxd1-driven Cre lines, they test the effects of Wt1 loss-of-function and Lin28b gain-of-function in nephron and stromal progenitors. Through explant imaging, histology, and immunofluorescence, they define mutation-specific effects on ureteric branching, cap mesenchyme organization, stromal composition, and nephron differentiation. *

      *Lineage-restricted Wt1 deletion produces distinct outcomes depending on whether nephron progenitors, stromal progenitors, or both are targeted. Lin28b overexpression causes delayed nephrogenesis and lobular organization resembling human Wilms tumor morphology, with expansion of blastemal-like populations. *

      This is a correct summary of this part of our data.

      These genetic removals of Wt1 and overexpression of Lin28b are useful for the field in understanding where and how Wt1 functions and whether Lin28b could be a model for Wilms' tumor.

      We agree that our data on Wt1 loss focusses on the role of Wt1 in normal kidney development, how its loss disrupts normal kidney development and how this could be important for Wilms' tumourigenesis. This includes, but goes beyond, the function of Wt1: it informs on the biology of WT1-mutant Wilms' tumours.

      There is in our mind no doubt whether the LIN28B model is a model for Wilms' tumours. Activating mutations in LIN28B are found in human patient tumours, and already in the original publication of this model (Urbach et al. 2014) it was convincingly shown that the phenotype in the kidneys after less than 3 weeks (when the animals have to be culled for animal welfare reasons) represents early stages of Wilms' tumours. Our data presented here confirms this, and extends it with respect to the behavior of the CSC markers and the comparison to the Wt1 loss phenotypes.

      Whether the use of previously defined markers NCAM and Aldefluor serve the authors well or is a distraction is to be determined but it is unclear how useful these have been for understanding WT biology thus far. The authors describe these in the developing kidney in explants and in vivo.

      *Overall, the data support the view that distinct mutations generate different forms of lineage derailment but it is unclear how this links to Wilms tumor. It is better suited to describe the role of an interesting protein Wt1 in kidney development and lineages therein. Connecting it to tumor biology would require further scrutiny of tumors. *

      Since CSCs are, according to the cancer stem cell model, the cells in a tumour that should preferentially be targeted, the exact identification of the CSC markers is directly important for the treatment of tumours. Our data analyzes two different sets of CSC markers. We show that these markers label non-overlapping cell types in the normal kidney, but that after mutation induction their expression changes and they potentially become co-expressed in a single cell type (see our response to Reviewer #1 for more details on this). Identifying the developmental origins of CSCs in a tumour that is the direct result of disturbance of normal embryonic development (Hohenstein et al. 2015; Li et al. 2021) can be used as an entry point into understanding the biology of these tumours. Based on this we argue that although our analysis is on embryonic kidneys, its implications are highly relevant for the actual tumours and their treatment. We propose to further emphasize this in the Introduction and Discussion of our manuscript.

      *The study shows that removal of Wt1 in the stromal compartment has distinct phenotypes, which could be important for Wilms tumor biology as this is a poorly understood part of this tumor. *

      As already discussed in our response to Reviewer #1, we agree this is a potentially important and poorly understood part of Wilms' tumours, directly for WT1 mutant tumours which are stromal-predominant, but potentially also for other tumours. We propose to further address this in the Discussion of the manuscript.


      *Major comments: *

        • This manuscript uses elegant genetics to scrutinize the role of Wt1 and Lin28b. These stand out as difficult to conduct experiments and are of high value. *

      We thank this Reviewer for their appreciation of the design, challenges and value of our data.

      In contrast, the section on ALDH1A2 and ALDEFLUOR activity is less integrated with the developmental framework.

      We discussed our reasons for focusing on the normal developmental context of the cells expressing the CSC markers in the previous section. Since the originally described CSC marker was activity in the AldeFluor enzymatic assay (Pode-Shakked et al. 2013), which we could not use on sections or kidney rudiments, we had to conclusively identify which ALDH isozyme is responsible for this signal in this context. There is much inconsistency about this in the literature, and whichever isozyme is important in these tumours might not be the causative factor in other tumours where AldeFluor labels the CSCs. We therefore use previously published microarray data from the group that originally identified the NCAM1/AldeFluor combination as Wilms' tumour CSCs to identify ALDH1A2 as the culprit in this cancer type. With this knowledge we could move our analyses to antibodies, allowing co-staining with the other markers. Note that if the signal in these CSCs had been the result of ALDH1A1 or ALDH1A3, which we show are expressed in the developing ureteric bud, the implications of this for the biology of the tumours would be totally different. We propose to discuss this aspect and its importance in more detail in the revised manuscript.

      *Much is unclear here, e.g., antibody validation, rationale for performing these assays in explants rather than in vivo tissue, and the shift in Aldh1a2 staining pattern between E12.5 and E18.5, including reported nuclear localization.*

      We need to correct the reviewer on this point: part of our data uses in vivo samples (E12.5 and E18.5) in addition to cultured kidney rudiments. We will clarify which technique we use in the legends of the figures. We prefer to use this combination of techniques for several reasons: 1) the additional 3D information obtained from kidney rudiments can help with identifying specific developmental stages in the developing kidney; 2) due to the different fixation more antibodies work reliably in cultured rudiments than on paraffin frozen sections; 3) this is an important extra factor in the validation of antibodies; and 4) the possibility of culturing kidney rudiments on a time-lapse imaging system allows us to study phenotypes over time (this also greatly reduces the number of animals we need to study multiple timepoints in a developing system, an important aspect for the 3Rs). A good example of this is the time-lapse data shown for the nephrogenic Wt1 knockout. The extreme outwards migration of the mutant cells (we show this using the tdTomato reporter) could only be identified in time-lapse experiments, but is fully consistent with the corresponding E12.5 and E18.5 in vivo sections.

      We have no explanation for the shift to nuclear localization for ALDH1A2. We are not aware of any other publications showing this. We cannot rule out that this is a technical artifact, but based on all other expression data obtained with this antibody and their consistency with other publications we don't think this is very likely.

      *It is unclear how the manuscript is strengthened by this component. NCAM1 is referenced in the context of Wilms tumor CSCs, but unlike the rest of the manuscript which is mechanistic, it is unclear whether NCAM1 represents a mechanistic node in tumor initiation or merely a surface marker used for cell isolation? If NCAM1 functions just as a proxy for a progenitor-like state rather than a driver of tumor biology surely Wilms tumors will be full of progenitors or blastemal cells and many surface markers. It is unclear what strong evidence shows NCAM1 to be useful, this distinction should be stated. *

      Cancer stem cells are defined based on functional characteristics, i.e. the capability of reconstituting a complete tumour with all of its complexity after transplantation in immune-compromised mice. The markers are usually, indeed, merely proxy markers for a specific cell type in the tumour with this functional capacity. The same can be said in this case for the AldeFluor activity: it is used as a CSC marker for many cancer types but we are not aware of any data on a functional role for this pathway in any of them. It would be a really interesting experiment to combine our models with an additional conditional knockout for Ncam1 or Aldh1a2 to see if the phenotype we describe here changes. The genetics of such an experiment with so many alleles are however horrendous, would come with an enormous surplus of animals and would take too long for the average project.

      The developmental framework presented argues that mutation-specific lineage derailment underlies tumor formation. Marker identity alone does not define pathogenesis. Perhaps reorganize this section to align it with the lineage-confusion model or removing it altogether would make the manuscript punchier?


      We propose to rewrite these parts to make this more clear.

      • *The manuscript is highly focused on the nephrogenic compartment yet removes Wt1 from the stroma as part of one of the main lines of experiments. On several occasions, stromal changes are described qualitatively but using quantitative terms. As such, the manuscript currently comes across as having a bit of a black box where we cannot see the stroma beyond H&E stains. Could there be additional antibody stains for stromal markers e.g., Pdgfra, Pdgfrb, or Meis1 to better visualize this compartment and perhaps enable quantification of changes?*

      We agree this lack of additional stromal markers is a limitation of the current manuscript. Our reason for so far not including these was our doubts about the usefulness and relevance for the complete renal stroma of many commonly used markers. The scarcity of detailed studies on the developing stroma was a big part of this doubt. Some preliminary tests show that Meis1 is not exclusively found in the developing stroma of the mouse kidney but is also expressed in early stages of the nephrogenic lineage, and is therefore not a good marker for this purpose. Pdgfra and Pdgfrb however seem to be expressed throughout the complete stroma and not in the other lineages. __We propose to analyze these two additional markers for the revised manuscript. __

      *Minor comments: *

      *Page 4, Lines 89-95: Remove the repeated sentence beginning "Although best known as a transcription factor...". *

      *Page 8, Line 164: Arrows referenced in Figure 1F are not visible. *

      *Page 8, Lines 164-166: The sentence may refer to Figure 1G; this figure is not otherwise cited. *

      *Page 18, Lines 413-414: (Pode-Shakked et al., 2013) is cited twice. *

      *Figure 2C: Error bars are missing. Indicate number of biological replicates. *

      *Gene nomenclature should be consistent throughout the manuscript. A mouse protein/gene is Six2/Six2. *

      *Use precise language when referring to protein detection rather than "expression." *

      *Standardize corticomedullary orientation across figures. *

      *Page 7, Lines 160-161: Provide immunostaining supporting WT1+/Tdtomato− stromal identity. Co-staining with Foxd1 would clarify lineage assignment. *

      *At E18.5 in the Six2-driven Wt1 mutant, WT1 signal is absent despite earlier stromal WT1+ cells. Clarify the fate of these cells. *

      *Comment on the lower recombination efficiency observed in Wt1CE at E11.5. *

      *Page 14, Lines 321-322: Determine how long CITED1 persists in WTCE mice. Co-staining with later differentiation markers would clarify whether progenitor retention coexists with nephron maturation. *

      Page 15, Lines 352-353: Clarify whether the sentence describing blastemal-like regions should reference Figure 5D.

      We thank the reviewer for these corrections and other minor comments. We will address them in the revised manuscript. With respect to the remark regarding the gene nomenclature, until recently we were also under the assumption that mouse proteins only have the first character capitalized. However, to our surprise we recently realized the official mouse nomenclature states that the protein (but not the gene) is in fact in all capitals. We refer for this to section 1.5.2 at https://www.informatics.jax.org/mgihome/nomen/gene.shtml.

      References.

      Berry RL, Ozdemir DD, Aronow B, Lindstrom NO, Dudnakova T, Thornburn A, Perry P, Baldock R, Armit C, Joshi A et al. 2015. Deducing the stage of origin of Wilms' tumours from a developmental series of Wt1-mutant mice. Dis Model Mech 8: 903-917.

      Fukuzawa R, Breslow NE, Morison IM, Dwyer P, Kusafuka T, Kobayashi Y, Becroft DM, Beckwith JB, Perlman EJ, Reeve AE. 2004. Epigenetic differences between Wilms' tumours in white and east-Asian children. Lancet 363: 446-451.

      Gadd S, Beezhold P, Jennings L, George D, Leuer K, Huang CC, Huff V, Tognon C, Sorensen PH, Triche T et al. 2012. Mediators of receptor tyrosine kinase activation in infantile fibrosarcoma: a Children's Oncology Group study. J Pathol 228: 119-130.

      Guillaume R, Bressan M, Herzlinger D. 2009. Paraxial mesoderm contributes stromal cells to the developing kidney. Dev Biol 329: 169-175.

      Haghighitalab A, Nosrati F, Dehghani-Ghobadi Z, Sayed M, Ahn C, Hu Y-C, Chung E, Lim H-W, Park J-S. 2026. A knock-in Six2Cre line reveals transient interstitial potential in nephron progenitors. bioRxiv: 2026.2002.2004.703893.

      Hohenstein P, Pritchard-Jones K, Charlton J. 2015. The yin and yang of kidney development and Wilms' tumors. Genes Dev 29: 467-482.

      Kobayashi A, Mugford JW, Krautzberger AM, Naiman N, Liao J, McMahon AP. 2014. Identification of a Multipotent Self-Renewing Stromal Progenitor Population during Mammalian Kidney Organogenesis. Stem Cell Reports 3: 650-662.

      Krishna A, Meynert A, Dolt KS, Kelder M, Mesropian A, Ewing A, Brouwers C, Claassens JW, Linssen MM, Sheraz S et al. 2026. Mutational scanning reveals oncogenic CTNNB1 mutations have diverse effects on signaling. Nat Genet 58: 366-375.

      Levinson RS, Batourina E, Choi C, Vorontchikhina M, Kitajewski J, Mendelsohn CL. 2005. Foxd1-dependent signals control cellularity in the renal capsule, a structure required for normal renal development. Development 132: 529-539.

      Li CM, Kim CE, Margolin AA, Guo M, Zhu J, Mason JM, Hensle TW, Murty VV, Grundy PE, Fearon ER et al. 2004. CTNNB1 mutations and overexpression of Wnt/beta-catenin target genes in WT1-mutant Wilms' tumors. Am J Pathol 165: 1943-1953.

      Li H, Hohenstein P, Kuure S. 2021. Embryonic Kidney Development, Stem Cells and the Origin of Wilms Tumor. Genes (Basel) 12.

      Magella B, Adam M, Potter AS, Venkatasubramanian M, Chetal K, Hay SB, Salomonis N, Potter SS. 2018. Cross-platform single cell analysis of kidney development shows stromal cells express Gdnf. Dev Biol 434: 36-47.

      Miyagawa K, Kent J, Moore A, Charlieu JP, Little MH, Williamson KA, Kelsey A, Brown KW, Hassam S, Briner J et al. 1998. Loss of WT1 function leads to ectopic myogenesis in Wilms' tumour. Nat Genet 18: 15-17.

      Pode-Shakked N, Shukrun R, Mark-Danieli M, Tsvetkov P, Bahar S, Pri-Chen S, Goldstein RS, Rom-Gross E, Mor Y, Fridman E et al. 2013. The isolation and characterization of renal cancer initiating cells from human Wilms' tumour xenografts unveils new therapeutic targets. EMBO Mol Med 5: 18-37.

      Schumacher V, Schuhen S, Sonner S, Weirich A, Leuschner I, Harms D, Licht J, Roberts S, Royer-Pokora B. 2003. Two molecular subgroups of Wilms' tumors with or without WT1 mutations. Clin Cancer Res 9: 2005-2014.

      Urbach A, Yermalovich A, Zhang J, Spina CS, Zhu H, Perez-Atayde AR, Shukrun R, Charlton J, Sebire N, Mifsud W et al. 2014. Lin28 sustains early renal progenitors and induces Wilms tumor. Genes Dev 28: 971-982.

      Uschkereit C, Perez N, de Torres C, Kuff M, Mora J, Royer-Pokora B. 2007. Different CTNNB1 mutations as molecular genetic proof for the independent origin of four Wilms tumours in a patient with a novel germ line WT1 mutation. J Med Genet 44: 393-396.

      Zirn B, Samans B, Wittmann S, Pietsch T, Leuschner I, Graf N, Gessler M. 2006. Target genes of the WNT/beta-catenin pathway in Wilms tumors. Genes Chromosomes Cancer 45: 565-574.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Wilms tumor arises from disrupted kidney development. Progenitor-like populations and cancer stem cell (CSC) fractions have been described in patient tumors, but how specific mutations alter embryonic programs to generate these states remains unresolved. Pop et al. model genotype-phenotype relationships during kidney embryogenesis. Using Six2- and Foxd1-driven Cre lines, they test the effects of Wt1 loss-of-function and Lin28b gain-of-function in nephron and stromal progenitors. Through explant imaging, histology, and immunofluorescence, they define mutation-specific effects on ureteric branching, cap mesenchyme organization, stromal composition, and nephron differentiation. Lineage-restricted Wt1 deletion produces distinct outcomes depending on whether nephron progenitors, stromal progenitors, or both are targeted. Lin28b overexpression causes delayed nephrogenesis and lobular organization resembling human Wilms tumor morphology, with expansion of blastemal-like populations. These genetic removals of Wt1 and overexpression of Lin28b are useful for the field in understanding where and how Wt1 functions and whether Lin28b could be a model for Wilms' tumor. Whether the use of previously defined markers NCAM and Aldefluor serve the authors well or is a distraction is to be determined but it is unclear how useful these have been for understanding WT biology thus far. The authors describe these in the developing kidney in explants and in vivo.

      Overall, the data support the view that distinct mutations generate different forms of lineage derailment but it is unclear how this links to Wilms tumor. It is better suited to describe the role of an interesting protein Wt1 in kidney development and lineages therein. Connecting it to tumor biology would require further scrutiny of tumors. The study shows that removal of Wt1 in the stromal compartment has distinct phenotypes, which could be important for Wilms tumor biology as this is a poorly understood part of this tumor. My recommendation is that the manuscript would be considered for a major revision where it is more focused on kidney biology.

      Major comments:

      1. This manuscript uses elegant genetics to scrutinize the role of Wt1 and Lin28b. These stand out as difficult to conduct experiments and are of high value. In contrast, the section on ALDH1A2 and ALDEFLUOR activity is less integrated with the developmental framework. Much is unclear here, e.g., antibody validation, rationale for performing these assays in explants rather than in vivo tissue, and the shift in Aldh1a2 staining pattern between E12.5 and E18.5, including reported nuclear localization. It is unclear how the manuscript is strengthened by this component. NCAM1 is referenced in the context of Wilms tumor CSCs, but unlike the rest of the manuscript which is mechanistic, it is unclear whether NCAM1 represents a mechanistic node in tumor initiation or merely a surface marker used for cell isolation? If NCAM1 functions just as a proxy for a progenitor-like state rather than a driver of tumor biology surely Wilms tumors will be full of progenitors or blastemal cells and many surface markers. It is unclear what strong evidence shows NCAM1 to be useful, this distinction should be stated. The developmental framework presented argues that mutation-specific lineage derailment underlies tumor formation. Marker identity alone does not define pathogenesis. Perhaps reorganize this section to align it with the lineage-confusion model or removing it altogether would make the manuscript punchier?
      2. The manuscript is highly focused on the nephrogenic compartment yet removes Wt1 from the stroma as part of one of the main lines of experiments. On several occasions, stromal changes are described qualitatively but using quantitative terms. As such, the manuscript currently comes across as having a bit of a black box where we cannot see the stroma beyond H&E stains. Could there be additional antibody stains for stromal markers e.g., Pdgfra, Pdgfrb, or Meis1 to better visualize this compartment and perhaps enable quantification of changes?

      Minor comments:

      Page 4, Lines 89-95: Remove the repeated sentence beginning "Although best known as a transcription factor...".

      Page 8, Line 164: Arrows referenced in Figure 1F are not visible.

      Page 8, Lines 164-166: The sentence may refer to Figure 1G; this figure is not otherwise cited.

      Page 18, Lines 413-414: (Pode-Shakked et al., 2013) is cited twice.

      Figure 2C: Error bars are missing. Indicate number of biological replicates.

      Gene nomenclature should be consistent throughout the manuscript. A mouse protein/gene is Six2/Six2.

      Use precise language when referring to protein detection rather than "expression."

      Standardize corticomedullary orientation across figures.

      Page 7, Lines 160-161: Provide immunostaining supporting WT1+/Tdtomato− stromal identity. Co-staining with Foxd1 would clarify lineage assignment.

      At E18.5 in the Six2-driven Wt1 mutant, WT1 signal is absent despite earlier stromal WT1+ cells. Clarify the fate of these cells.

      Comment on the lower recombination efficiency observed in Wt1CE at E11.5.

      Page 14, Lines 321-322: Determine how long CITED1 persists in WTCE mice. Co-staining with later differentiation markers would clarify whether progenitor retention coexists with nephron maturation.

      Page 15, Lines 352-353: Clarify whether the sentence describing blastemal-like regions should reference Figure 5D.

      Significance

      Overall, the data support the view that distinct mutations generate different forms of lineage derailment but it is unclear how this links to Wilms tumor. It is better suited to describe the role of an interesting protein Wt1 in kidney development and lineages therein. Connecting it to tumor biology would require further scrutiny of tumors. The study shows that removal of Wt1 in the stromal compartment has distinct phenotypes, which could be important for Wilms tumor biology as this is a poorly understood part of this tumor.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Wilms tumor, a pediatric kidney cancer, is associated with gain or loss of activity of a number of genes, including loss of activity of the nucleic acid binding protein WT1 and gain of activity (enhanced expression at the mRNA level) of the RNA binding protein Lin28, which negatively impacts the maturation of the microRNA let-7, elevating levels of let-7 targets. Previous mouse studies have examined the impact of loss of Wt1 within the nephron progenitor and interstitial cell compartments in the capping mesenchyme that is thought to be the source of the tumor, and of broad elevated expression in all kidney progenitors.

      In this manuscript, the authors have refined the loss of Wt1 to nephron or stromal progenitors and compared the phenotype to loss of Wt1 in both lineages examining cultured kidneys over a 72 hr period, in addition to uncultured kidneys examined at e18.5. A similar analysis was performed on Lin28 mutants. The analysis itself consisted of video imaging, limited immunostaining and histochemistry.

      While wholly qualitative and largely observational and descriptive, the limited data are of good quality and the conclusions drawn are reasonable. For the Wt1 study, most interesting would be in the loss of Wt1 from the NPC lineage. Clearly, there is already a significant phenotype at the time of study (E12.5) hence there is no strong insight into the earliest effects of Wt1 loss and how this might contribute to tumor formation. Quite what happens to these cells phenotypically is unclear given the limited set of markers used to look at the cells. Specific removal of Wt1 from the stromal lineage generates a milder phenotype, indicating a role for Wt1 there, but without a mechanistic analysis of the resultant products, the underlying mechanisms remain unclear. Wt1 removal from both lineages generated a phenotype less severe than removal from nephron progenitors (and previous data on "double lineage removal" with a Nestin1 cre), an indication that the genetic approach was not up to the task.

      In some sense, one could regard this work as a pilot study, looking to optimize expensive and time-consuming mouse experiments to maximize insight (i.e., choose the optimum model, address the most informative time points, decide on analytical approaches). As a stand-alone paper, the work may not significantly advance our understanding of the topic. For example, can simple loss of Wt1 tell us anything about WT? Yes, Wt1 is lost in a subset, but even in these there are additional genetic mutations. For Lin28, there is no significant advance beyond the studies of the Daly lab.

      I have no useful suggestions for improvement, as any would require a completely different approach to the problem from the start.

      Significance

      The authors set out the goal in the introduction: to obtain a better understanding of the origins of Wilms tumor. There doesn't appear to be an insight of cancer-relevant significance beyond earlier studies. To a readership now/too used to analysis at genome scales (genomic, transcription), this study might appear modest. The target audience is unclear.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript addresses an important gap in Wilms tumor (WT) biology: what are the earliest pathogenic events following WT driver mutation induction, and how do these early developmental trajectories differ across genotypes? The authors provide a carefully staged and comparative analysis of two WT-associated genetic contexts: conditional Wt1 loss (using lineage-specific Cre drivers targeting nephrogenic (Six2-Cre) versus stromal (Foxd1-Cre) compartments, as well as a temporally controlled Wt1CreERT2 model targeting both lineages upon tamoxifen induction) and inducible LIN28B overexpression. They relate the resulting developmental phenotypes to two CSC marker paradigms derived from patient-based studies.

      A major strength is the precise, time-resolved description of the earliest initiating phenotypes (E12.5 and E18.5, with additional postnatal analysis for LIN28B) and the direct side-by-side comparison of how each genotype perturbs nephrogenesis. The authors conclude that Wt1 loss (especially in the nephrogenic lineage) leads to a severe developmental block accompanied by a disturbance of lineage identity ("lineage confusion"), whereas LIN28B overexpression causes a disturbed transition between uninduced and induced nephron progenitor cell (NPC) states, producing blastemal-like regions that persist postnatally. Using immunostaining for NCAM1, SIX2, CITED1, and ALDH1A2, the authors map marker combinations during normal kidney development and across mutant contexts, and propose that tumor-initiating alterations, most clearly in the LIN28B model, and more suggestively in the Wt1CreERT2 (Wt1CE) context, promote the emergence of a CSC-like population inferred to co-express all four markers (NCAM1+SIX2+CITED1+ALDH1A2+), a state not observed in normal kidneys.

      Overall, this study provides a particularly clear direct comparison of the earliest tumor-initiating events triggered by distinct WT-relevant driver alterations. While the manuscript does not yet offer a detailed molecular mechanistic framework explaining why these two mutations produce such divergent developmental and marker-state outcomes (which would further strengthen the work), the careful comparison and the conclusions drawn from it are meaningful and make an important contribution to our understanding of the developmental processes that can lead to Wilms tumor initiation.

      Major comment:

      1. A central and highly emphasized conclusion of this manuscript is that tumor-initiating alterations induce a CSC-like population co-expressing all four markers (NCAM1, SIX2, CITED1, and ALDH1A2), and that this state is not observed during normal kidney development. Because this "quadruple-positive" population is a key mechanistic take-home message and closely linked to the overall conceptual model, the manuscript would be substantially strengthened by a direct, same-cell demonstration of co-expression of all four markers, rather than inference from consecutive sections. The authors state that they were unable to do so due to a technical limitation, namely, antibody host-species constraints that prevent co-detection of CITED1 and ALDH1A2 within the same section. Several feasible approaches could address this limitation, for example:
        • Identify an alternative antibody reagent from a different host species.
        • RNAscope / smFISH for in situ single-cell co-detection.
        • Single-cell RNA-seq (scRNA-seq) to test whether a bona fide quadruple-positive transcriptional state exists.

      Overall, resolving this technical limitation would markedly increase confidence in one of the manuscript's most important claims and strengthen the proposed genotype-phenotype/CSC-marker framework.

      2. It is somewhat unexpected that the Six2-specific Wt1 deletion appears to produce a more severe phenotype than the tamoxifen-inducible Wt1CreERT2 approach, which is intended to target a broader Wt1-derived lineage (both nephrogenic and stromal). The Discussion offers several plausible, non-mutually exclusive explanations for this observation (e.g., timing, recombination efficiency/mosaicism, and the rescue contribution of "escaping" wild-type cells). It would be helpful to support at least one of these explanations experimentally. For example, the authors could quantify the extent of "escape" (percentage of non-recombined cells within the lineage) across embryos/timepoints to validate that mosaicism is indeed the cause of the milder phenotype.

      Minor comments

      1. Please clarify whether the difference shown in Fig. 2C is statistically significant, and report n, error bars/variation, the statistical test used, and p-values (if applicable).
      2. The authors note the presence of some SIX2+; tdTomato+ cells in Foxd1GC control kidneys. Given the expected stromal restriction of Foxd1 lineage labeling, please clarify the likely explanation and, if possible, indicate how frequent this is.
      3. It is somewhat unexpected that Six2-specific Wt1 deletion appears to produce a more severe phenotype than the tamoxifen-inducible Wt1CreERT2 approach, which is intended to target a broader Wt1-derived lineage. The Discussion offers several plausible, non-mutually exclusive explanations (e.g., timing and/or recombination efficiency/mosaicism and the contribution of "escaping" cells). It would be helpful to support at least one of these explanations experimentally, for example by quantifying the extent of "escape" across embryos/timepoints and tamoxifen dosing.
      4. A careful proofreading pass is needed to ensure text-figure consistency, particularly for arrow annotations. For example, the Results text refers to "Fig. 1F, arrows," but arrows are not apparent in that panel. Likewise, the Results text mentions a "white filled arrow" in Fig. 2H, whereas the figure appears to show only open arrows. Please align the wording with the annotations actually shown in the figures.

      Significance

      Overall, this study provides a particularly clear direct comparison of the earliest tumor-initiating events triggered by distinct WT-relevant driver alterations. While the manuscript does not yet offer a detailed molecular mechanistic framework explaining why these two mutations produce such divergent developmental and marker-state outcomes (which would further strengthen the work), the careful comparison and the conclusions drawn from it are meaningful and make an important contribution to our understanding of the developmental processes that can lead to Wilms tumor initiation.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Building a functional nervous system with the correct number of correctly specified and wired pre- and post-synaptic neurons requires a tight control of proliferation and cell survival during development. Focusing on the latter, Naser Alshami and colleagues in the laboratory of Alicia Hidalgo investigated the role of three neurotrophins and their respective Toll receptors in regulating neuronal survival in the visual system of Drosophila. A combination of expression and genetic loss-of-function and gain-of-function analyses revealed a novel role for the neurotrophin DNT2 (spaetzle-5) in controlling survival, dendritic spine formation, and columnar axonal innervation of lamina neuron L1 expressing the receptor Toll-2.

      Major comments:

      • DNT-2 (spz-5) is described to be expressed in non-neuronal cells in the retina. However, the clustered arrangement of cells in cross-sections of the retina during pupal development (72h DNT-2 bottom panel) suggests that photoreceptors could be labeled by the reporter as well. This should be assessed by co-labeling experiments with Elav and/or mAb24B10. This is important, as photoreceptors may represent a presynaptic additional source of this neurotrophin to lamina neurons L1 (see below).
      • The rightmost panel of Figure 2A, first row, Toll-2, does not resemble a 24h APF optic lobe, and would need to be replaced by a correctly staged younger sample. This is relevant, as the image at 48 h APF shows extensive staining in a cluster at the medulla edge, suggesting that expression in medulla neurons may be more abundant at younger stages. The expression of Toll-2 in L3 neurons is not unambiguously demonstrated at the angle shown in Figure 1E first panel. Normally, MCFO clones should be able to reveal single cells instead of L1 and L3 together.
      • Related to this part of the study, the authors describe that Toll-2 is expressed by connecting neurons (L1, Mi1, Tm3...). This is solely based on RNAseq data. In the next paragraph, the authors describe that Toll-2 is expressed in L1 and DNT-2 in Mi1 (or related) medulla neurons. However, this is rather vague, as it is not clear what "related" medulla neurons refers to. As expression in Mi1 neurons is at the heart of this study, it would be essential to verify that Mi1 is included in the Toll-2-Gal4 and DNT-2 expression pattern. Indeed, in Figure 1A (right panel, row 2, 48 h APF) expression of DNT-2 in this neuron subtype may be visible. This should be verified by markers and single cell labeling such as MCFO. Finally, the authors mention that the DNT-2 ligand could "reach" the Toll-2 receptor. Here some more precision would be helpful, as it is not clear, how far neurotrophins would need to travel, and synaptic contacts may form too late for mediating survival. Furthermore, there seems to be also the possibility that there is a substantial contribution of autocrine signaling, if L1 neurons indeed express both the ligand and the receptor, which would need to be presented more clearly.
      • The next set of experiments uses over-expression of DNT-2 and DNT-3 in Toll-8 expressing neurons and assessment of survival rates in lamina neurons. Survival is determined by evaluating Dcp-1 signals rather than cell numbers. During normal larval development, 7 future lamina neurons per column are formed, two of which are eliminated by apoptosis. Thus, it would be important to verify whether it is these neurons that are surviving, using for instance Elav labeling; moreover, lamina neuron specific markers (see Xu et al., 2024) may enable assessment of whether the surviving neurons adopt specific lamina neuron identities in excess of the normal 5. It may also need to be considered that both neurotrophins could affect proliferation, if the numbers do not add up (as drivers may be active already during the third instar larval stage, this is possible).
      • Gain-of-function experiments are followed by loss-of-function experiments, assessing the impact of loss of DNTs in entire animal mutants on lamina neuron survival. As for the gain-of-function studies, it would be important to know which lamina neuron subtypes are dying in the absence of DNT-2 and DNT-3. Indeed, Figure 4C reveals a substantial loss of L1 and L3 neurons. The over-expression experiments of the ligands using the Toll-2 driver are complicating matters, considering that the driver line is described to be a mutant in Toll-2, and the ligand is expressed in the same neuron, creating an autocrine signaling situation. Moreover, since Toll-2-Gal4 is used to drive expression in all lamina neurons L1 and L3, and entire optic lobes are assessed, the numbers should be around 1600 in wild type, but are not, suggesting that there might have been some cell death in controls. If there are more lamina neurons in the over-expression situation, it would mean that additional neurons are adopting L1 and L3 identity and thus reveal an additional direct role in cell fate determination by DNT-2 and Toll-2 or an indirect role considering the intricate interactions of lamina neurons to adopt specific identities via N signaling (see Xu et al. 2024). This would need to be assessed in more detail using for instance independent drivers or markers for lamina neuron subtypes.
      • The statement that the remaining signal is coming from macrophages would need to be supported by additional markers or described more carefully (as it could be glia).
      • Page 9. The study reports a new interaction between DNT-2 and Toll-2 as its possible receptor. This is in part based on whole animal lethality, without providing quantification. This should be added.
      • Interestingly, knock-down of Toll-2 in all neurons led to increased cell death of lamina neurons, which cannot be overcome by over-expression of DNT-2 [FL]. It would be interesting to assess whether cell death is even higher by providing a statistical test comparing the knock-down of Toll-2 with the knock-down of Toll-2 and the simultaneous over-expression of DNT-2 in Figure 4B.
      • Page 11 (also page 4). The authors describe that "connectivity of lamina neurons to medulla neurons takes place at 30-48 h APF and between medulla and lobula complex at 60-70h APF, and that expression of synaptic markers starts at 24h APF". It is not clear how the authors define connectivity (here and throughout the manuscript). While axonal and dendritic projections are established during the first half of pupal development, functional synapses are thought to be solely established from mid-pupal development onwards (early expression of synaptic markers may not be indicative of synapse formation as early as 24h APF). Moreover, there is no solid evidence yet that connections form sequentially between the lamina and medulla, and medulla and lobula and lobula plate neurons. This description would need to be adjusted. A reference for the occurrence of spontaneous activity (prior to synapse formation) should be provided: Akin et al. 2019. This is important for the subsequent interpretation that cell death overlaps with "connectivity": if this term is used to refer to synapse formation, this may not hold, and would need adjustment.
      • Next, the study assesses the impact of loss of Toll-2 in surviving lamina neurons L1 on columnar axonal branching. The authors observed that loss of Toll-2 or over-expression of DNT-2 leads to extension of axonal branches into neighboring columns in the medulla neuropil layer M1. Here, it would be important to assess whether this phenotype is really due to the loss of Toll-2 or the fact that neighboring L1 neurons are missing (even just one neighbor missing may be enough to create this defect as contact-dependent repulsion may no longer work, Millard et al. 2007). Toll-2 MARCM clones would be able to address this point. Without further experiments, the conclusions need adjustment. Finally, the authors conclude that this is an indication for a role in connectivity, but indeed, it is not clear whether these projections lead to abnormal connections, as the thin projections may not form synapses. This statement would need to be adjusted or supported by including synaptic markers, such as Brp, or TransTango experiments.
      • The study concludes with the statement that over-expression of DNT-2 FL and knock-down of Toll-2 alter dendritic spine formation of lamina neurons L1 in the lamina: spine formation increases with over-expression and decreases with Toll-2 knock-down. This is in contrast to the axonal projections, where additional extensions are observed in both genotypes. This should be discussed, as the opposite effects may be in line with an indirect effect through loss of neighbors in medulla columns (in contrast to the lamina).
      • The authors conclude in the discussion (page 15 and 17) that DNT-2 by medulla neurons Mi1 or related medulla neurons is key to promoting survival of lamina neurons L1, expressing the Toll-2 receptor. However, the manuscript does not provide any direct evidence that it is indeed Mi1 (or other medulla neurons) playing this role. Considering that DNT-2 may also be provided by photoreceptors (see above), this statement may need support by additional experiments, such as a Mi1 specific or photoreceptor-specific over-expression of DNT-2 in a DNT-2 whole animal mutant background and assessment of survival of labeled L1 neurons (e.g. using the markers Svp/Zfh1 (Xu et al. 2024)). Moreover, it would be crucial to determine the time point at which cells are undergoing apoptosis and to assess whether this would coincide with synaptogenesis. Such experiments would allow the authors to comment on a potential retrograde signaling process between connecting neurons.
      • The data and molecular and genetic methods are presented in detail in the Material and Method sections, as well as through pertinent supplementary tables. Similarly, the statistical analyses and sample numbers are indicated in the Material and Methods section and related supplementary tables. Table S4 does not indicate genotypes, but short versions of crosses, which should be corrected. Moreover, the list of fly lines seems to be missing one line (UAS-mCherry). Concerning the antibody list, sources are not provided (contrary to what is stated in the main text). In the Result section, differences are described simply as statistically significant decreases and increases; however, at times it would be useful to add some indication of the size of the differences (percentage or fold change).

      Minor comments:

      • The figures are informative, however, their orientations are not as indicated horizontal or lateral, but often oblique and not consistent (e.g. Fig 1a, last four panels). This makes comparisons more difficult. The authors could possibly find more suitable optical sections and present images more consistently.
      • Please provide the correct reference describing the occurrence of spontaneous activity (Akin et al. 2019).
      • The manuscript would be more easily accessible to non-experts, if schematic drawings would be provided (neuron subtypes, model).
      • The final paragraph of the discussion about DNT-2 keeping connecting neurons together is not easily understandable, as the term "keeping together" remains undefined.

      Significance

      As in the vertebrate nervous system, insect nervous systems rely on neurotrophins to control the survival and correct wiring of neurons. This study proposes a model that advances our understanding of how a specific neurotrophin, delivered by a post-synaptic neuron and acting via a specific receptor, controls several developmental steps of one of its pre-synaptic partners. The study assessed the expression patterns of three neurotrophins and the Toll receptor family, using the T2A-Gal4 expression system, revealing their dynamic expression in photoreceptors, specific lamina and medulla neurons, as well as tracheae and glia during pupal development. A strength of the study is the identification of DNT-2 and Toll-2 as interacting partners, which could be assessed by the extensive set of specifically generated reagents for this study (Gal4 drivers and mutants). However, a number of statements may require additional experiments to conclusively support the proposed model, in particular the action of DNT-2 and Toll-2 in Mi1 and L1, respectively. Moreover, it would be important to critically assess the timing of events of cell death, setting up of projection patterns and synapse formation.

      Audience: The outlined findings, if strengthened further, will be of interest for scientists studying nervous system development in vertebrates and invertebrates in general, and the action of neurotrophins in particular.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors investigate the role of neurotrophins in development of the Drosophila visual system, where multiple waves of cell death eliminate populations of neurons. Using transgenic reporter lines, the authors examine expression of four spz genes, which encode ligands, and four Toll receptor genes in the optic lobe. The authors argue that two of these ligands, spz-3/DNT-3 and spz-5/DNT-2, are expressed in the optic lobe concurrently with a wave of cell death. Through gain- and loss-of-function studies they argue that DNT-3 promotes survival through Toll-8 whilst DNT-2 promotes survival through novel interactions with Toll-2 (with a focus on the lamina neurons). Finally, the authors argue that DNT-2 and Toll-2 regulate L1 neuron targeting and size.

      The involvement of neurotrophins in neuronal apoptosis has not been explored in detail in the visual system, making this a potentially interesting topic. However, there are several issues with the manuscript: The primary concern is that most of the authors' conclusions are not supported by the data presented. Many of the figures are difficult to interpret without the addition of appropriate markers. In addition, there are several inconsistencies both within the manuscript and within the context of existing literature that are not addressed. For example, the authors have not demonstrated that lamina neurons undergo apoptosis. This point is critical, as apoptosis, particularly that likely to be affected by neurotrophic factors, fine-tunes multicolumnar neuron numbers after they establish connections with their partners. On the other hand, unicolumnar neurons such as those in the lamina are not thought to need much fine-tuning because of their unicolumnar status. This distinction is not addressed in the manuscript. At present, the evidence presented does not convincingly support the authors' conclusions, and substantial further work is needed before the claims can be firmly established. Overall, these limit the credibility and significance of this study.

      Major comments:

      In general, the conclusions drawn about neuropil regions and specific cell identities are not supported by sufficient evidence. Across the figures, the identity of neurons and neuropils remains unclear due to the absence of subtype-specific markers or neuropil labelling that would place the data in the anatomical context of the optic lobe.

      (1) The authors claim that spz and Toll genes are differentially expressed in the optic lobe throughout pupal development. However, the data presented do not convincingly support this conclusion. The primary concern is that the Gal4 reporter insertions used to infer gene expression do not match published single-cell transcriptomic datasets (as noted by the authors). This discrepancy calls into question the accuracy of these reporters (or the transcriptomics) and raises uncertainty about which dataset can be trusted. Given that these data are inconsistent, gene expression should be assessed directly by HCR to draw reliable conclusions.

      Related comments: The authors claim that spz and Toll genes are expressed differentially in the optic lobe throughout pupal development. However, the characterisation of these expression patterns is lacking in the following aspects:
        • Time courses are inconsistent and incomplete across the different genes.
        • The conclusion that DNT-2/3 are expressed concurrently with cell death is not supported: DNT-2/3 expression does not appear restricted to a clear time window, and without co-staining for apoptotic markers this conclusion cannot be substantiated.
        • There is insufficient evidence to support the claim that spz-2 is specifically expressed in Mi1 neurons or that Toll-8 is specific to L4/L2. These claims are largely based on the transcriptomics data, which were inconsistent with the reporters in many other contexts. No neuron type markers are co-stained to validate this claim.
        • Sample sizes are too low for some reporter lines (~2-4).

      (2) The authors quantified cell death using Dcp-1 signal volume and concluded that overexpression of DNT-2 or DNT-3 reduces cell death, while loss of either increases cell death in the lamina. However, the lamina is not labelled with an appropriate marker, nor are neuropils labelled to make the structure recognisable, making it unclear whether this wave of cell death is truly occurring in the lamina. This is a recurring issue throughout the manuscript. Related comments: the reported values for Dcp-1 signal volume differ dramatically between the Gal4 (~40,000) and mutant (~2,000) controls, without an explanation for this discrepancy.

      (3) The claim that DNT-2 and DNT-3 promote survival via Toll-2 and Toll-8, respectively, is not sufficiently supported. The authors do not show that manipulation of each ligand specifically affects neurons expressing the relevant receptor. To substantiate this conclusion, the specificity of ligand manipulation should be tested, for example through compensation experiments or by demonstrating that DNT-2 loss of function specifically induces cell death in Toll-2+ neurons.

      (4) The authors further investigate a potential interaction between DNT-2 and Toll-2, concluding that DNT-2 promotes survival in the lamina, medulla, and lobula complex through Toll-2. However, no evidence is presented showing altered levels of apoptosis in the medulla or lobula/lobula plate following DNT-2 manipulation. The conclusion that DNT-2 promotes survival via Toll-2 is based on a Toll-2RNAi;DNT-2FL epistasis experiment, but this interpretation does not account for the reduced expression of each construct due to Gal4 titration. This issue also applies to subsequent experiments (Figs. 5 and 6). A Gal4 titration control is required for each of these experiments to exclude this confound.

      (5) The authors conclude that DNT-2 production in Mi1 medulla neurons is required for L1 connectivity, survival, and morphology. However, these conclusions are based on manipulating DNT-2 and Toll-2 expression in L1 neurons, which does not directly test the requirement for DNT-2 in Mi1 neurons. In addition, the image quality in Figure 5 is problematic: brightness and contrast differ visibly between panels, making them challenging to interpret. Finally, the discussion states that "loss of function for DNT-3 (spz-3) causes lamina cell death that is not naturally compensated for by DNT-2," but no evidence is presented in this manuscript to support that conclusion.

      Other general comments:

      • Lack of labels for each individual image makes figures and text difficult to interpret/reference.
      • Gene names not italicised in figures
      • Inconsistent aspect ratio for graph axes (Obvious in figure 5C,D)
      • Image borders often not completely vertical/horizontal making them appear jagged in some figures e.g. Figure1A.
      • General typos e.g. (n=10 10 brains), inhibitor pf apoptosis p35
      • Lobula complex should not be used to refer to lobula plate and lobula plug at this stage as they are distinct neuropils

      Figure 1:

      • Arrows not explained
      • N-Cadherin absent from some images without explanation
      • First two images of 1B are identical.
      • Images in panel B at different timepoints and spz-4 is missing without explanation.

      Figure 3:

      • Dark spot in DNT-3FL 3A image.
      • Descriptions in text often lack specific details "Dcp1+ apoptosis in the lamina and outside the lamina too."

      Supplementary:

      • Figures S2-S6 lack labels denoting the timepoint of each figure
      • Figure S7 is illegible
      • Supplementary figure 8 is referred to as S9 in text and is missing genotype information as well as neuropil labels on images

      Significance

      Significance: This is an interesting topic of investigation, and the findings, if substantiated, would be of interest to the field of Drosophila visual system development and programmed cell death. In particular, it would be valuable to place these results in the broader context of programmed cell death as a mechanism for eliminating excess multicolumnar neurons during visual system development. However, at present the evidence presented does not convincingly support the authors' conclusions, and substantial further work is needed before the claims can be firmly established.

      Expertise:

      Drosophila neurogenesis, neuron specification, cell signalling

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Alshamsi et al. investigate the role of Drosophila neurotrophins (DNTs) and their Toll receptors in regulating neuronal apoptosis during optic lobe development.

      The authors provide compelling evidence that different DNTs and their Toll receptors are expressed in optic lobe neurons, and their activity regulates neuronal survival during optic lobe development. They further show that disruption of DNT/Toll signaling impacts neuronal morphologies.

      Comments

      ABSTRACT

      "Over-expression of DNT-3 (spz-3) and DNT-2 (spz-5) could rescue natural occurring cell death, whereas their loss of function caused cell death, showing that DNT-3 and DNT-2 can, and are required to, promote cell survival during optic lobe development."

      I find it more appropriate to say that OE prevents naturally occurring cell death, because it inhibits a normal physiological process. "Rescue" would only be correct if there were an experimental or genetic loss (e.g., deletion of a survival factor) and normal survival levels were being restored.

      "Importantly, DNT-2 is expressed in Mi1 neurons and Toll-2 in connecting L1 neurons. We show that DNT-2 functions in concert with Toll-2, as Toll-2 RNAi knock-down prevented the rescue of apoptosis by DNT-2 over-expression and all Toll-2+ neurons were lost in DNT-2 mutants."

      I find this sentence very difficult to follow

      I suggest moving "Importantly, DNT-2 is expressed in Mi1 neurons and Toll-2 in connecting L1 neurons."

      to the next sentence. "by specifically investigating the Mi1 (DNT-2+) and L1 (Toll-2) synaptic partners", alterations in DNT-2 or Toll-2 expression levels impaired connectivity of L1 neurons at the M1 medulla layer and altered dendritic morphology of L1 neurons

      "As DNT-3 (spz-3) and DNT-2 (spz-5) are expressed in the medulla and they could influence both lamina and medulla neurons, this suggests that their function maintaining cell survival could enable the stabilisation or alignment of connected neurons across medulla columns."

      Influence what? This is very vague. The statement needs a temporal understanding of when neurons die and when synapses are formed, and should take into account the phenotypes of the mutants and RNAi experiments.

      INTRODUCTION

      "Neuronal survival is maintained by neurotrophic factors secreted in limited amounts by target cells, leading to the survival of only those neurons that receive trophic support (Levi-Montalcini, 1987, Davies, 2003)."

      Perhaps the authors can be more precise, e.g. "One mechanism by which neuronal survival is regulated is through neurotrophic factors secreted in limited amounts by to-be synaptic partners and adjacent cells."

      "In this context, if neurotrophism is fundamental for nervous system development, it could have been enabled by evolutionarily conserved molecular mechanisms."

      I think the authors want to suggest that given the fundamental and widespread role of neurotrophism in nervous system development, it remains unknown if it relies on evolutionarily conserved molecular players.

      "Neurotrophins - NGF, BDNF, NT3, NT4 - are the main growth factors maintaining neuronal survival in the vertebrate nervous system (Levi-Montalcini, 1987, Lu et al., 2005). Importantly, they can also promote cell death, depending on context (Lu et al., 2005). They can promote cell survival via their Trk receptors and ERK and AKT downstream and via p75NTR and NFκB downstream, or cell death via p75NTR, Sortilin, and JNK signalling instead (Lu et al., 2005). mechanisms."

      The authors should avoid the repeated use of "they"

      "There are six spz and nine Toll paralogous genes in Drosophila, which could play distinct functions. In fact, at least full-length DNT-1 and Toll-1 can promote cell death instead, and at least Toll-6 can promote either cell survival or cell death, depending on context (Foldi et al., 2017, Singh et al., 2025, Zhu et al., 2008) . Importantly, mature DNT-1 and DNT-2 with Toll-6 and Toll-7 are required for and can promote neuronal survival during circuit formation in the embryonic ventral nerve cord (McIlroy et al., 2013, Zhu et al., 2008)."

      I find this paragraph difficult to follow. I suggest the following editing, which the authors might want to consider:

      "There are six spz and nine Toll paralogous genes in Drosophila, which could play distinct functions. In fact, at least full-length DNT-1 and Toll-1 can promote cell death instead, while Toll-6 can promote either cell survival or cell death, depending on context (Foldi et al., 2017, Singh et al., 2025, Zhu et al., 2008). Importantly, mature DNT-1 and DNT-2 with Toll-6 and Toll-7 are necessary and sufficient to promote neuronal survival during circuit formation in the embryonic ventral nerve cord (McIlroy et al., 2013, Zhu et al., 2008)."

      "During this time (24-50h APF), connectivity between photoreceptors, lamina and medulla neurons is established; this is followed by medulla neurons connecting to lobula neurons;"

      I find this sentence misleading, if not incorrect. If by connectivity the authors mean synaptogenesis, for all that is known, synaptogenesis has been shown to occur from mid-pupal development (P50) onwards. If by connectivity the authors mean the targeting of specific neuropiles and layer organization, it is also incorrect that lamina and medulla organization precedes the connectivity between medulla and lobula neurons. These processes are all concurrent. Can the authors please clarify?

      "and by 72h APF cell death has greatly diminished and synaptogenesis completes connectivity patterns, in preparation for adult eclosion at 96h APF (Millard and Pecot, 2018, Melnattur and Lee, 2011, Hadjieconomou et al., 2011, Kurmangaliyev et al., 2020) "

      Kurmangaliyev et al., 2020 is probably not an appropriate citation here as it mostly deals with transcriptional programs of circuit assembly in the developing optic lobe

      "Thus, 24-48h APF is a critical period to maintain necessary lamina and medulla neurons alive in the optic lobe."

      Perhaps the authors want to revisit this sentence and explicitly say that 24-48h APF is a period where apoptosis defines cell numbers

      "The development of the Drosophila visual system has been well described (Holguera and Desplan, 2018, Melnattur and Lee, 2011, Millard and Pecot, 2018, Hadjieconomou et al., 2011, Behnia and Desplan, 2015)."

      I find that Behnia and Desplan, 2015 is not appropriate, as it is a review that describes the characterization of neuronal circuits underlying visual modalities in the fly brain, and not their development. The following reviews dealing with different aspects of neurogenesis, neuropile development and circuit formation are likely more relevant: Bakshi et al., Current Opinion in Neurobiology 2025; Malin et al., PNAS 2021; Ngo, Dev Bio 2017.

      "R7 and R8, together with lamina neurons target to medulla layers M6 and M3, respectively, organizing into medullar columns that respond to the same point in visual space and maintain retinotopy."

      I find this sentence misleading because lamina neurons do not target M6 and only L3 targets M3.

      "Medullar interneurons also form connections across multiple layers, where each layer represents different visual features (Fischbach and Hiesinger, 2008, Millard and Pecot, 2018)."

      Here, Behnia and Desplan, 2015; Matsliah et al., Nature 2024; Borst and Groschner, Annu. Rev. Neurosci. 2023; and even Schnaitmann et al., J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2020 are better references that review feature detection and circuit organization in the optic lobe.

      "Neurons within the lobula complex integrate signals from the medulla and project to the optic glomeruli in the central brain and motor outputs to enable appropriate behavior (Behnia and Desplan, 2015, Borst et al., 2020, Courgeon and Desplan, 2019b)."

      I find Courgeon and Desplan, 2019b is not very appropriate here, as it reviews the coordination of neural patterning in the Drosophila visual system. More adequate and relevant manuscripts and reviews are Wu et al., eLife 2016; Tanaka and Clark, 2022; Lapoetke et al., Neuron 2022; and even Zhao et al., eLife 2024.

      "Spz-5 is well known as Drosophila neurotrophin-2 (DNT-2), and as Spz-3 has been proposed to have neurotrophin functions which we expand on and demonstrate here, we refer to Spz-3 as DNT-3 (Zhu et al., 2008, Coutinho-Budd et al., 2017, Sun et al., 2024, Ballard et al., 2014, Ulian-Benitez et al., 2017)."

      This last paragraph of the introduction seems out of place, and partly redundant with page 3. Perhaps the authors would like to finish the introduction with a paragraph highlighting the major findings and conceptual advance of the manuscript? This seems to be a good and natural way of following their previous sentence "Here, we asked whether neurotrophin family ligands encoded by DNTs (spzs) and their Toll receptors could regulate cell survival during neural circuit formation, in the Drosophila pupal optic lobe."

      RESULTS

      I suggest the following edit: "To ask whether DNTs (spzs) are expressed in the pupal optic lobe, we generated T2A-Gal4 driver lines for spz-1,3,4,5, crossed them to 10xUASmyrGFP or 20UAS6xmCherry reporter flies, and analysed resulting progeny optic lobes during development and in the adult with anti-GFP antibodies, as required (Figure 1A)."

      Also, if the lines were also crossed to mCherry reporters, mentioning only anti-GFP antibodies is incomplete.

      "spz-1MIO2318-T2A>myrGFP and spz-1MIO2318-T2A>6xmCherry revealed expression in a few centrifugal neurons in the lobula complex that projected to the lamina,"

      At which stage?

      "subsequently medulla neurons and abundant arborisations into the lobula complex and medulla."

      Subsequent to what? I am sorry, but I don't understand this description.

      "Expression from spz4MI5678 -T2A->myrGFP was not detected until 72h APF"

      Data before 72h APF is missing. Where was it expressed? Which cell types? Which neuropiles?

      "And then was found in the medulla and lobula complex and followed by the trachea."

      At which stage? What was followed by the trachea? I think the authors mean that at later stages (in the adult) expression was restricted to the trachea in both medulla and lobula.

      "spz-3-T2A>6xmCherry (hereby named DNT-3) was highly expressed in non-neuronal retinal cells and medulla neurons;"

      At what stage?

      "And subsequently in the trachea and possibly glia."

      The authors could and should explain how they reach this conclusion. Given that no cell type specific markers were used, this identification was likely based on morphological features.

      "Finally, DNT-2-T2A>6xmCherry (spz-5) was found in medulla neurons, which could be tentatively identified as Mi1 medulla neurons (Nern et al., 2025) by 48h APF, and this pattern was maintained."

      The authors suggest these neurons are Mi1 based on what? Also, Fischbach's Cell Tissue Res (1989) seminal paper is probably worth mentioning.

      "Abundant cells expressed spz-1 in the lobula complex (Figure 1B, left) and medulla (Figure 1B, right), and DNT-3 (spz-3) and DNT-2 (spz-5) in the medulla (Figure 1B) during optic lobe development. DNT-3 (spz-3) and DNT-2 (spz-5) were expressed in distinct non-neuronal cells in the retina, seen in Multi Colour Flip Out (MCFO) clones (Figure 1C)."

      This sentence is misleading. These experiments allow the authors to conclude that spz+ neurons innervate these neuropiles. It does not allow the authors to conclude that spz molecules localize to the neuropiles. The authors should revise these claims in the main text and relevant figure legends

      "To visualize the distribution of Tolls in the optic lobes during pupal development, we used the GAL4 lines previously described (Li et al., 2020), driving expression of the reporter myrGFP (Figure 2A)."

      Same comment as above

      "Using MCFO clones as well as myrGFP, we could identify some of the Toll-8+ neurons as Lawf1, feedback neurons projecting from the medulla to the lamina (Figure 2B), and L2 and L4 lamina neurons (Figure 2B, C); Toll-6+ cells to include L3 and L4 lamina neurons (Figure 2B,D); and Toll-2+ neurons as L1 lamina neurons which target to M1 and M5 medulla layers and L3 lamina neurons that project to M3 (Figure 3B,E) (Hakeda-Suzuki and Suzuki, 2014, Behnia et al., 2014)."

      Hakeda-Suzuki and Suzuki, 2014, Behnia et al., 2014 are not the most appropriate references. Instead, I suggest the authors should cite Fischbach's Cell Tissue Res (1989).

      "Overall, the expression in the scRNAseq dataset (Kurmangaliyev et al., 2020) of the spz ligands and Toll-8 (also known as Tollo) data were less consistent with the cell biology data, whereas the expression of Toll-1, -2 and -6 confirmed cells seen with the cell-biology based reporters"

      It is perhaps more accurate to refer to the cell-biology based reporters as translation reporters, which is what T2A-based Gal4 drivers are.

      "Most particularly, Toll-2 mRNA (synonym 18w) was found in L1 and L3 lamina neurons over time, plus also in L5 at 24h APF, and Toll-6 mRNA was found in L2, L3, L4 over time, plus also in L1 at 24h (Supplementary Figure 1-6)."

      This sentence is difficult to read and could be considered poorly written for several reasons, including:

      - "plus also" is repetitive. "Plus" and "also" serve the same function.
      - Inconsistent punctuation. The lack of commas before "and Toll-6 mRNA..." makes the sentence feel unbalanced.
      - Vague time reference. "Over time" is imprecise. It's unclear whether it means during development, at multiple timepoints, or something else.

      Also, regarding the scRNAseq analysis: the authors mention "We compared our reporter-based profiles with published scRNAseq datasets of the optic lobe through development (Kurmangaliyev et al., 2020, Ozel et al., 2021)"; however, the Ozel dataset doesn't seem to be used.

      Also, from the Material and Methods section:

      "The data were imported as a Seurat object, and cells corresponding to specific timepoints (e.g., 24 h, 36 h) were subsetted based on the provided metadata. Dimensionality reduction was carried out using principal component analysis (PCA), followed by Uniform Manifold Approximation and Projection (UMAP) embedding computed on the first 30 principal components. Cluster annotations provided by the original authors were used for all cluster-level analyses and visualisations."

      The Kurmangaliyev dataset is already processed. I am probably missing something here, but it is not obvious to me why the authors performed PCA again.

      "To conclude, at the time of naturally occurring cell death (0-48h APF), Toll-1 is highly expressed throughout the optic lobe; Toll-2, -6 and -8 are expressed in the medulla; Toll-8 and Toll-6 are prominently expressed in the lobula complex, and Toll-6 and Toll-2 are prominent in the lamina."

      This is misleading. These experiments allow the authors to conclude that Toll+ neurons innervate these neuropiles. They do not allow us to conclude that Toll molecules localize to the neuropiles. The authors should revise these claims in the main text and relevant figure legends.

      "To ask whether DNTs can promote cell survival in developing optic lobes, we over-expressed DNT-2 (spz-5) and DNT-3 (spz-3) and visualized dying cells with the apoptotic marker anti-Dcp1 at the peak of naturally occurring cell death (24h APF)."

      These experiments were done using Toll8-Gal4 and nsyb-Gal4 drivers. What's nsyb-Gal4 expression during development? Is the expression of this driver consistent with the conclusions drawn from these experiments?

      "To test whether DNT-2 could promote cell survival during optic lobe development, we over-expressed full-length DNT-2FL or cleaved DNT-2CK in all neurons with nsybGAL4. This reduced the incidence of Dcp1+ apoptosis in the lamina and outside the lamina too (Figure 3C-D and Supplementary Figure S9C,D)."

      How do the authors explain that DNT-2CK reduced the number of Dcp1+ cells?

      "We generated DNT-3 (spz-3) loss of function mutants by P-element mobilization. DNT-2 and DNT-3 loss of function mutants caused considerable cell debris in the medulla and lobula complex, which compromised the analysis in this region, so we focused on the lamina."

      Perhaps, rather than stating that the mutants caused considerable cell debris, the authors could say that the mutants displayed considerable cell debris

      More importantly, I have concerns with the data from these experiments (Figure 3), in which Dcp1 signal volume intensity was quantified using Imaris. In all panels (A,C,E) the segmented images do not match the raw Dcp1 staining, raising concerns about how much one can rely on this quantification. Could this be because the Dcp1 staining shown is a single z plane and the segmentation is a 3D rendering? The authors should carefully and robustly explain this discrepancy, which is present in all images where Dcp1 signal volume intensity was quantified.

      Also, could the authors explain why the quantifications in Figure 3B and 3C differ by an order of magnitude (10×) from those in panel 3D? Please look at the WT control: there is a 10× difference in signal volume intensity.

      "Toll-2pTVGAL4 flies are heterozygous mutant for Toll-2, and, remarkably, in combination with DNT-2 homozygous mutants resulted in semi-lethality, revealing a functional interaction between these two genes."

      I cannot entirely follow this conclusion. I understand the authors propose that the combination of partial loss of Toll-2 and full loss of DNT-2 affects viability, more than either mutation alone. Is this what they mean? Can the authors comment on the viability of DNT-2 mutants?

      "Macrophages loaded with HisYFP and distributed mostly between the retina and lamina could be observed across these samples (Figure 4C), suggesting they had engulfed dead cells"

      How do the authors identify these YFP+ cells as macrophages?

      "Together, these data show that DNT-2 functions as a ligand for Toll-2 to maintain the survival of neurons in the lamina, medulla and lobula complex during optic lobe development."

      While the results from Figure 4 showing that DNT-2 acts as a ligand for Toll-2 to support neuron survival are solid (in particular panels C-F), this doesn't necessarily mean all neurons die directly due to loss of Toll-2 signaling. It is plausible that neurons that express Toll-2 die because they lose critical survival signals. The death of these Toll-2-expressing neurons could then cause a cascade effect, where neighboring or connected neurons die indirectly due to loss of trophic support, disrupted circuits, or secondary damage. So, the observed cell death in multiple regions may be a combination of direct effects on Toll-2-positive neurons and indirect effects on other neurons.

      "In the Drosophila pupa, connectivity of lamina to medulla neurons takes place at 30-48h APF, and between medulla and lobula complex at 60-70h APF (Kurmangaliyev et al., 2020, Millard and Pecot, 2018, Pecot et al., 2014, Hadjieconomou et al., 2011)."

      I have the same comment as mentioned above regarding the timing of connectivity. If by connectivity the authors mean the targeting of specific neuropiles and layer organization, it is incorrect that lamina and medulla organization precedes the connectivity between medulla and lobula neurons. These processes happen concurrently. Can the authors please clarify?

      "Importantly, the expression of synaptic markers starts at 24h, peaks at 60h APF and spontaneous neuronal activity takes place at 48h APF, meaning that at least some neural circuits are already connected by this point (Kurmangaliyev et al., 2020)."

      The correct placement of the reference to Kurmangaliyev et al. is after "peaks at 60h APF". A reference to Akin et al. and Bajar et al. is missing where PSINA is referred to.

      "Thus, the period of naturally occurring cell death overlaps with connectivity"

      I think the authors mean that the period of cell death is concurrent with the development of synaptic connectivity.

      "L1 neurons normally project along columns that can be labelled with mAb24B10, and target to layers M1 and M5 of the medulla."

      The authors should mention that 24B10 labels the photoreceptors, providing a spatial reference to identify medulla columns.

      "Interestingly, Toll-2RNAi knock-down did not alter the phenotype caused by DNT-2FL overexpression, and impaired targeting to the same extent as each genetic manipulation alone (Figures 5B,D)."

      How do the authors interpret these results? And perhaps the authors would like to explain the rationale of overexpressing DNT-2fl in L1 neurons, that do endogenously express it.

      "To conclude, these data show that DNT-2 and Toll-2 are required for appropriate connectivity of L1 neurons to target Mi1 medulla neurons at M1 medulla layer."

      The authors characterize neuronal morphologies but do not directly assess connectivity using synaptic markers. While defective morphologies are likely to impact connectivity, the conclusion that DNT-2 and Toll-2 are required for appropriate connectivity should be tempered. The authors should revise their wording to reflect that their data support morphological defects rather than direct evidence of altered synaptic connectivity.

      DISCUSSION

      "In fact, throughout animal development, between 50% (e.g. in Drosophila) and 80% (e.g. in vertebrates) are lost to naturally occurring cell death"

      "of neurons" is missing before "are lost"

      "Consistently with these findings, we have shown that the survival of L1 neurons depends on DNT-2 functioning together with Toll-2."

      The authors state that "the survival of L1 neurons depends on DNT-2 functioning together with Toll-2." It seems that what they intend to convey is that DNT-2 acts as a ligand for Toll-2. The text should be clarified to explicitly indicate this ligand-receptor relationship rather than implying a cooperative function.

      "These data demonstrate that DNT-2 and Toll-2 function together in visual system development."

      Since the authors did not use tub-GAL80 or another temporal control to restrict gene expression specifically to development, the observed phenotypes could reflect combined developmental and adult effects. Throughout the text, the authors should revise their wording to acknowledge this limitation.

      "Finally, interference with the normal levels of DNT-2 and Toll-2 also impaired axon targeting and dendritic morphology, consistently with the coupling between cell survival with connectivity."

      The authors state that interference with normal levels of DNT-2 and Toll-2 "impaired axon targeting and dendritic morphology, consistently with the coupling between cell survival and connectivity." This statement seems tautological, as neurons that die cannot form connections. The authors should clarify whether they are referring to a specific mechanistic link beyond this obvious consequence.

      "Our findings are consistent with prior reports that had shown the maintenance of cell survival to be required during neural circuit formation."

      This statement seems tautological. It is generally expected that neurons must survive in order to contribute or be part of neural circuits. The authors should clarify if they are highlighting a specific mechanistic insight beyond this obvious requirement.

      "In the medulla, Dm8 medulla neurons are produced in excess and are eliminated during connectivity to their R7 inputs (Courgeon and Desplan, 2019a). This is enabled by the cell surface molecular tags DIP in yDm8 binding Dpr11 in yR7, during synaptic matching (Courgeon and Desplan, 2019a)."

      The statement that "Dm8 medulla neurons are produced in excess and are eliminated during connectivity to their R7 inputs" is both unclear and inaccurate. It is not evident what is meant by "during connectivity." Moreover, Courgeon and Desplan (2019a) show that Dm8 neurons undergo cell death before or by P40, prior to synaptogenesis. The authors should correct this statement and clarify the timing and mechanism of Dm8 neuron elimination.

      "Importantly, the maintenance of cell survival takes place during connectivity, and enables synaptic matching between connecting neurons."

      It is unclear what is meant by "during connectivity." Moreover, both Courgeon et al. (2019a) and Xu et al. (2018, 2022) show that these neurons (e.g., Dm8, Dm12, Dm14) undergo cell death before or by P40, prior to synaptogenesis. The authors should clarify the timing and mechanism of cell survival and revise this statement accordingly.

      "By contrast, it has also been proposed that apoptosis plays a minor role in cell number control during visual system development, depending instead on cell proliferation and spatial patterning through Dpp/BMP signalling (Malin et al., 2024). However, those findings were based on events taking place at the larval third instar wandering stage, when proliferation and spatial patterning are prevalent, whereas apoptosis peaks in pupa."

      It seems that the authors are trying to suggest that different mechanisms control cell numbers at different developmental stages: during larval neurogenesis (L3), cell numbers are regulated primarily by proliferation and spatial patterning, whereas in the pupal stage, neuronal survival via apoptosis plays a key role. If this is the intended point, it should be stated more clearly, as the current comparison to Malin et al. (2024) is confusing and does not make this distinction explicit.

      "However, Toll-2 mutant MARCM clones generated in the pupa result in a dramatic loss of lamina neuron dendrites and aberrant axonal navigation in the medulla, as well as widespread neuronal loss (Li et al., 2020)."

      This statement is puzzling. Neurogenesis occurs during the larval stage until P15, and MARCM requires progenitor cell division. The authors should clarify how MARCM clones were generated during pupation and provide the relevant experimental details in the Materials and Methods.

      "Importantly, connectivity between L1 and medulla neurons takes place between 20-48h APF, during the period of naturally occurring cell death, and spontaneous activity in the optic lobe takes place at 48h, meaning at least some circuits are connected by then (Kurmangaliyev et al., 2020)."

      The authors state that "connectivity between L1 and medulla neurons takes place between 20-48h APF," but no reference is provided for this timing. To my knowledge, no study has directly demonstrated this, so the authors should either provide supporting evidence or revise this statement. Additionally, citing Kurmangaliyev et al. (2020) for spontaneous activity in the optic lobe is not appropriate for this point, as PSINA was originally described by Orkun Akin.

      "When altering DNT-2 or Toll-2 levels, L1 axonal terminals in the medulla were misrouted, rather than being confined to a single column. This is reminiscent of the phenotypes caused by alterations in Dscam and Fez levels (Millard et al., 2007, Peng et al., 2018)."

      The authors note that altering DNT-2 or Toll-2 levels causes L1 axonal terminal phenotypes reminiscent of phenotypes caused by changes in Dscam and Fez levels (Millard et al., 2007; Peng et al., 2018). However, they only reference these previous studies without discussing whether there could be a shared mechanism. While these comparisons are interesting, the manuscript would benefit from either a deeper discussion of potential mechanistic links or a clear statement that the comparison is purely phenotypic.

      "As DNT-2 is secreted in medulla neurons and Toll-2 is expressed along neurons that connect in the medulla (e.g. L1, Mi1, Tm3, Dm9, T4), DNT-2 could help keep connecting neurons together during dynamic cellular events in development."

      This sentence is poorly written and vague. It is unclear what "keep connecting neurons together" means mechanistically. Likely, DNT-2 is secreted by postsynaptic medulla neurons (e.g., Mi1), whereas Toll-2 is expressed in neurons innervating the medulla (e.g., L1, Mi1, Tm3, Dm9, T4). The authors should rephrase this sentence to clearly convey their mechanistic and cellular interpretation.

      FIGURES

      Figure 1: Arrowheads point to what? OL orientations should be described in the figure captions.

      Figure 5: The title of Figure 5 ("Altering the levels of DNT-2 and Toll-2 modifies L1 axon targeting at medulla M1 layer") is misleading. The correct layer targeting is preserved; what changes is the pattern of unicolumnar innervation. When DNT-2 or Toll-2 levels are altered, L1 neurons innervate multiple columns rather than maintaining their normal single-column specificity. The title should be revised to reflect that the defect is in columnar specificity rather than layer targeting.

      Supplementary Figures S1 to S6: "Combined UMAPs showing the expression of spz-1, -3 (DNT-3),-4 and -5 (DNT-2) and Toll-1, -2 (18w), -6 and -8 (Tollo) in distinct cells over time."

      Information about which UMAP corresponds to which time point is missing

      MATERIALS AND METHODS

      "Genetics. Please see S1 Table for the list of the stocks used and Table S3 for full genotypes for each experiment."

      Table S3 does not contain the full genotypes for each experiment. This information is partly available in the Source Data Excel file.

      Significance

      During development, neurons are initially produced in excess. One mechanism by which the final neuronal numbers are refined relies on trophic support, which maintains the survival of necessary neurons, while excess neurons are eliminated.

      In the Drosophila optic lobe, a wave of apoptosis occurs during pupation, peaking at a critical period thought to be essential for establishing final neuronal numbers and supporting proper neural circuit formation. However, the mechanisms underlying this developmental process remain poorly understood. In this manuscript, Alshamsi et al. investigate this wave of apoptosis by examining the role of Drosophila neurotrophins (DNTs), which are encoded by spätzle (spz) paralogous genes and signal through Toll receptors to regulate neuronal survival during brain development.

      The authors use translation reporters to demonstrate that DNTs and Toll receptors are differentially expressed across various neuronal types innervating all optic lobe neuropils during development. They then focus on DNT-3 and DNT-2, which they show to be necessary for controlling neuronal numbers, likely by maintaining neuronal survival during pupal stages.

      Notably, the results reveal a previously uncharacterized interaction between DNT-2 and Toll-2. The findings suggest that DNT-2 acts as a neurotrophic factor produced by medulla intrinsic neurons, binding to the Toll-2 receptor expressed in other neurons innervating the medulla. By examining Toll-2+ L1 neurons, which are postsynaptic in the lamina and presynaptic in the medulla, the authors provide compelling evidence that DNT-2/Toll-2 signaling regulates L1 neuronal numbers.

      Interestingly, the authors also show that disruption of DNT-2/Toll-2 signaling affects L1 axonal and dendritic morphologies. However, the extent to which changes in neuronal survival and neuronal morphology are mechanistically or cellularly linked is not addressed. These findings are consistent with previous reports showing that DNTs and Toll receptors regulate neuronal survival in embryonic, larval, and pupal ventral nerve cords, as well as in the adult. Importantly, DNTs and Tolls can also promote cell death, highlighting their dual role in controlling neuronal number and circuit formation.

      While the data is for the most part solid, I have concerns regarding the execution, interpretation of certain results and the conclusions drawn. Additionally, references to previous work are often incorrect or incomplete; I provide several examples below along with non-exhaustive suggestions for improvement. Finally, the manuscript would benefit from careful text revision and improvements to figure presentation, for which I also offer non-exhaustive guidelines.

      Overall, I would recommend this manuscript undergo revision before its publication and I would be happy to reassess a revised version that addresses the comments above.

  3. May 2026
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2023-01861

      Corresponding author(s): Manuela Baccarini

      1. General Statements

      We were happy to learn that all three reviewers found the paper novel and of interest for a cell biology audience. They especially highlighted the carefully conducted screen, whose results will be integrally published with this paper and will be of use for scientists interested in lysosome biology. The revised manuscript contains key validation experiments (antibody/KO controls, lysosome positioning quantification, live-cell actin dynamics) to strengthen our central conclusions.

      2. Point-by-point description of the revisions

      Reviewer 1

      • The colocalization of endogenous PLEKHG3 and LAMP1 as depicted in figures 3B and 3C (data from fixed cells) is not convincing. PLEKHG3 appears to be present on cortical actin structures as opposed to being colocalized with LAMP1 on lysosomes. And related to this point:

      • There is no apparent colocalization of PLEKHG3 and lysotracker in the movie S5.

      Answer:

      We do not claim that the two structures always colocalize, but that PLEKHG3 is a LAMTOR3 vicinal protein that co-enriches with a subset of peripheral lysosomes at focal adhesions (FAs)/protrusions. The images in Figure 3C are a schematic of how PLEKHG3-high/low and LAMP1-high/low regions were defined for quantification. We agree with the Reviewer and with the previous literature that the main localization of PLEKHG3 is to cortical actin structures, as reaffirmed by the strong cortical actin localization shown in Figure 3F of the original version, and in Figure S2C (HEK293T cells) and Figure S3A (HeLa cells) of the revised version. We have clarified the text referring to Figure 3F on page 26, lines 11-14, as follows:

      “Immunofluorescence experiments showed the reported colocalization of endogenous PLEKHG3 (Figure S2C in HEK293T cells, Figure S3A in HeLa cells) and GFP-PLEKHG3 with cortical actin structures and the partial localization of LAMP1-positive vesicles to these structures in correspondence with vinculin-positive focal adhesions.”

      Live imaging in GFP-PLEKHG3-expressing cells (including movie S5, and particularly the stills of the leading edge in Figure 4F) further supports this spatial association without implying obligate colocalization. We added explicit language (p. 27, lines 19-21): “Following a single cell over time, we could observe that a subset of lysosomes appears to travel to PLEKHG3 accumulation sites and specifically move into developing protrusions.”

      • The authors should also confirm the specificity of the PLEKHG3 antibody in immunofluorescence using control and PLEKHG3 siRNA in untransfected cells that have not been transfected with GFP-PLEKHG3 (as is shown in Fig. S2C). Numerous antibodies recognize the overexpressed protein but do not recognize the same protein at endogenous expression levels.

      Answer: To assess the specificity of the antibody for endogenous PLEKHG3, we used HEK293T cells, which should yield a clearer endogenous signal because PLEKHG3 is most highly expressed in neuronal cells (https://www.proteinatlas.org/ENSG00000126822-PLEKHG3/tissue#expression_summary). The pattern of PLEKHG3-positive bands is similar to that observed in HeLa cells, and only the band around 250 kDa is clearly reduced by siPLEKHG3. The IF images show a selective loss of the PLEKHG3 signal at actin filaments close to the plasma membrane, whereas the nuclear signal is preserved and is therefore considered non-specific (revised Figure S2B-C).

      Extract from revised Figure S2B-C: PLEKHG3 KD test in HEK293T cells: B) Western blot of HEK293T cells showing downregulation of PLEKHG3 expression upon siPLEKHG3 treatment compared to siScr. Bar plot shows quantification of PLEKHG3 bands from immunoblot above. Error bars = SEM, n=3. * = p values according to Student's t-test. C) Immunofluorescence images of HEK293T cells. siPLEKHG3 shows drop in PLEKHG3 intensity in the periphery of the cell and less colocalization with Phalloidin. Scale bar = 50 µm. Line plots show intensity profiles of Phalloidin (green) and PLEKHG3 (red) along the white lines in the merged inset images. Scale bar = 10 µm.

      In addition, we have now generated a PLEKHG3 CRISPR-Cas KO in HeLa cells. The results, shown in revised Figure S2G-I, confirm the specificity of our reagents and the localization of PLEKHG3 seen in HEK293T cells.

      Extract from revised Figure S2G-I: G) Immunoblot and quantification of HeLa PLEKHG3 KO cells showing the degree of PLEKHG3 depletion achieved using different guides compared to WT cells transfected with empty vector (EV). The most potent guides (8-9) are boxed in red. H) Immunofluorescence images of WT and PLEKHG3 KO8 cells reveal an overall drop in PLEKHG3 intensity and the specific loss of PLEKHG3 signal at the periphery of the cells. I) Quantification of PLEKHG3 intensity as displayed in H for two KO cell lines compared to WT cells. Dots represent individual data points of each of the three color-coded replicates; diamonds represent the mean of each replicate; black bars represent the mean ± the SEM of three biological replicates; * = p values according to 2-way ANOVA. Error bars = SEM.

      These results establish that the peripheral cortical signal is specific for endogenous PLEKHG3; the nuclear signal is non-specific. Loss of PLEKHG3, however, had no effect on lysosomal distribution, morphometric parameters (see revised Figure S4A-C) or protrusive activity (see revised Figure S6E-F) compared to WT cells.

      Extract from revised Figure S4A-C: PLEKHG3 KO does not influence lysosomal distribution or cell morphometry: A) Quantification of lysosomal distribution in WT compared to two KO cell lines. N ≥ 50 cells in three biological replicates. B) Schematic representation of analysis of cell shape descriptors as referred to in C). Left picture shows the calculated outline in yellow based on which the cell area and circularity are calculated. Right picture shows the minor and major cell axis which, calculated as fraction, result in the aspect ratio of the cell. Scale bar = 50 µm. C) Quantification of cell morphometric parameters Area, Circularity and Aspect ratio. N ≥ 50 cells in three biological replicates. Black dots represent mean of each biological replicate. Statistical analysis according to Student’s t-test. Error bars = SEM.
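      For readers unfamiliar with the shape descriptors named in this legend, the following minimal Python sketch shows one common way to compute them from an outlined cell. It assumes the widely used definitions circularity = 4π·area/perimeter² and aspect ratio = major axis/minor axis; the legend does not state the exact formulas used, and the function names and example values are hypothetical.

```python
import math


def circularity(area, perimeter):
    """Return 4*pi*area / perimeter**2 (assumed definition).

    The value is 1.0 for a perfect circle and decreases toward 0
    as the outline becomes more elongated or irregular.
    """
    if area < 0 or perimeter <= 0:
        raise ValueError("area must be >= 0 and perimeter must be > 0")
    return 4.0 * math.pi * area / perimeter ** 2


def aspect_ratio(major_axis, minor_axis):
    """Return major axis length divided by minor axis length (assumed definition)."""
    if minor_axis <= 0:
        raise ValueError("minor_axis must be > 0")
    return major_axis / minor_axis


# Hypothetical example: a circular outline of radius 10 (arbitrary units)
radius = 10.0
round_cell = circularity(math.pi * radius ** 2, 2.0 * math.pi * radius)

# Hypothetical example: a 40 x 10 rectangular outline (area 400, perimeter 100)
elongated_cell = circularity(400.0, 100.0)
elongated_ratio = aspect_ratio(40.0, 10.0)
```

      In this sketch the round outline scores close to 1, while the elongated outline scores lower on circularity and higher on aspect ratio, which is the contrast the morphometric comparison between WT and KO cells relies on.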


      • The claim that "peripheral accumulation of lysosomes inhibits protrusion formation and limits cell motility" should be tested more rigorously using the RAMP method, preferably in living cells. Other approaches, such as overexpression/siRNA of Arl8b and other motor adaptors, such as SKIP/PLEKHM2, can be used to alter lysosome positioning and confirm this central finding of the manuscript. The authors could also consider including additional mechanistic data in order to comprehend how lysosome positioning controls cell motility. For instance, the RAMP approach could be employed to investigate cortical actin dynamics upon repositioning of lysosomes to the peripheral/perinuclear region.

      Answer: We have purchased the RAMP system from Addgene and adapted the reporters to express fluorophores compatible with our color setup in the respective cell lines (HeLa GFP/GFP-PLEKHG3 as well as HeLa PLEKHG3 KO cells). Unfortunately, we experienced difficulties with imaging due to the suboptimal efficiency of the double transfection necessary to introduce the RAMP system into the cell lines. The LAMP1 and the KIF plasmids were co-expressed at very different levels in the cells, leading to the need for high laser power in both channels, which too often resulted in cell death. Additionally, the redistribution of the lysosomes after biotin addition was incomplete and slower than initially expected, which made it impossible to investigate cortical actin dynamics.

      To gain some mechanistic insight, we have performed further live cell imaging analyses comparing PLEKHG3 WT vs KO cells and GFP vs GFP-PLEKHG3 cells expressing a combination of BFP-LifeAct (to visualize F-actin) with either control mCherry or mCherry-KIF1A to move lysosomes to the periphery.

      • In all experiments, locking lysosomes in the periphery drastically reduces membrane dynamics (protrusion formation and retractions).
      • PLEKHG3 remains colocalized with LifeAct under KIF1A (Fig. S6C–D), indicating that the reduced protrusiveness is upstream or independent of PLEKHG3’s cortical localization.
      • Live-cell BFP-LifeAct imaging revealed that KIF1A-driven peripheral lysosomes reduce protrusion formation/retraction and dampen cortical actin dynamics in both WT and PLEKHG3 KO cells (Fig. S6A–B, E–G; Movies S12–15; S18–21), indicating that these phenomena are independent of PLEKHG3. We believe these data, together with the quantitative lysosome repositioning and FA analyses, substantiate the central finding that forced peripheral lysosome clustering correlates with more adhesive FA states and suppressed protrusive activity. We have clarified scope and limitations accordingly.

      Extract from Figure S1A-C,E,G: PLEKHG3 localizes to F-actin independently of lysosomal transport but is dispensable for protrusive activity. A) Stills from live cell imaging (Movies S12-15). Cells stably expressing GFP or GFP-PLEKHG3 were transfected with the indicated mCherry constructs. Yellow arrows = forming protrusions; blue arrows = retracting protrusions. Stills were generated over a period of 2 hrs. Scale bar = 50 µm. B) Quantification of protrusions formed and retracted over time in cells from A. Values indicate average number of protrusions formed in a timespan of three hours from a total of ≥ 15 cells per condition. Error bars = SEM. C) Quantification of colocalization by Fiji's Coloc2 plugin (see Materials and Methods) over a timespan of three hours. Lines represent mean of all cells per condition, and light-color shading represents the SEM. E) Stills from live cell imaging (Movies S18-21). PLEKHG3 WT and PLEKHG3 KO cells were transfected with the indicated mCherry constructs and incubated with LysoTracker. Yellow arrows = forming protrusions; blue arrows = retracting protrusions. Stills were generated over a period of 2 hrs. Scale bar = 50 µm. G) Quantification of protrusions formed and retracted over time in cells from E. Values indicate average number of protrusions formed in a timespan of one hour from a total of ≥ 15 cells per condition. Error bars = SEM. In B,G, black asterisks denote p values according to Kruskal-Wallis and Bonferroni post-hoc testing, comparing the effect of KIF1A against mCherry or PLEKHG3 WT against KO.

      • It is not clear how the authors conclude that Figure 4E graph shows "the LAMP1 signal was stronger in paxillin-labeled FA compared to control regions". The 4E graph shows LAMP1 signal in GFP versus GFP-PLEKHG3 and shows a modest enrichment of LAMP1 in FAs in GFP-PLEKHG3 overexpression. LAMP1 enrichment in FAs is also not obvious in the image shown in Figure 4B.

      Answer: We stand corrected – the Figure we referred to was not in the manuscript. It has been inserted now, as a plot next to Figure. Figure 4B (schematic representation of colocalization analysis) was designed to explain how we define focal adhesions (paxillin positive) and adjacent control regions (same size and shape, but paxillin-negative). The actual analysis was missing and has now been inserted. We apologize for this mistake.

      We do not claim that PLEKHG3 brings lysosomes to FAs. The enrichment of lysosomes in FA regions of cells expressing GFP-PLEKHG3 compared to GFP-expressing cells shown in 4E, as the Reviewer correctly notes, is marginal and is not highlighted anywhere in the text exactly for this reason.

      • In Fig. 2B, there appears to be a labeling error. The lanes 2,4 and 7 appear to be transfected with L3-T-V5 but labeled as GFP-V5-cyto. Here the PLEKHG3 band should be indicated.

      • AND -Fig. 2C is an IP experiment as per the manuscript text but it is labeled as pulldown.

      Answer: We stand corrected, and the necessary changes have been made in the revised version in Figure 2B.

      Reviewer 2

      1 - Specificity of PLEKHG3 antibody: In Fig. S2, authors show that PLEKHG3 antibody recognizes 3 bands (above 100 kDa, above 130 kDa and 250 kDa) and all of them are reduced by the silencing of PLEKHG3. Then, in Fig. 2A and C, authors only show the band above 130 kDa, despite implying that the specific band should be "much higher than the 134 kDa calculated from the aminoacid sequence of the protein".

      In Fig. 2 B, they show all the bands shown in Fig. S2 and presumably favor that the specific band is the 250 kDa one. Finally, in Fig. 2D, they show all bands and note that the band above 130 kDa is not specific. Therefore, authors need to conclude what is the specific band and always analyze the same one, and, possibly, use a different antibody or purify this one to remove non-specific binding. Without this, the main result of the paper, cannot be substantiated.

      Answer: We apologize for this misunderstanding. The antibody recognizes three bands, all reduced by siRNA treatment. These three bands are only resolved in the gels in Figure S2A and B, and in Figure 2B. The reason is the high molecular weight of the isoforms, which are resolved in these 8% gels but collapse into one band in the 15% gels shown in Figure 2A and C. 8% gels such as the ones in Figure 2B are therefore needed to resolve the high molecular weight bands.

      Figure 2D shows an 8% gel, and therefore all bands are visible. The band marked by an arrow is only present in the streptavidin pulldowns but not in the input or in the supernatant and is therefore considered unspecific. This has been clarified in the revised figure legend on page 41. In addition, to assess the specificity of the antibody for endogenous PLEKHG3, we used HEK293T cells, which should yield a clearer endogenous signal because PLEKHG3 is most highly expressed in neuronal cells (https://www.proteinatlas.org/ENSG00000126822-PLEKHG3/tissue#expression_summary). The results of this experiment are shown in Figure S2B-C of the revised manuscript. The pattern of PLEKHG3-positive bands is similar to that observed in HeLa cells, and only the band around 250 kDa is clearly reduced by siPLEKHG3. The IF images show a selective loss of the PLEKHG3 signal at actin filaments close to the plasma membrane, whereas the nuclear signal is preserved and is therefore considered non-specific. More importantly, we have now generated a PLEKHG3 CRISPR-Cas KO in HeLa cells. The results, shown in Figure S2G-I, confirm the specificity of our reagents and the localization of PLEKHG3 seen in HEK293T cells. Loss of PLEKHG3, however, had no effect on lysosomal distribution or morphometric parameters compared to WT cells (Figure S4A-C).

      2 - In page 12, authors state that "These results indicated that PLEKHG3 is a transient interactor, or a proximal, not directly binding protein, of L3" and in page 14 that "... PLEKHG3 is a proximal L3 protein rather than a transient physical interactor". It is not clear at all how did the authors reach such conclusions, nor they have data to conclude this. Indeed, they would have to express the proteins in vitro and test their interaction to conclude about a direct binding. They also do not know what is the stability of the interaction.

      Answer: This is also a misunderstanding. We mislabeled Figure 2C as “pulldown”, rather than “IP”, as it is characterized in the text. We revised terminology to “vicinal (proximity-labeled) protein” throughout, avoiding claims on directness. Our basis is: robust L3 TurboID labeling of PLEKHG3; failure to co-immunoprecipitate PLEKHG3 with V5-tagged L3 (Fig. 2C); lack of PLEKHG3 labeling by TMEM192; and unchanged PLEKHG3 FA localization in L3 KO (Fig. S3H–J). Together, these support spatial proximity rather than a stable L3–PLEKHG3 complex. We explicitly state that we did not perform in vitro binding due to the negative co-IP.

      Based on these negative data, we did not proceed to test the possibility of complex formation in vitro.

      3 - Still in page 12, authors state that "... two different membrane structures, protrusions and ruffles". What do the authors mean exactly by "protrusions", as there are several different ones (e.g., lamellipodia, filopodia, pseudopods)? And how can they distinguish between ruffles and, for example, lamellipodia? They need to use markers and more carefully analyze their morphology to be able to distinguish these. Like this, it is too preliminary.

      Answer: Our intention was to use the arrows to indicate the trajectories in the figure along which we measured the MFI of LAMP1 and PLEKHG3. Although this is stated in the figure legend, it apparently gave the impression that the arrows indicated specific membrane structures. Since we are focusing on different types of membrane protrusions rather than ruffles, we replaced the ambiguous terms "ruffles" and "protrusions" with "elongated protrusions" (Figure 3D upper panel), which we compared with "non-elongated protrusions" (Figure 3D lower panel). Indeed, we note that PLEKHG3 accumulation is possible below and along the plasma membrane, but colocalization with lysosomes occurs preferentially in elongated protrusions. We therefore amended the text on page 26, lines 4-9, as follows:

      “More specifically, we found that PLEKHG3 colocalized more strongly with LAMP1-positive vesicles in elongated membrane structures (Figure 3D-E). Focal adhesion sites, which anchor the intracellular cortical actin network to the extracellular matrix and are remodeled with the help of late endosomes/lysosomes during protrusion formation and cell motility, can also be found in such elongated membrane protrusions (reviewed in [58,59]).”

      5 - It is not clear if in cells KO for PLEKHG3, the overexpression of KIF1A leads to more lysosomes localizing close to the PM, as well as more protrusions and more cell motility, as the authors only compare cells overexpressing GFP or GFP-PLEKHG3.

      Answer: We have now generated a PLEKHG3 KO cell line. In these cells, KIF1A still drives peripheral lysosome clustering and suppresses protrusive activity and actin dynamics (see revised Figure S6E-F displayed below). Baseline lysosome distribution and morphometric parameters are unchanged in KO cells (see revised Figure S4A-C displayed below).

      Extract from revised Figure S4A-C: PLEKHG3 KO does not influence lysosomal distribution or cell morphometry: A) Quantification of lysosomal distribution in WT compared to two KO cell lines. N ≥ 50 cells in three biological replicates. B) Schematic representation of analysis of cell shape descriptors as referred to in C). Left picture shows the calculated outline in yellow based on which the cell area and circularity are calculated. Right picture shows the minor and major cell axis which, calculated as fraction, result in the aspect ratio of the cell. Scale bar = 50 µm. C) Quantification of cell morphometric parameters Area, Circularity and Aspect ratio. N ≥ 50 cells in three biological replicates. Black dots represent mean of each biological replicate. Statistical analysis according to Student’s t-test. Error bars = SEM.

      Extract from revised Figure S6E,G: PLEKHG3 localizes to F-actin independently of lysosomal transport but is negligible for the lysosomal effect on protrusive activity. E) Stills from live cell imaging (Movies S18-21). PLEKHG3 WT and PLEKHG3 KO cells were transfected with the indicated mCherry constructs and incubated with LysoTracker. Yellow arrows = forming protrusions; blue arrows = retracting protrusions. Stills were generated over a period of 2 hrs. Scale bar = 50 µm. G) Quantification of protrusions formed and retracted over time in cells from E. Values indicate average number of protrusions formed in a timespan of one hour from a total of ≥ 15 cells per condition. Error bars = SEM. In F,G, black asterisks denote p values according to Kruskal-Wallis and Bonferroni post-hoc testing, comparing the effect of KIF1A against mCherry or PLEKHG3 WT against KO.

      6 - Regarding the statistical analysis, authors assert that it was done using Student's t tests, unless otherwise stated. However, they never refer in figure legends other statistical analysis methods. If so, they cannot use such test, for example, in cases where more than two groups are compared.

      Answer: We clarified in Methods that we performed two-group comparisons unless otherwise stated. Where >2 groups are compared, we used appropriate tests with correction (e.g., Kruskal–Wallis with Bonferroni in Fig. S6B, S6G). Figure legends now explicitly state the test used.
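      As an illustration of the multi-group procedure named above, the sketch below computes the Kruskal–Wallis H statistic (with average ranks and the standard tie correction) and applies a Bonferroni adjustment to a set of pairwise p values. It is a minimal, dependency-free sketch, not the analysis code used for the manuscript; the function names, the example counts and the example p values are hypothetical. In practice a statistics package such as SciPy (`scipy.stats.kruskal`) would also return the p value from the chi-square distribution with k − 1 degrees of freedom.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank.

    Returns the list of ranks (in input order) and the sizes of all tie groups.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for >= 2 groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes) / float(n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical protrusion counts per cell for three conditions
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Hypothetical raw p values from three pairwise post-hoc comparisons
adjusted = bonferroni([0.01, 0.04, 0.5])
```

      A rank-based test of this kind does not assume normally distributed data, which is why it is a common choice for count data such as protrusions per cell.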

      Minor comments:

      1 - In the abstract, authors refer that cytosolic proteins are recruited to platforms on the limiting membrane of lysosomes. What do they mean by "platforms"? Is it microdomains?

      Answer: We apologize for this lack of clarity and have now changed the first sentence in the abstract on page 1 to “Lysosomes are key organelles involved in metabolic signaling pathways through their ability to recruit cytosolic molecules to protein platforms bound to the lysosomal membrane”. We refer to protein platforms as multifunctional protein complexes that can recruit and assemble signaling components (e.g., the recruitment of mTORC1 activating proteins by the LAMTOR complex).

      2 - In the Introduction, there is a period before the reference at the end of the first paragraph.

      Answer: We stand corrected. See changes on page 18, line 9.

      3 - In the results, Fig. 1E is mentioned before Fig. 1D and Figure S1F before Fig S1E, which can be confusing.

      Answer: Figure S1E on page 6 was mislabeled as Figure 1E and Figure S1K on page 9 was mislabeled as Figure 1K. We stand corrected. See changes on page 21, line 21+23 and page 23, line 5.

      4 - All the immunofluorescence images need to be bigger, in general, and have zoom-ins, except Fig. 3A, 4B, 4F, and S2C. Also, in Fig. S1F, the green channel has different intensities and the V5-lyso signal is clearly saturated. Finally, Fig. S1D, S1I and S3F must be enlarged, too.

      Answer: We appreciate the Reviewer's suggestion, but enlarging all the immunofluorescence images and including zoom-ins would make the manuscript very crowded and could distract from the main findings. Regarding the expression levels of the baits, as mentioned in the manuscript, we aimed to express them at near-endogenous levels. However, TMEM192 is expressed at higher levels than LAMTOR3 in these cells, which may have resulted in the observed discrepancy. We hope the Reviewer will understand our decision and find the current presentation of the data clear and informative.

      5 - In page 9, where it reads "Figure 1K", should read "Figure S1K".

      Answer: See answer to minor point 3.

      6 - The observation that PLEKHG3 silencing leads to loss of the perinuclear clustering of LAMP1-positive vesicles, and increase in their accumulation at the cell tips, is not referred in the text.

      Answer: While this might seem the case in part of the cells shown in the representative image in Figure S2C, population-level analysis (n > 30 cells) did not support a shift in lysosome distribution with PLEKHG3 silencing.

      Figure 1 for Reviewer 2: Lysosomal distribution in HeLa cells transfected with either siScr or siPLEKHG3. X-axis is relative distance from the nucleus and Y-axis the normalized intensity of the LAMP1 channel. Results are averages of >30 cells from one experiment (only displayed in “final revision” document).
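      A distribution profile of this kind can be sketched as follows. The snippet bins per-pixel LAMP1 intensities by relative distance from the nucleus and normalizes each bin to the total signal of the cell. Both the binning and the normalization to total signal are assumptions made for illustration, since the legend only states "normalized intensity"; the function name and example values are hypothetical.

```python
def radial_profile(distances, intensities, n_bins=10):
    """Bin intensities by relative distance and normalize to the total signal.

    distances: relative distance of each pixel from the nucleus,
               0.0 (nuclear edge) to 1.0 (cell edge).
    intensities: LAMP1 intensity of each pixel.
    Returns a list of n_bins fractions that sum to 1.
    """
    if len(distances) != len(intensities):
        raise ValueError("distances and intensities must have equal length")
    sums = [0.0] * n_bins
    for d, value in zip(distances, intensities):
        if not 0.0 <= d <= 1.0:
            raise ValueError("relative distance must lie in [0, 1]")
        # a distance of exactly 1.0 falls into the outermost bin
        index = min(int(d * n_bins), n_bins - 1)
        sums[index] += value
    total = sum(sums)
    if total == 0:
        raise ValueError("total intensity is zero")
    return [s / total for s in sums]


# Hypothetical cell: two perinuclear pixels and one peripheral pixel
profile = radial_profile([0.1, 0.1, 0.9], [2.0, 2.0, 4.0], n_bins=2)
```

      Averaging such per-cell profiles across cells gives a curve that can be compared between siScr and siPLEKHG3 conditions; a shift of signal toward the outer bins would indicate peripheral redistribution.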

      Similar results were obtained using two independent PLEKHG3 KO cell lines and are shown in Figure S4A.

      Extract from revised Figure S4A-C: PLEKHG3 KO does not influence lysosomal distribution or cell morphometry: A) Quantification of lysosomal distribution in WT compared to two KO cell lines. N ≥ 50 cells in three biological replicates.

      7 - Fig. 2C is not referred in the legend.

      Answer: We stand corrected and have changed the legend of Figure 2 accordingly on page 41.

      8 - Figure S3A and B: authors should show the colocalization of endogenous PLEKHG3 with phalloidin and not only the GFP-tagged protein.

      Answer: We thank the Reviewer for this comment and have performed this experiment showing the colocalization of endogenous PLEKHG3 with F-actin structures stained by Phalloidin. Even though the endogenous PLEKHG3 staining in HeLa cells is rather weak, sites where membrane protrusions are formed are clearly marked with PLEKHG3 staining below the plasma membrane. These data confirm the specific colocalization of PLEKHG3 with Phalloidin shown in the revised Figure S3A. See also the extract from Figure S3A below.

      Extract from revised Figure S3A: Immunofluorescence images of HeLa cells. A) HeLa cells stained with PLEKHG3 (red) and Phalloidin (green). The nucleus is indicated by DAPI staining (blue). Scale bar = 50 µm. Insets on the right as indicated by white box in image on the left. Scale bar = 10 µm. Line plot corresponds to white line in merged inset.

      9 - In page 14, authors refer to Fig. 3G, which does not exist.

      Answer: We stand corrected, the sentence on page 14, line 9 (now page 26 line 24 in revised document) refers to Figure S3G.

      10 - In page 30 and page 32, different antibodies for LAMP1 and PLEKHG3 are mentioned, but in the figure legends authors do not refer which one they used.

      Answer: We tried different PLEKHG3 antibodies but ended up using only one. The other antibody has been excluded from the list on page 32, line 18 (now page 9, lines 4-5 in revised manuscript). We have specified which LAMP1 antibodies were used in which Figure in the Material and Methods on page 6, line 23 and page 7, line 4-5.

      11 - In page 33, where it reads "300 µm protein", it should probably read "300 µg protein".

      Answer: We stand corrected. See changes on page 10, line 2.


      Reviewer 3

      A key issue … is that the authors focus solely on peripheral lysosomes as target compartments for PLEKHG3. This is not self-evident, particularly in light of images presented in Figures 2 and 3, where colocalization of PLEKHG3 with perinuclear lysosomes appears very likely. The authors should make differences/similarities they observe between effects on perinuclear versus peripheral lysosomes explicit both with data and in the text, if such differences exist.

      Answer: The Reviewer is likely referring to the images in Figure 3, which were obtained by staining endogenous PLEKHG3 and do show diffuse staining around the nucleus. This perinuclear haze is resistant to siPLEKHG3 (revised Figure S2C) and to PLEKHG3 CRISPR-Cas9-mediated ablation (revised Figure S2H), and is not observed with the GFP-PLEKHG3 fusion protein (revised Figure S2E-F), which gives a less diffuse signal. This is why we are confident about the colocalization of PLEKHG3 with peripheral lysosomes.

      Data presented in Figure 6 showing cell motility analysis is interesting and has potential to make the manuscript impactful. Similarly, data in Figure 4F (live cell imaging) looks attractive but is not informative in the absence of relevant genetic perturbations as comparisons. These types of experiments would benefit greatly from PLEKHG3 loss of function analysis, as well as mutational analysis in the over-expression setting.

      Answer: We have now generated a PLEKHG3 KO cell line. Loss of PLEKHG3, however, had no effect on lysosomal distribution or morphometric parameters compared to WT cells, and the suppression of protrusions/actin dynamics by KIF1A is preserved in KO cells, indicating that PLEKHG3 is not required for this phenotype (see revised Figure S6E,G displayed in the answer above).

      Mutational analysis of PLEKHG3–LAMTOR binding is not feasible in the absence of co-IP or other direct binding evidence.

      Minor point: 1. Multicolor overlays with one of the channels in white are in my view not reader-friendly. Appreciating colocalization between endosomes/lysosomes, actin and G is very important for this study, and white is typically reserved to show overlay between green and magenta (standard for 2 channels) or green, red and blue (standard for 3 channels). I therefore advise the authors to choose a different color combination throughout the figures when presenting microscopy images.

      Answer: White as a channel color has been substituted for with red (in the 2- and 3-color images) or with blue (in the 4-color images) throughout the images of the revised manuscript, except for the stills from the videos that have not been changed because no colocalization analysis has been performed in this case.

      3. Description of analyses that authors decided not to carry out

      Reviewer 2

      4 - At least Fig. 2F and 3A need quantification. Regarding cell motility, there is no quantification and the authors must perform a quantitative assay (despite stating that "As another measure of cell motility, analysis of the number of forming protrusions and retracting membranes..."). Not only is this not a measure of cell motility, but there is also the issue of what "protrusions" are, as referred to above. Therefore, authors need to quantify the distance that the cells move and/or perform quantitative motility/migration assays.

      Answer: We appreciate the Reviewer’s attention to detail and agree that the quantification of these figures is essential to understand the results. We believe that the Reviewer refers to Figure 3F and Figure 4A, as there is no Figure 2F, and Figure 3A only confirms the localization of endogenous PLEKHG3, as previously reported (Nguyen et al., PNAS 2016). If our assumption is correct, then the salient aspects of Figure 3F, which is a representative image, are quantified:

      • in Figure 3C-E (endogenous PLEKHG3 colocalization with LAMP1/lysosomes)
      • in Figure 4E and 5F-G (FA with LAMP1/lysosomes).
      • Figure 4A is quantified in Figure 4C-E (GFP-PLEKHG3 colocalization with FAs, labeled with paxillin in this experiment, and LAMP1 colocalization with FAs).

      In response to the Reviewer's comment regarding the absence of quantification for cell movement/migration in our study, we apologize for any confusion that may have arisen from our use of the term "cell motility." We have clarified usage to mean membrane remodeling dynamics integral to migration rather than net displacement. To avoid overclaiming, we removed statements that could imply directed migration and focused on protrusion/retraction metrics and shape changes. In this context, our statement that lysosomal subcellular localization plays a role in cell motility remains valid. The relationship between membrane protrusive activity and motility is evident from our observations in cells overexpressing KIF1A-mCherry, where both membrane remodeling/protrusive activity and movement are significantly impaired compared to control cells (refer to Movie S7 vs. S6 and S10 vs. S9).

      To address the Reviewer's concern, we have clarified our definition of motility in the introduction by stating on page 19, line 23 – page 20, line 2: "We demonstrate that PLEKHG3 colocalizes with lysosomes at focal adhesion and protrusion sites, and that the localization and function of this protein – and consequently, overall cell motility – are fundamentally dependent on lysosomal dynamics." This revision ensures that our results are accurately described and minimizes any potential confusion. Additionally, we have removed the statement on page 23, line 1 of the original manuscript. We apologize for any confusion our original wording may have caused and appreciate the opportunity to clarify our intentions.

      Reviewer 3

      1. The mechanism of PLEKHG3 action on lysosomes/late endosomes is underdeveloped in my view. In the absence of for instance mutational analyses to examine what drives the interaction of PLEKHG3 with LAMTOR3, as well as delineation of at least some molecular consequences of this binding, the study remains incomplete.

      Answer: We are grateful for the Reviewer's feedback and concur that gaining insight into the mechanistic details of PLEKHG3's interaction with LAMTOR3 would be beneficial. We now consistently refer to PLEKHG3 as an L3 vicinal protein based on TurboID and lack of co-IP. Because L3 KO does not alter PLEKHG3 FA localization and we find no evidence for a stable complex, mutational binding analyses lack a clear readout and are beyond the scope of this revision. We emphasize the conceptual advance (lysosome positioning gates PLEKHG3 cortical enrichment at FAs, while peripheral lysosome clustering correlates with more adhesive, less protrusive behavior) and explicitly flag mechanistic questions (e.g., integrin turnover, IQGAP, Rac1 pools) as future work.

      We hope that the Reviewer will bear with us on this point, considering the novelty of our findings, which illuminate the interplay between lysosomes and actin dynamics as well as the role of PLEKHG3 in regulating cell protrusions—findings not previously reported in the literature.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Ettelt et al describes the identification of PLEKHG3 as a collaborator of the LAMTOR complex on lysosomes using proximity-based biotinylation. The biotinylation screen is well executed and controlled. The authors choose to follow up on PLEKHG3, a protein involved in actin dynamics, which they refer to as understudied (I let the validity of the latter statement to be evaluated by the editor). Generally speaking, the data are of good quality, and the manuscript is clear and well written. However, much of the evidence on the role of PLEKHG3 on lysosomes is suggestive at best and further investigation into its mechanisms of action is warranted. Some important points to address prior to publication are detailed below.

      Major Points:

      1. The mechanism of PLEKHG3 action on lysosomes/late endosomes is underdeveloped in my view. In the absence of for instance mutational analyses to examine what drives the interaction of PLEKHG3 with LAMTOR3, as well as delineation of at least some molecular consequences of this binding, the study remains incomplete.
      2. A key issue possibly (but not necessarily) related to the point above is that the authors focus solely on peripheral lysosomes as target compartments for PLEKHG3. This is not self-evident, particularly in light of images presented in Figures 2 and 3, where colocalization of PLEKHG3 with perinuclear lysosomes appears very likely. The authors should make differences/similarities they observe between effects on perinuclear versus peripheral lysosomes explicit both with data and in the text, if such differences exist.
      3. Data presented in Figure 6 showing cell motility analysis is interesting and has potential to make the manuscript impactful. Similarly, data in Figure 4F (live cell imaging) looks attractive but is not informative in the absence of relevant genetic perturbations as comparisons. These types of experiments would benefit greatly from PLEKHG3 loss of function analysis, as well as mutational analysis in the over-expression setting.

      Minor point

      1. Multicolor overlays with one of the channels in white are in my view not reader-friendly. Appreciating colocalization between endosomes/lysosomes, actin and G is very important for this study, and white is typically reserved to show overlay between green and magenta or green (standard for 2 channels), red and blue (standard for 3 channels). I therefore advise the authors to choose a different color combination throughout the figures when presenting microscopy images.

      Significance

      In principle, I consider this study to be of interest to the community of cell biologists working on the endolysosomal system and/or the actin cytoskeleton and its relationship to intracellular membranes. However, the authors find themselves in a rather crowded field. I feel that developing the mechanism of action of PLEKHG3 on lysosomes beyond this first submission could help with boosting the impact of the study. There is clearly something interesting going on, but what that is exactly, remains unclear in my view.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary: The authors use proximity-dependent labelling and mass spectrometry to identify cytoplasmic proteins that interact with lysosomes. They show that PLEKHG3 interacts with the LAMTOR complex; that PLEKHG3 accumulates in focal adhesion sites, where it colocalizes with peripheral lysosomes; and that the increased translocation of lysosomes to the periphery leads to fewer "protrusions", as well as rounder and less motile cells.

      Major comments: While the study is generally carefully performed and thorough, there are major shortcomings that affect the conclusions taken, namely the specificity of the PLEKHG3 antibody, the identification of "protrusions" and ruffles, several quantifications missing, and the data used to conclude about cell motility. There are also conclusions for which there is no concrete or solid evidence.

      Specific issues:

      1. Specificity of PLEKHG3 antibody: In Fig. S2, authors show that PLEKHG3 antibody recognizes 3 bands (above 100 kDa, above 130 kDa and 250 kDa) and all of them are reduced by the silencing of PLEKHG3. Then, in Fig. 2A and C, authors only show the band above 130 kDa, despite implying that the specific band should be "much higher than the 134 kDa calculated from the aminoacid sequence of the protein". In Fig. 2B, they show all the bands shown in Fig. S2 and presumably favor that the specific band is the 250 kDa one. Finally, in Fig. 2D, they show all bands and note that the band above 130 kDa is not specific. Therefore, authors need to conclude what is the specific band and always analyze the same one, and, possibly, use a different antibody or purify this one to remove non-specific binding. Without this, the main result of the paper cannot be substantiated.
      2. In page 12, authors state that "These results indicated that PLEKHG3 is a transient interactor, or a proximal, not directly binding protein, of L3" and in page 14 that "... PLEKHG3 is a proximal L3 protein rather than a transient physical interactor". It is not clear at all how the authors reached such conclusions, nor do they have data to conclude this. Indeed, they would have to express the proteins in vitro and test their interaction to conclude about a direct binding. They also do not know what the stability of the interaction is.
      3. Still in page 12, authors state that "... two different membrane structures, protrusions and ruffles". What do the authors mean exactly by "protrusions", as there are several different ones (e.g., lamellipodia, filopodia, pseudopods)? And how can they distinguish between ruffles and, for example, lamellipodia? They need to use markers and more carefully analyze their morphology to be able to distinguish these. Like this, it is too preliminary.
      4. At least Fig. 2F and 3A need quantification. Regarding cell motility, there is no quantification and the authors must perform a quantitative assay (despite stating that "As another measure of cell motility, analysis of the number of forming protrusions and retracting membranes..."). Not only is this not a measure of cell motility, but there is also the issue, raised above, of what "protrusions" are. Therefore, authors need to quantify the distance that the cells move and/or perform quantitative motility/migration assays.
      5. It is not clear if in cells KO for PLEKHG3, the overexpression of KIF1A leads to more lysosomes localizing close to the PM, as well as more protrusions and more cell motility, as the authors only compare cells overexpressing GFP or GFP-PLEKHG3.
      6. Regarding the statistical analysis, authors assert that it was done using Student's t tests, unless otherwise stated. However, the figure legends never mention any other statistical analysis method. If so, they cannot use such a test, for example, in cases where more than two groups are compared.
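
      The concern in point 6 can be illustrated with a short calculation. The sketch below is not part of the manuscript's analysis; it assumes a nominal significance level of 0.05 and, as a simplification, independent pairwise tests, and the helper name `familywise_error_rate` is hypothetical. It shows how the chance of at least one false positive grows when every pair of groups is compared with its own uncorrected t test.

```python
from itertools import combinations


def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of >= 1 false positive).

    Assumes each pairwise test is independent and run at level `alpha`
    without correction; real pairwise tests share data, so this is only
    an approximation of the inflation.
    """
    n_tests = len(list(combinations(range(n_groups), 2)))
    return n_tests, 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n_groups in (2, 3, 4, 5):
        n_tests, fwer = familywise_error_rate(n_groups)
        print(f"{n_groups} groups -> {n_tests} pairwise tests, "
              f"family-wise error rate ~ {fwer:.3f}")
```

      With two groups the rate stays at the nominal level, which is why a single t test is appropriate there; with three or more groups, an omnibus test (for example a one-way ANOVA) followed by a corrected post hoc comparison is the usual alternative.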

      Minor comments:

      1. In the abstract, authors refer that cytosolic proteins are recruited to platforms on the limiting membrane of lysosomes. What do they mean by "platforms"? Is it microdomains?
      2. In the Introduction, there is a period before the reference at the end of the first paragraph.
      3. In the results, Fig. 1E is mentioned before Fig. 1D and Figure S1F before Fig S1E, which can be confusing.
      4. All the immunofluorescence images need to be bigger, in general, and have zoom-ins, except Fig. 3A, 4B, 4F, and S2C. Also, in Fig. S1F, the green channel has different intensities and the V5-lyso signal is clearly saturated. Finally, Fig. S1D, S1I and S3F must be enlarged, too.
      5. In page 9, where it reads "Figure 1K", should read "Figure S1K".
      6. The observation that PLEKHG3 silencing leads to loss of the perinuclear clustering of LAMP1-positive vesicles, and increase in their accumulation at the cell tips, is not referred to in the text.
      7. Fig. 2C is not referred to in the legend.
      8. Figure S3A and B: authors should show the colocalization of endogenous PLEKHG3 with phalloidin and not only the GFP-tagged protein.
      9. In page 14, authors refer to Fig. 3G, which does not exist.
      10. In page 30 and page 32, different antibodies for LAMP1 and PLEKHG3 are mentioned, but in the figure legends authors do not state which one they used.
      11. In page 33, where it reads "300 µm protein", it should probably read "300 µg protein".

      Significance

      The study provides evidence that lysosome positioning can affect cortical actin cytoskeleton dynamics, as well as cell shape and motility. Experiments are in general thorough and data subjected to quantification. However, there are fundamental conclusions that are preliminary at this stage, as some of the data is not yet solid enough. Therefore, it needs to be further strengthened to be considered for publication. In general, it reads well but the amount of abbreviations (e.g. in the case of the constructs) makes it somehow difficult to follow. The study will be interesting for the cell biology, membrane trafficking and cytoskeleton dynamics communities.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Ettelt et al. describes identification of the Rho guanine nucleotide exchange factor PLEKHG3 as one of the positive hits from a TurboID proximity-dependent labeling screen using LAMTOR3 (one of the subunits of the pentameric LAMTOR complex/Ragulator) as a bait protein. The authors find that PLEKHG3 colocalizes with lysosomes at focal adhesions and that peripheral clustering of lysosomes promotes PLEKHG3 localization near the plasma membrane, and also inhibits protrusion formation and cell motility. The experiments, particularly the TurboID proximity-dependent labeling screen, are well-executed, and the imaging data is aptly quantified. The manuscript explores an exciting question of how lysosome positioning regulates cortical actin dynamics and thereby cell motility.

      Major comments:

      • The colocalization of endogenous PLEKHG3 and LAMP1 as depicted in figures 3B and 3C (data from fixed cells) is not convincing. PLEKHG3 appears to be present on cortical actin structures as opposed to being colocalized with LAMP1 on lysosomes. The authors should also confirm the specificity of the PLEKHG3 antibody in immunofluorescence using control and PLEKHG3 siRNA in untransfected cells that have not been transfected with GFP-PLEKHG3 (as is shown in Fig. S2C). Numerous antibodies recognize the overexpressed protein but do not recognize the same protein at endogenous expression levels.

      Moreover, do the authors observe colocalization between GFP-PLEKHG3 and lysotracker in living cells? There is no apparent colocalization of PLEKHG3 and lysotracker in the movie S5.

      • The authors observe that GFP-PLEKHG3 is concentrated at the cell's periphery when KIF1A is overexpressed, whereas RUFY3 overexpression results in more cytosolic staining. To bolster their conclusion that a change in lysosomal positioning alters the subcellular localization of PLEKHG3, it is preferable to employ inducible techniques, such as the recently described "reversible association with motor proteins" (RAMP) (PMID: 31100061). The method is a rapid and reversible method for altering organelle positioning. It is still unknown whether PLEKHG3 is associated with lysosomes and by what mechanism the positioning of lysosomes affects PLEKHG3 localization.

      • Similarly to the preceding point, the claim that "peripheral accumulation of lysosomes inhibits protrusion formation and limits cell motility" should be tested more rigorously using the RAMP method, preferably in living cells. Other approaches, such as overexpression/siRNA of Arl8b and other motor adaptors, such as SKIP/PLEKHM2, can be used to alter lysosome positioning and confirm this central finding of the manuscript. The authors could also consider including additional mechanistic data in order to comprehend how lysosome positioning controls cell motility. For instance, the RAMP approach could be employed to investigate cortical actin dynamics upon repositioning of lysosomes to the peripheral/perinuclear region.

      • It is not clear how the authors conclude that Figure 4E graph shows "the LAMP1 signal was stronger in paxillin-labeled FA compared to control regions". The 4E graph shows LAMP1 signal in GFP versus GFP-PLEKHG3 and shows a modest enrichment of LAMP1 in FAs in GFP-PLEKHG3 overexpression. LAMP1 enrichment in FAs is also not obvious in the image shown in Figure 4B.

      • In Fig. 2B, there appears to be a labeling error. The lanes 2, 4 and 7 appear to be transfected with L3-T-V5 but labeled as GFP-V5-cyto. Here the PLEKHG3 band should be indicated.

      • Fig. 2C is an IP experiment as per the manuscript text but it is labeled as pulldown.

      Significance

      Using a TurboID proximity-dependent labelling screen, the authors identified an interesting subset of actin-remodeling proteins that interact with the lysosomal protein LAMTOR3. The authors further characterized one of these proteins, PLEKHG3, and found that lysosome positioning regulates PLEKHG3 localization, as well as plasma membrane protrusion formation and cell motility. This study suggests that lysosome peripheral accumulation could regulate cortical actin remodelling and consequently cell migration by regulating PLEKHG3 localization (although this is not tested in the manuscript). This study adds to the previous findings that microtubule-based transport of late endosomes regulates delivery of late endosomal LAMTOR proteins to the vicinity of focal adhesions, which in turn regulate focal adhesion dynamics. The mechanism of how lysosomes can influence actin remodeling will be important in the context of cancer cell migration. My area of expertise is lysosome fusion and motility and I have limited expertise in regulation of actin dynamics and how Rho family members regulate actin remodeling.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, the authors investigate the role of Transcription Termination Factor 2 (TTF2) in the regulation of mitotic transcription. Using siRNA-mediated knockdown in two distinct human cell types combined with nascent RNA labeling (EU pulse), the authors identify an unexpected role for TTF2 in the timing of RNA Polymerase I (Pol I) reactivation following mitosis. The study suggests that this temporal misregulation may have downstream consequences for nucleolar morphology and function in interphase. The manuscript is well-written, and the figures are of high quality and clearly presented.

      Major comments:

      1. A primary limitation of the current study is that it does not deeply explore the underlying mechanism of the observed phenomenon. To strengthen the claims, the following points should be addressed:

      1a. Directionality of Phenotype: In Page 9, the authors conclude that TTF2 depletion is linked to abnormal nucleolar organization during interphase. It remains unclear if this is a direct result of mitotic misregulation or an independent interphase effect. To distinguish between these possibilities, I suggest the following experiment: perform a mitotic shake-off early in the siRNA treatment (~24h), collect mitotic cells, and allow them to re-enter G1 to image for nascent RNAs and nucleolin. This would clarify if the mitotic defect precedes and causes the interphase morphology changes. Alternatively, the authors should state that their current study cannot distinguish between these two possibilities.

      1b. Secondary Effects: The long duration of siRNA treatment (48h) raises the possibility that TTF2 knockdown misregulates the expression of other Pol I regulatory factors, leading to secondary effects. This limitation should be explicitly acknowledged in the Discussion.

      2. The term "significant" is used throughout the manuscript without accompanying statistical testing.

      2a. Please provide statistical analyses (e.g., p-values) for the average plots in Figures 1-3 to substantiate the findings.

      2b. Where statistics are not performed, the language should be softened to "notable" or "observed increase" rather than "significant."

      3. siRNA knockdowns are generally supported by quantification. Please provide the percentage reduction of the target protein by quantifying the blots provided in the supplemental figures.

      4. To rule out the possibility that the increased nucleolin signal observed after TTF2 KD is simply due to higher protein abundance, the authors should perform a western blot to confirm that total nucleolin protein levels remain unchanged upon TTF2 depletion.

      Minor comments:

      1. The abstract and discussion refer to the role of TTF2 as a "conserved" process. As the study only tested human cell lines, "conserved" is technically inaccurate (as it implies evolutionary comparison). I recommend using "general" or "cell-type independent."
      2. While the Methods section is detailed, the Results section would benefit from brief descriptions of the treatments to improve flow.

      Example revision (Page 4): "...we treated two distinct cell lines with control and TTF2-specific siRNAs for 48 hours, followed by a 30-minute EU pulse to label nascent RNAs. Click chemistry and Hoechst labeling enabled 2-color imaging of mitotic chromosomes and nascent RNA..."

      3. The data generally agree across both cell types; however, the presence of clustered signals in HeLa metaphase chromosomes is a notable divergence. It would be beneficial to include speculation in the Discussion on whether this represents a failure to silence Pol I transcription or an even earlier reactivation, and what this implies about a cancer cell line.

      Significance

      General assessment:

      The study is strong in its use of two different cell systems, providing confidence that the observed effects are not cell-line-specific. The figures are beautifully presented and the writing is clear. The primary limitation is the lack of mechanistic depth regarding how TTF2 specifically interfaces with the Pol I machinery compared to its known roles with Pol II.

      Advance:

      This work reports a previously unrecognized role for TTF2 in the temporal control of Pol I reactivation. While TTF2 is well-known for its role in terminating transcription and facilitating Pol II release during mitosis, its specific influence on the nucleolar transcription cycle provides a new perspective on how cells transition out of the mitotic state.

      Audience:

      This research will be of interest to researchers in the fields of gene regulation, the cell cycle, and nucleolar biology. Because it touches on the fundamental process of how transcriptional machinery is reset after cell division, it has implications for the broader basic research community interested in epigenetic memory and cellular identity.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this paper by the Oliveira lab a new perspective on TTF2 function during mitosis is proposed. Known for its role to terminate transcription at mitotic onset, this paper further shows an exciting involvement of TTF2 to schedule timely rDNA transcription at mitotic exit. Moreover, this role is shown to have a clear importance in the structuration of nucleoli, since TTF2 depletion is associated with premature partial assembly of nucleoli on the mitotic chromosomes and, subsequently, to fragmented nucleoli in interphase. These conclusions, which are well supported by imaging data, are original, interesting and, in fact, largely unexpected.

      The paper is very simple in its execution, based on siRNA depletion of TTF2 and monitoring of transcription by imaging using EU incorporation and rRNA-FISH, as well as nucleoli morphology and dynamics using immunostainings. Yet, it is well executed and has no major caveats. However, the authors should consider the following:

      1. The Teves lab has shown that TBP is a key factor maintaining its binding at rDNA loci during mitosis, enabling a prompt reactivation of rRNA production in interphase (Kwan et al. RNA 2024). This paper should be discussed in light of the current findings.
      2. In relation to the previous comment, I would strongly recommend the authors to analyse TBP-depleted cells, ideally using the line generated in the Teves lab, to address whether delayed rDNA transcription after mitosis leads to delayed nucleoli structuration. This assay would allow them to further confirm their model.
      3. In addition, it would be important to test if in the absence of mitotic TBP, the depletion of TTF2 does also lead to mitotic transcription.

      Significance

      Strengths: originality of the observation and simplicity of the experimental setup

      Limitations: exclusively based on imaging data

      Advance: completely unanticipated observation

      Audience: general readers interested in gene regulation

      My expertise: gene regulation through mitosis

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2026-03407

      Corresponding author(s): Laura Cantini, Julio Saez-Rodriguez

      1. General Statements [optional]

      We thank both reviewers for their thorough and constructive evaluation of our manuscript.

      Reviewer 1 highlighted that the manuscript would benefit from 1) a stronger positioning of ReCoN within the existing literature on multicellular modelling and network exploration, 2) a justification of our methodological choices, including the use of Random Walk with Restart (RWR), 3) a justification of the choice of input datasets for GRN inference and an assessment of the robustness of ReCoN's predictions to noise in these networks, and 4) a more systematic exploration of ReCoN's parameter space (restart probability, layer transition probabilities, filtering thresholds).

      Reviewer 2 raised concerns about 1) the generalisability of the α parameter value (by default, 0.8) across independent datasets, 2) the expected contribution of the indirect effect in prediction performances, 3) the robustness of GRN across datasets and systems, and 4) the need for more quantitative validation in the spatial/microenvironment showcase. They also pointed out an unsupported claim regarding gene knockout prediction in the abstract.

      Several clarifications on figures, methods, and writing were also requested by both reviewers.

      As the main addition to the manuscript, we propose a new showcase based on the recently published Human Cytokine Dictionary (Oesinghaus et al., 2025). This showcase will simultaneously address several reviewer concerns by allowing us to 1) test the robustness and performance of α = 0.8 in an independent dataset, and 2) evaluate the impact of different GRN inference methods (HuMMuS, SCENIC+, CellOracle, GRNBoost2) and noise on ReCoN's predictions.

      We will conduct a systematic parameter exploration on the Heart Atlas showcase, covering restart probability and inter-layer transition probabilities. We will additionally strengthen the validation of the microenvironment showcase by providing additional comparison to matched single-cell fibroblast data.
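
      For readers unfamiliar with the restart probability mentioned above, a generic Random Walk with Restart iteration can be sketched as follows. This is a minimal, single-layer illustration on a hypothetical four-node toy graph; it is not the ReCoN or MultiXrank implementation, it does not model inter-layer transition probabilities, and the function name `random_walk_with_restart`, the toy graph and the swept values are assumptions made only for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    `adjacency` is a square list-of-lists of non-negative edge weights,
    `seeds` is a list of node indices where the walker restarts.
    """
    n = len(adjacency)
    # Column-normalise so that W[i][j] is the probability of stepping j -> i.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Restart distribution: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy graph: an undirected path 0 - 1 - 2 - 3.
toy_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_graph, seeds=[0], restart=0.7)

if __name__ == "__main__":
    # A simple sweep over the restart probability (values are illustrative).
    for r in (0.3, 0.5, 0.7, 0.9):
        swept = random_walk_with_restart(toy_graph, seeds=[0], restart=r)
        print(r, [round(x, 4) for x in swept])
```

      In this toy setting, a higher restart probability keeps more of the score on the seed node, while a lower value lets it diffuse further along the graph; a parameter exploration of this kind checks how sensitive the resulting ranking is to that choice.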

      Regarding the manuscript, we will substantially expand the discussion to better contextualise ReCoN within existing multicellular modelling approaches, and expand the methods to justify our methodological choices (RWR/MultiXrank, dataset selection). We will remove the unsupported gene knockout claim from the abstract and reframe it as a future direction. In addition, we will clarify the distinction between ReCoN variants, rename them for clarity in results section 1.2, and improve figure legends. Finally, we will also work on the tool's documentation, including new tutorials on using spatial data and on running ReCoN with scRNA-seq-only GRN inference.

      We believe these revisions will substantially strengthen the manuscript and address the reviewers' concerns regarding the method's robustness, generalisation, and contextualisation.

      2. Description of the planned revisions

      Reviewers' comments are in blue

      Authors' answers are in black

      Proposed text modifications are in green

      Reviewer #1

      R1.1. This is a very well-written paper; the methods used are adequate, and the use cases are relevant and broad, exploiting state-of-the-art datasets and tools.

      The authors' claims are mostly justified. The authors could make an effort to more explicitly cite other efforts in similar directions. The claim 'We envision ReCoN as an extension to prior multicellular modelling, offering an interesting compromise between prediction of cell type responses and understanding of their molecular coordination.' is very general and could be better substantiated. In fact, the authors do not really give examples of alternative approaches to study systems of interacting cells, other than mechanistic agent-based models, which are clearly very different.

      Response:

      We thank the reviewer for pointing out the lack of contextualisation for ReCoN in this closing discussion.

      We would like to point out that ReCoN builds notably on multicellular factor decomposition methods. We also want to emphasise the value of complementing cell communication methods that describe the big picture of multicellular interactions.

      We propose to state these two points explicitly with the following rephrasing:

      Network-based representations of multicellular systems have been an active field for many years, from early conceptual cytokine networks (Frankenstein, Alon, and Cohen 2006) to curated ligand-receptor cascades of hematopoietic tissue (Kirouac et al. 2010, Qiao et al. 2014). In parallel, and from bulk RNA-seq, the consideration of tissue specificities in GRN inference has been another way to consider the importance of the context in the reconstruction of molecular mechanisms (Sonawane et al. 2017). Single-cell analysis allowed decomposing tissue composition and quantifying gene expression, opening the possibility of scaling the inference of these networks, and the inference of multicellular mechanisms in general, to large sets of molecules. Several methods have been developed to recover multicellularity. A first direction extends ligand-receptor interaction inference into the receiver cell response through curated signalling cascades, yielding ligand-to-target cascades (Browaeys, Saelens, and Saeys 2020, Jin et al. 2021, Zhang et al. 2021, Yan et al. 2025). A second direction leverages spatial context through explainable multi-view models that decompose marker variation into both intra- and intercellular contributions (Arnol et al. 2019, Tanevski et al. 2022), without considering the mediating cascades. Finally, the more recent family of multicellular factor decomposition methods focuses on the coordinated aspect of cellular programs rather than on the mechanisms. ReCoN's methodology proposes a network-based approach based on single-cell data and the philosophy of this last group of methods. Indeed, ReCoN aims to retrieve links between molecular drivers and such coordinated multicellular programs by bridging and exploring CCC inference and GRN modelling (Badia-i-Mompel et al. 2023) within a large and coherent heterogeneous multilayer network.

      Arnol D, Schapiro D, Bodenmiller B et al. Modeling Cell-Cell Interactions from Spatial Molecular Data with Spatial Variance Component Analysis. Cell Rep 2019;29(1):202-211.e6. https://doi.org/10.1016/j.celrep.2019.08.077.

      Badia-i-Mompel P, Casals-Franch R, Wessels L et al. Comparison and evaluation of methods to infer gene regulatory networks from multimodal single-cell data. Preprint, bioRxiv, 21 Dec. 2024, 2024.12.20.629764. https://doi.org/10.1101/2024.12.20.629764.

      Badia-i-Mompel P, Wessels L, Müller-Dott S et al. Gene regulatory network inference in the era of single-cell multi-omics. Nat Rev Genet 2023;24(11):739-54. https://doi.org/10.1038/s41576-023-00618-5.

      Browaeys R, Saelens W, Saeys Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods 2020;17(2):159-62. https://doi.org/10.1038/s41592-019-0667-5.

      Frankenstein Z, Alon U, Cohen IR. The immune-body cytokine network defines a social architecture of cell interactions. Biol Direct 2006;1(1):32. https://doi.org/10.1186/1745-6150-1-32.

      Jin S, Guerrero-Juarez CF, Zhang L et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun 2021;12(1):1088. https://doi.org/10.1038/s41467-021-21246-9.

      Kirouac DC, Ito C, Csaszar E et al. Dynamic interaction networks in a hierarchically organized tissue. Mol Syst Biol 2010;6(1):MSB201071. https://doi.org/10.1038/msb.2010.71.

      Oesinghaus L, Becker S, Vornholz L et al. A single-cell cytokine dictionary of human peripheral blood. Preprint, bioRxiv, 15 Dec. 2025, 2025.12.12.693897. https://doi.org/10.64898/2025.12.12.693897.

      Qiao W, Wang W, Laurenti E et al. Intercellular network structure and regulatory motifs in the human hematopoietic system. Mol Syst Biol 2014;10(7):MSB145141. https://doi.org/10.15252/msb.20145141.

      Radig J, Droit R, Doncevic D et al. Tracking biological hallucinations in single-cell perturbation predictions using scArchon, a comprehensive benchmarking platform. Preprint, bioRxiv, 27 June 2025, 2025.06.23.661046. https://doi.org/10.1101/2025.06.23.661046.

      Sonawane AR, Platig J, Fagny M et al. Understanding Tissue-Specific Gene Regulation. Cell Rep 2017;21(4):1077-88. https://doi.org/10.1016/j.celrep.2017.10.001.

      Tanevski J, Flores ROR, Gabor A et al. Explainable multiview framework for dissecting spatial relationships from highly multiplexed data. Genome Biol 2022;23(1):97. https://doi.org/10.1186/s13059-022-02663-5.

      Yan L, Cheng J, Nie Q et al. Dissecting multilayer cell-cell communications with signaling feedback loops from spatial transcriptomics data. Genome Res published online 12 May 2025. https://doi.org/10.1101/gr.279857.124.

      Zhang Y, Liu T, Hu X et al. CellCall: integrating paired ligand-receptor and transcription factor activities for cell-cell communication. Nucleic Acids Res 2021;49(15):8520-34. https://doi.org/10.1093/nar/gkab638.

      R1.2. Moreover, the exploration of the multilayer networks with RWR is a very reasonable choice but could there be other approaches? I think the authors could discuss this issue to briefly support their choice of this method.

      Response:

      It is a very relevant comment, as this choice was not discussed in the paper; we propose extending the methods section on ReCoN's network exploration with a justification of this choice.

      There is currently a limited set of network exploration methods implemented for multilayer networks. It notably includes pymnet (Nurmi et al., 2024), natively adapted to heterogeneous multilayer networks, as well as multinet (Bagavathi et al., 2019) and muxViz (De Domenico et al., 2015), initially developed for multiplex networks (e.g. a social network where the same set of nodes is present in each layer) but adaptable to more complex multilayer networks. However, to our knowledge, only MultiXrank provides a robust measure of proximity between each pair of nodes.

      Indeed, pymnet does not implement pairwise distances, and neither does muxViz, which focuses on community and motif detection. multinet does provide a pairwise distance based on shortest paths, but implements it only for nodes of the same multiplex (e.g. in our network, only between two genes or between two receptors). https://www.rdocumentation.org/packages/multinet/versions/4.3.2/topics/multinet.distance

      We provide the following additional justification for choosing RWR and MultiXrank over reimplementing or extending another method.

      • The total complexity of the RWR is O(δm) when the number of nodes is negligible compared to the number of edges, with m the number of edges and δ the number of iterations in the walk (Baptista et al., 2022 - Supp Notes 2.A; Jin W. et al., 2019). This linear increase with the number of edges is particularly valuable for large networks such as ReCoN's, which can contain several million edges. The number of iterations δ, and hence the computational time, increases as the restart probability decreases, which is an important reason to keep this probability high.

      • MultiXrank is particularly attractive for its flexibility, as it makes it easy to attribute different weights to the different layers and to specify the direction of the exploration.

      • It also produces deterministic results by continuing the exploration until convergence.

      • Additionally, in the context of ReCoN, the indirect effect of each cell is computed independently. In previous work (Trimbour et al., 2024), we extended the implementation of MultiXrank to run RWR in parallel, making it already suited to optimising ReCoN's explorations.

      For all these reasons, the MultiXrank implementation seemed to be the best choice for robust and efficient exploration of ReCoN's HMLN.
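The complexity and convergence arguments above can be illustrated with a minimal sketch of a random walk with restart computed by power iteration. This is not the MultiXrank implementation: the function name, the toy path graph, and the tolerance are assumptions chosen for illustration only. Each iteration visits every edge once, which is where the O(δm) cost comes from, and the iteration count grows as the restart probability decreases.

```python
def random_walk_with_restart(edges, n_nodes, seed, restart, tol=1e-10, max_iter=10_000):
    """Power-iteration RWR on a directed, weighted edge list (illustrative only).

    Each iteration touches every edge once, so the total cost is O(delta * m).
    Assumes every node has at least one outgoing edge.
    """
    out_weight = [0.0] * n_nodes
    for source, _, weight in edges:
        out_weight[source] += weight

    scores = [0.0] * n_nodes
    scores[seed] = 1.0
    for iteration in range(1, max_iter + 1):
        updated = [0.0] * n_nodes
        updated[seed] = restart  # probability mass sent back to the seed
        for source, target, weight in edges:
            updated[target] += (1.0 - restart) * scores[source] * weight / out_weight[source]
        change = sum(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tol:  # deterministic stop: iterate until convergence
            return scores, iteration
    return scores, max_iter


# Hypothetical toy network: undirected path 0 - 1 - 2 - 3, two directed edges per link.
links = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
toy_edges = links + [(v, u, w) for u, v, w in links]

scores_high, iters_high = random_walk_with_restart(toy_edges, 4, seed=0, restart=0.7)
scores_low, iters_low = random_walk_with_restart(toy_edges, 4, seed=0, restart=0.2)
```

On this toy graph the scores sum to one for both settings, and the run with the lower restart probability needs more iterations to converge.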

      Bagavathi, A., Krishnan, S. (2019). Multi-Net: A Scalable Multiplex Network Embedding Framework. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 813. Springer, Cham. https://doi.org/10.1007/978-3-030-05414-4_10

      Manlio De Domenico, Mason A. Porter, Alex Arenas, MuxViz: a tool for multilayer analysis and visualization of networks, Journal of Complex Networks, Volume 3, Issue 2, June 2015, Pages 159-176, https://doi.org/10.1093/comnet/cnu038

      Nurmi et al., (2024). pymnet: A Python Library for Multilayer Networks. Journal of Open Source Software, 9(99), 6930, https://doi.org/10.21105/joss.06930

      Jin, Woojeong, Jinhong Jung, and U. Kang. "Supervised and extended restart in random walks for ranking and link prediction in networks." PloS one 14.3 (2019): e0213857

      R1.3. Generally the discussion should provide the reader the context in the existing literature in which the work can be set, detailing its impact. I think this could be improved.

      Response:

      We hope that the revised context proposed in response to comment R1.1 offers a first clarification of how the work fits in the literature.

      We also propose to extend the description of ReCoN's impact with the following sentences in the discussion: "Unlike purely data-driven approaches, ReCoN contextualizes prior knowledge, balancing robustness through literature data with specificity through new measurements. This mechanistic approach opens new possibilities for understanding how cellular coordination shapes tissue-level responses and for designing targeted molecular interventions."

      R1.4. Regarding the choice of datasets, it is clear that the method is quite demanding, requiring single cell and different omics to build the model, in addition to the expression dataset that is used as a use case. This inevitably leads to using a mix of datasets.

      For example in the mouse experiments the gene regulatory network was inferred from both a lymph node scRNA-seq dataset and a splenic scATAC-seq dataset, presumably due to the lack of multiome data in this setting. However the cell-cell communication network was inferred from the control case of the Immune Dictionary. Why can't the authors use the control data also for inferring GRNs?

      Is atac-seq really necessary in the inference of the GRN? What is the impact of the fact that lymph node and spleen samples might be different?

      Response:

      It is a very interesting comment, and we propose to add both 1) an explanation of our dataset choice for generating the GRN, as a Supplementary text, and 2) a new experiment on the effect of GRNs built from multi-omics versus scRNA-seq alone.

      • Dataset choice

      We decided to infer a GRN using multiomics data, as these methods seem to perform better and are becoming the state of the art (Badia-i-Mompel et al. 2023, Trimbour, Deutschmann, and Cantini 2024, Yuan and Duren 2025).

      As scATAC-seq data was not produced for the Mouse Immune Dictionary, we looked for an external dataset and used HuMMuS, the method we previously developed, as it is also based on RWR and performs well on unpaired data.

      scATAC-seq

      Our first criterion was to match the mouse model used in the Immune Dictionary dataset, which substantially reduced the number of multicellular immune cell datasets available. We extended our search to a splenic dataset, as the spleen is itself classified as a highly specialised lymphatic structure (check) and notably contains the same cell types as classical lymph nodes.

      scRNA-seq

      While we could technically use the single-cell RNA-seq data from the control mice of the Immune Dictionary together with the spleen scATAC-seq data, the Immune Dictionary only provides 100 or fewer cells for each cell type per stimulation, which would result in a low number of cells. As GRN quality seems to depend strongly on the number of cells used, we favoured a larger dataset.

      Our choice to use single-cell multiomics methods was driven by the novelty of these methods over scRNA-seq-based ones, the performance improvement that they seemed to offer in several benchmarks, and the aim of developing a pipeline integrating the most complete data available for contextualization (Badia-i-Mompel et al. 2024).

      • GRN impact over the Human Immune Dictionary

      While it does not relate directly to this showcase, we will also add a new dataset analysis, detailed in the response to comment R1.12. In the Human Cytokine Dictionary showcase, we propose exploring the effect of choosing different GRNs, built either from external multi-omics data or from the control scRNA-seq data of the dataset itself. We hope it can partially help users decide, in general, whether to use external datasets of higher quality or sample-specific datasets.

      Finally, we propose to add a section to the tool's documentation showing how to use ReCoN with only scRNA-seq for the GRN inference, and to report the performance of different GRNs on the Human Cytokine Dictionary dataset directly in the paper.

      R1.5. The code is very clear, we were able to install and run it and it is quite well-documented. However, a few more details should be given in the text regarding how the evaluation of the performance is carried out.

      For example: If I understand correctly, when predicting the impact of cytokine perturbations the ReCoN predictions of genes impacted are compared to differentially expressed genes identified through traditional DEG analysis. What is compared is the ranking of these genes from ReCoN with the ranking provided by DEseq2. There is no description of how this comparison of ranking gives rise to AUROC values. Also, is it just the ranking that is predicted or can they also estimate how well they can predict the effect size?

      Response:

      We are thankful to the reviewers for pointing out the unclear technical details. DEG results were binarised to obtain the list of differentially expressed genes, using the thresholds indicated in section 4.4.4. We considered a gene as perturbed in a given cytokine treatment if the comparison of control and treated cells had a t-test p-value below 0.1 and a log-fold change above 1.

      Regarding the second, more general point of the reviewers: ReCoN scores should be considered as providing a ranking of the possible regulations, but cannot be considered proportional to the effect size. As they represent a likelihood rather than an amplitude, binarisation is the most appropriate transformation for the validation.

      Moreover, as the scores can be seen as the probability of ending the exploration on each node, they always sum to one. This also prevents interpreting the scores as the amplitude of change. As an illustration: if a receptor regulates three genes identically, they would (hopefully) all have a score of (1 - R)/3, R being the restart probability in ReCoN, whether their expression doubles or is multiplied by 10.

      While it can legitimately be seen as a downside, we believe it is similar in practice to most GRN inference methods, where trying to predict the true amplitude of gene perturbations usually results in very low performance (Badia-i-Mompel et al. 2024).

      We propose the following changes related to this comment.

      • We would modify section 4.4.4 of the methods with the following paragraph, to make explicit that it consists of a binary selection: "For each cytokine-cell type pair, differentially expressed genes were binarised: genes passing the significance thresholds (FDR P-val < 0.1 and log-fold change > 1) were labelled as positives, and all remaining genes as negatives. ReCoN scores were then used to rank all genes, and AUROC values were computed from this ranking against the binary labels."

      • We will also include a "ReCoN scores interpretation" section on the documentation website, as clarifications on score interpretation will be particularly useful for users.
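The evaluation described above can be sketched as follows: binarise the differential expression results with the stated thresholds, then compute the AUROC of the score ranking against those labels. This is a minimal illustration, not the code used in the paper. The gene names, p-values, log-fold changes, and scores are invented, and the pairwise AUROC formula (fraction of positive/negative pairs ranked correctly, ties counted as one half) is a standard definition rather than a confirmed implementation detail.

```python
def binarise(p_value, log_fold_change, p_max=0.1, lfc_min=1.0):
    """Label a gene as positive if it passes both thresholds."""
    return p_value < p_max and log_fold_change > lfc_min


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative gene")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct pair
    return wins / (len(positives) * len(negatives))


# Hypothetical genes: (name, p-value, log-fold change, ReCoN-like score).
toy_genes = [
    ("geneA", 0.01, 2.0, 0.30),
    ("geneB", 0.05, 1.5, 0.10),
    ("geneC", 0.50, 0.2, 0.20),
    ("geneD", 0.02, 0.5, 0.05),
    ("geneE", 0.80, 1.2, 0.35),
]
toy_labels = [binarise(p, lfc) for _, p, lfc, _ in toy_genes]
toy_scores = [score for _, _, _, score in toy_genes]
toy_auroc = auroc(toy_scores, toy_labels)
```

Only the ordering of the scores matters here, which is consistent with treating them as a ranking rather than as effect sizes.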

        R1.6. When describing the use cases, I think a bit more detail would help.

      For example: 'To identify the cell-type-specific genes associated with HF, we used the MOFAcell scores of the multicellular factor 1 (MCP1) reported in ReHeat2 (ref. 36)'. I suppose the explanation is given with the dataset, but for the sake of clarity it would be good to expand this sentence to give at least an idea of the approach.

      Response:

      We completely agree that more explanation should be provided, so that the reader does not have to switch between articles to understand the concepts behind this showcase. As suggested by the reviewer, we propose a general description of the approach with the following short paragraph, and to remove the term "loading":

      "In the ReHeat2 study, the first multicellular factor (MCP1) was associated with heart failure. We used the gene loadings of MCP1 as a proxy for the cell-type-specific transcriptomic changes associated with heart failure, ranking genes by their absolute loading values."

      We also propose to complete the method section: "MOFAcell is a multicellular factor analysis method that decomposes multi-sample single-cell data into latent factors representing coordinated gene expression patterns across cell types. Each factor is characterised by cell-type-specific gene scores, reflecting their individual contribution to the coordinated program. In this showcase, we use the first multicellular program (MCP1), as it was associated with heart failure"

      R1.7. Regarding the calculation of the R matrix from the NicheNet matrices L and G, I gather that the R matrix is calculated once and is thus fully data-independent and available just like the L and G matrices from NicheNet. This was not very clear in the tutorials.

      Response:

      We are very thankful for the reviewers' involvement in testing the tool itself and its documentation. First, we propose a new website page explaining the pre-computed resources available for receptor-gene links, and we added a descriptive paragraph in the tutorials themselves.

      Second, we noticed a typo in the equation, which should actually read L = R * G with the current definition. We corrected it for the next version and specified that R is fully data-independent and solely inferred from prior knowledge.

      R1.8. Also, this might just be a typo in the tutorial: "The default α = 0.8 gives more weight to direct effects, which has been empirically validated. You can adjust this based on your biological question." I believe the manuscript says alpha>0.5 refers to indirect effects dominating.

      Response:

      We corrected the wording in the tutorials. Indeed, a high alpha represents a stronger indirect effect. A similar typo was present in the first equation of the paper, which we are correcting too.

      R1.9. Same for the pre-processing of the spatial data for the third use case, a little more details on how this was done would help the users and readers.

      Response:

      We propose adding a specific section on spatial pre-processing and analysis to the methods.

      We are also adding a tutorial on spatial data. Since spatial data processing is computationally intensive without GPUs, we will also provide the processed data, so that anyone can test this tutorial.

      R1.10. I don't see issues with the statistical power of the analysis.

      Rather, I think the authors should provide some examination of the parameter space for their model. Whereas an analysis of the impact of the Alpha parameter is provided, I believe there are several more parameters that have a crucial impact and choices for their values should be discussed.

      For example 'In the GRN reconstruction only the links with a score above 1.5e-7 were retained in ReCoN's gene regulatory layer.' How was this chosen?

      We have identified the following parameters that are somehow justified but could be explored to have a better feel for how they impact the results

      Restart probability: How often the walker goes back to the starting seed/molecule

      Layer transition probability: How often the walker stays in the same layer - different cell? - different layers? Gamma

      Node transition within a layer: How often one jumps to a different layer

      Response:

      This is a very valid point raised by the reviewer about parameter exploration.

      We focused on exploring the alpha (direct/indirect effect) parameter, as its value was the main uncertainty when designing the model.

      We would like to address this comment by adding new explorations of the restart probability and of the transition probabilities between layers. The probability of transitioning between specific nodes inside a layer itself depends directly on 1) the restart probability, 2) the layer transition probabilities, and 3) the edge weights, which are determined before, and independently of, ReCoN's exploration.

      The Heart Atlas showcase allows us to evaluate each set of parameters in around 10 min, instead of 10 h for the Immune Dictionary. We thus propose to evaluate the restart probability and the layer transition probabilities on the data of this showcase.

      • We would explore restart probabilities of 0.1 * N, with N between 1 and 9.

      • For transition probabilities, we propose varying the importance of the GRN, receptor, and cell communication layers with the following configurations: probability of staying in the CCC layer (not jumping to the receptor layer) among (0.1, 0.3, 0.5, 0.7, 0.9); probability of staying in the receptor layer (not jumping to the GRN) among (0.25, 0.5, 0.75); probability of staying in the GRN layer (not jumping to the CCC) among (0.25, 0.5, 0.75). This would result in 9 intracellular variations combined with 5 intercellular variations.
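The parameter grid described in the two points above can be enumerated with a short sketch. The variable and key names are hypothetical and do not correspond to ReCoN's actual parameter names. Whether the restart probabilities are crossed with the layer configurations or explored separately is not stated, so the two grids are kept separate here.

```python
from itertools import product

# Restart probabilities 0.1 * N for N = 1..9 (rounded to avoid float noise).
restart_probabilities = [round(0.1 * n, 1) for n in range(1, 10)]

# Probabilities of staying in each layer, as listed in the text.
stay_in_ccc = (0.1, 0.3, 0.5, 0.7, 0.9)   # intercellular variations
stay_in_receptor = (0.25, 0.5, 0.75)      # intracellular variations
stay_in_grn = (0.25, 0.5, 0.75)           # intracellular variations

# 3 x 3 = 9 intracellular variations.
intracellular = list(product(stay_in_receptor, stay_in_grn))

# Each intracellular variation combined with each of the 5 intercellular ones.
layer_configs = [
    {"stay_ccc": ccc, "stay_receptor": receptor, "stay_grn": grn}
    for ccc, (receptor, grn) in product(stay_in_ccc, intracellular)
]
```

With these values there are 9 restart probabilities and 45 layer-transition configurations, which at around 10 min per run on the Heart Atlas showcase remains tractable.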

      We envision evaluating these configurations by measuring the correlation between their results and the time to convergence, as the latter could increase drastically when the restart probability decreases. If correlations below 0.9 are observed between some results, we will compare their absolute performance.
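The comparison between configurations can be sketched as a pairwise correlation of score vectors, flagging pairs that fall below the 0.9 threshold. The type of correlation is not specified in the text; Spearman rank correlation is assumed here because the scores are meant to be read as rankings. The configuration names and score values are invented for illustration.

```python
from itertools import combinations


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical score vectors over the same four genes for three configurations.
results = {
    "restart=0.7": [0.40, 0.30, 0.20, 0.10],
    "restart=0.5": [0.35, 0.30, 0.25, 0.10],
    "restart=0.1": [0.10, 0.30, 0.25, 0.35],
}
THRESHOLD = 0.9
divergent_pairs = [
    (a, b)
    for a, b in combinations(results, 2)
    if spearman(results[a], results[b]) < THRESHOLD
]
```

Only the pairs collected in `divergent_pairs` would then need a comparison of absolute performance.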

      We would include the figures related to these explorations in the supplementary data, and highlight the main findings in the methods section dedicated to the random walk with restart. Finally, we would briefly describe the parameter exploration design in the first section of the results, for curious readers who would like to verify the parameter choices before reading the showcases.

      R1.11. Weighting parameters: How much weight for direct or indirect effect to account for the combined effect - alpha - this is the only one that is explicitly explored.

      Response:

      We are very thankful for this comment, and we decided to modify our tutorial guidelines to make this choice more intuitive and general.

      Indeed, 1.5e-7 would hardly make sense for most methods, which would not produce such low scores. We now propose to select the top 2 million connections of the GRN, in order to keep the complete network, or a large portion of it, if methods other than HuMMuS are applied.

      In our case, 1.5e-7 was empirically determined from the distribution of HuMMuS scores so as to keep the top 2 million connections. HuMMuS networks are generally almost fully connected, which is unusual among classical GRN inference methods, and keeping them entirely would make exploration time much longer.
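Selecting a fixed number of top connections instead of a method-specific score cut-off can be sketched as below. The function name, the edge tuple layout, and the toy scores are assumptions for illustration, and the toy example keeps 2 edges rather than 2 million.

```python
import heapq


def keep_top_edges(edges, k):
    """Return the k highest-scoring (tf, target, score) edges, best first."""
    return heapq.nlargest(k, edges, key=lambda edge: edge[2])


# Hypothetical GRN edges with HuMMuS-like small scores.
toy_grn = [
    ("TF1", "G1", 3.0e-7),
    ("TF1", "G2", 1.0e-7),
    ("TF2", "G1", 2.0e-7),
    ("TF2", "G3", 5.0e-8),
]
top_edges = keep_top_edges(toy_grn, k=2)

# The score threshold implied by the selection depends on the method's score
# distribution, which is why a fixed cut-off does not transfer across methods.
implied_threshold = min(score for _, _, score in top_edges)
```

If a network has fewer than k edges, `heapq.nlargest` simply returns all of them, so the complete network is kept in that case.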

      R1.12. Finally, this might be considered OPTIONAL but would greatly improve the work in our opinion:

      The method crucially depends on the networks that are used in the different layers and to connect layers and cell types. As we know, biological data is noisy and incomplete (FP and FN) at each level and in each datatype. It would be really useful to estimate what is the robustness of the results to this noise. Particularly, from personal experience, we think the GRNs reconstructed from data are often almost fully connected and it is exceedingly difficult to validate them in specific contexts. This means that some 'errors' are likely to be present.

      Since several methods exist for inferring GRNs one could simply compare the results using different methods for this part of the network.

      A related point involves the characteristics of the RWR algorithm, that will be quite impacted by the presence of hubs in these networks (either in single layers or across several) that is likely to impact the exploration. If proteins that are hub are effectively important, that is not a problem, but in some layers, for example, the receptor-receptor layer that presumably will contain PPIs, there might be biases in hubs being just better studied proteins, and these hubs might have an 'unjustified' weight in the walks.

      One potential approach to assess the robustness of the method to these issues could be an empirical one that just randomly perturbs the networks in ReCoN to see to what extent similar predictions are achieved.

      Response:

      We are thankful for this relevant comment on GRN and prediction stability, and would like to take it as an opportunity to support the hypothesis that different GRN methods can be used in ReCoN.

      When developing our previous HMLN-based tool, HuMMuS (Trimbour et al. 2024 - Supp Figure 6), we observed that its multilayer structure provided more robust results than individual layers. We would like to reproduce such an analysis, verifying that ReCoN results have less variability than the GRN layers individually.

      We propose to integrate a new showcase on the Human Cytokine Dictionary (Oesinghaus et al. 2025), trying to predict cytokine downstream effects similarly to the Mouse Immune Dictionary showcase.

      This showcase would be useful to confirm the contribution of the indirect effect and to test the impact of different GRNs on the results.

      We would generate different GRNs with several other GRN inference methods: SCENIC+, CellOracle, and GRNBoost2, the latter using only the scRNA-seq of the control samples in the Human Cytokine Dictionary.

      GRN methods generally produce outputs with very low overlap (Badia-i-Mompel et al. 2024).

      If we observe high correlations between the ReCoN predictions obtained with the different GRNs, this would already provide a validation of ReCoN's robustness to GRN noise.

      If lower correlations between ReCoN's predictions are obtained, we will add a specific permutation experiment on the HuMMuS GRN, creating different levels of artificial noise and assessing more precisely the robustness of ReCoN to GRN stochasticity.

      Regarding PPI hubs, our applications did not use receptor PPIs, and the showcases are therefore not affected by bias at this level. This bias could however be present in the receptor-gene links, as we derive them from the ligand-gene connections of NicheNet, which was itself partially based on prior knowledge. It is thus possible that some receptors are reached more often because of this bias rather than because of a stronger effect. It seems, however, hard to control in this context, as ReCoN currently relies on this prior knowledge. For now, we hope that the combination of a personalised, literature-agnostic GRN with literature-based receptor-gene links can provide an interesting trade-off. In future developments, we could imagine a receptor-gene network based solely on perturbations, but it would also require controlling the bias of ligand-receptor binding pairs, which limits even the use of ligand-based experiments.

      We propose adding a short point in the discussion about hub effects in RWR-based methods.

      R1.13. Please add page numbers.

      Response:

      We will add the page numbers.

      R1.14. Figures are nice and clear.

      Some specific minor points are listed here below.

      Define hMLN on first appearance fig1 caption (no page numbers..

      2nd appearance heterogeneous multilayer structure (HMLN) ...

      Response:

      We updated the figure legend to include the definition of the acronym, as it appears before the first occurrence in the text. (Or define at both positions?)

      R1.15. Bi_j not so clear to what it refers when first mentioned

      Response:

      Bi_j represents a weight that can be attributed to favour some cell-to-cell transitions. It is usually not necessary to use it.

      It is notably of interest to model 1) known spatial patterns in situ and 2) hypotheses/designs where cell types favour some connections.

      E.g.: for modelling the skin, a user might notably want to increase connections between epidermal and dermal cells, and between dermal and hypodermal cells.

      We propose a new explanation of Bi_j to both explain its meaning in the modelling and illustrate situations for using it: "The coefficient B_{i,j} modulates the influence of cell type i on cell type j in the indirect effect computation. By default, all B_{i,j} are set to one, weighting each cell type's contribution equally per cell. However, it can be adjusted to encode additional biological knowledge, such as spatial proximity between cell types or known cooperation patterns. For instance, when modelling the skin, a user might increase B_{i,j} between epidermal and dermal cells, and between dermal and hypodermal cells, to reflect their spatial organisation."

      R1.16. personalized interaction specificity. - maybe better word than personalised (contextualised?)

      Response:

      We agree that "contextualised" conveys the meaning behind this model more explicitly. "Personalised" might notably lead readers to expect patient-specific data, which is not the case here.

      We propose to rename the models as follows: Receptor-matrix, ReCoN-no-CCC, ReCoN-no-context, ReCoN-complete.

      R1.17. ReCoN-genetic and ReCoN, ( generic?)

      Response:


      We will correct this typo.

      R1.18. responses. It is expected to observe common behaviors in-between cell-type, that the GRN

      and the generic CCC network already contribute captures.

      • not very clear

      Response:

      We aimed here to provide an explanation for the already good performance of "ReCoN-no-context" (or its updated name, following comment R1.16), which could be surprising as no cell-type-specific information is used. The proposed explanation is that several properties shared by all immune cell types, such as similar metabolic pathways, are well predicted despite the cell types' specific roles. If we adopt a quantitative view of their transcriptome, as in this showcase, it can be expected that cell type responses are relatively well predicted through the common properties alone.

      As this is a very relevant comment, and as several comments we received before submission also related to this result, we would like to keep an explanatory sentence.

      R1.19. Figure 2b the icon of cells with double arrows might suggest phenotype shift when instead this is just communication

      Response:

      (left side) We are very thankful to the reviewer for paying attention to the details of the paper and fully agree with this analysis. We propose to represent ligand emission instead of arrows, reusing the convention of Figure 1.

      R1.20. eTACs explain acronym and what they are

      Response:

      We updated the first occurrence of eTACs to "extrathymic Aire-expressing cells (eTACs)".

      R1.21. Due to very few genes being differentially

      expressed, only cDC1 was conserved and evaluated for IL22,

      Not so clear

      Response:

      As we are commenting on the IL22 stimulation results, we reorganised the sentence to make it less convoluted: "For IL22 stimulation, only cDC1 presented enough differentially expressed genes."

      R1.22. In this showcase (not very clear, use case?)

      Response:

      We perceive "use case" as describing a type of use for the method, while a showcase is a specific example of a use case. We thus find "showcase" more appropriate here. We will however go over all uses of the word, to make sure it is only used for the precise examples we provide, and not to describe "use cases".

      R1.23. different fibroblast specializations - maybe phenotypes?

      Response:

      • It is a very good suggestion, as specialisation would involve functional aspects (that we can't really be sure of) and a chronological evolution.
      • Phenotype generally includes numerous properties, such as morphology, that we cannot validate here. We think the use of phenotype might be stronger than specialisation here. To simplify, phenotype can work; to be more precise: transcriptomic specialisation? I am honestly not sure of the best change here.

      R1.24. Figure 4b

      b) Schematic view of the deconvolution process and cell type-specific count inference from the spatial niches.

      Not so clear what the heatmap shows, rows and columns

      Spots heatmap : label niche on rectangles in cols

      And each col is a spot

      Rows are cell types or cells?

      In the cell types x spot

      Response:

      This figure can indeed benefit strongly from legend modifications. In both matrices, rows represent the genes, while columns represent the spots / the individual cells deconvoluted per spot.

      • We would annotate the niche legend (here the colour surroundings) with a symbolic drawing instead of writing it on the matrix

      • Legend "genes" on the first matrix

      • Write "deconvolution" ON the figure directly

      R1.25. Cell2location. Add reference, maybe explain basic functionality?

      Response:

      Cell2location was not referenced in the results section and was only referenced in section 4.6.2 of the methods, as the 72nd citation. We corrected this oversight and propose 1) a brief explanation of deconvolution right before, and 2) a brief explanation of Cell2location's particularity of inferring individual cell profiles, which is not common in spatial deconvolution.

      R1.26. reconstructing different patients, tissues, and microenvironments to predict

      context-specific molecular treatments.

      Unclear

      fibrosis in different - at

      molecular levels

      Response:

      We will modify this section title following the reviewer's comment and the suggested reformulations.

      R1.27. Figure 5d myeloid and endothelial colour code inversed from 5 BC

      Response:

      The legends are individually correct, but there is no reason not to make them coherent across panels. We will update the legend of panel 5d.

      R1.28. 5d indicate important pathways in organe should not change the colour of the nodes (purple=common, blue or green specific). Use border colour maybe?

      Response:

      We had forgotten to specify the colour code of this panel, in which orange highlighted the gene sets related to molecular pathways rather than functional annotations. As the names already make the pathways explicit, we now think that the orange background is redundant and may create confusion. We would thus like to update the Wnt and TNFA pathway backgrounds to ___ (more enriched in cell type), and purple (significantly enriched in all cell types).

      R1.29. 5e is not a venn diagram

      1. e) Venn diagram showing the overlap between transcription factors (TFs) predicted by ReCoN (green) and those previously

      implicated in fibrosis (orange) or cardiac diseases (violet). Only the top 10 TFs were annotated from literature

      sources; full sizes of fibrosis- and cardiac disease-related receptor sets can therefore not be represented.

      1. f) also not a venn diagram e/f now in supp

      the "NABA ECM collagens" gene set. Nodes are

      grouped by molecular type (e.g., transcription factors, receptors, ligands), and links represent the weighted,

      direct regulatory interactions present in the ReCoN-constructed

      Response:

      As the diagrams do not indicate the total number of receptors/TFs reported in the literature, they cannot be Venn diagrams. We updated the legend by replacing "Venn diagram showing the overlap between [...]" with "Overlap between [...]".

      As we reorganised the paper, these plots are now only in the supplementary material; we removed the duplicate occurrence in the Figure 5 legend.

      R1.30. Why Sankey plot? Normally sankey plot represents flow (of regions changing from 1 state to another) but here this is just a weighted network?

      No communication from firbos back to other cell types? No communication between ventricular/myeloid/lymphoid?

      Response:

      We are thankful for this useful feedback, which helped us realise that interesting details were missing from the paragraph.

      This plot is only intended for visualising a regulatory cascade, so users have to decide on one receiving cell, a set of target genes, and the sending cells. It includes a specific subset of regulatory cells, and only their interactions with the target cells. Here, we illustrated the regulation of some ECM genes produced by fibroblasts.

      A Sankey diagram might indeed not be the clearest representation, as we are modelling neither the whole diffusion nor a flow per se. We propose to replace it with another representation that we hope will be more intuitive for biologists (and more aesthetic), such as illustrated below:

      R1.31. as a extension to - an

      underrepresented in the current. - current framework?

      Response:

      "Framework" works perfectly to fill the missing word in the sentence.

      R1.32. However, it can't represent more - cannot

      Borrowing representation from hypergraphs, which introduces

      The network exploration implementation of ReCoN also present some limitations.

      limitations. While random walks

      with restarts offer a stable and fast exploration workflow for multilayer networks, it

      currently only considers positive weights to predict regulation strengths. It involves that the

      nature of the regulation, as activation or inhibition, has to be identified a posteriori.

      • check concordance/grammar

      Response:

      We will correct the grammatical errors raised by the reviewer.

      R1.33. Only the nodes that are included in one of the layers are present in the

      final results, ignoring the ones present only in bipartites.

      Unclear

      Response:

      Layers and bipartites are treated differently by the algorithm, and a node must be present in a layer to appear in the results. In practice, this simply means that receptors or ligands not paired in the CCC, or genes not regulated by any TF in the GRN, will not appear.

      We propose clarifying with this second explanation:

      "In practice, a node must have at least one connection in its layer to appear in the final results. It thus means that receptors or ligands absent from the CCC network and genes not targeted by any transcription factor in the GRN will not receive a score from the random walk exploration."

      R1.34. a scATAC - an

      Barsi et al is published https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013188

      Response:

      We updated the reference with the published article.

      R1.35. effects, allowing for modulating in a second

      time their contribution. - word order

      Response:

      We propose the formulation "allowing their contribution to be modulated in a second step".

      R1.36. others. However, it is possible to adjust the Beta coefficient to

      represent it based on the available information for each dataset.

      Represent- adjust?

      Response:

      We agree with the reviewer's suggestion to use "adjust".

      R1.37. We use the latter to compare the different models. - what is the latter?

      Response:

      The latter referred to the 25 cytokines of the Immune Dictionary which had at least one connection in the inferred cell communication network with CellPhoneDB. We propose clarifying this formulation to "..."

      R1.38. It resulted in the scRNA-seq in 1,789 cells with 13,167

      genes, and for the scATAC-seq in 3,759 cells with 254,545 regions.

      Check english

      Response:

      We propose replacing this sentence with the following: "It resulted in a scRNA-seq dataset of 1,789 cells with 13,167 genes, and a scATAC-seq dataset of 3,759 cells with 254,545 regions."

      R1.39. GRETA pipeline.- reference

      Response:

      We added the citation to the paper of the GRETA pipeline in the section 4.5 of the methods: "Badia-i-Mompel et al., 2026"

      R1.40. We kept all the cells whose annotations through unsupervised clustering,

      followed by marker gene annotations, through scANVI were coherent.

      Word order

      Response:

      We propose the following reformulation to correct the sentence: "We kept all cells whose annotations were coherent between unsupervised clustering with marker-gene labelling and scANVI-based label transfer"

      R1.41. In parallel, pairs of ligands and receptors with both associated with scores above

      an absolute gene loading of 0.1 were considered potential driver interactions in HF.

      Unclear

      Response:

      In the MOFAcell results, factors correspond to linear combinations of genes that explain a large part of the data variance; the contribution of each gene to a factor is called its loading. We chose the factor that best separated patients with and without fibrosis, and kept its top genes, i.e. all those with an absolute loading above 0.1.

      We propose reformulating this sentence, as the word "loading" could overcomplicate it for most readers: "To identify the ligands and receptors driving heart failure, we considered all of those with an absolute contribution to the multicellular factor above 0.1."
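The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_driver_pairs`, the gene symbols, and the loading values are hypothetical toy data, not taken from the manuscript or from the MOFAcell output.

```python
def select_driver_pairs(pairs, loadings, threshold=0.1):
    """Keep ligand-receptor pairs whose two genes both have an
    absolute factor loading above `threshold`.

    `pairs` is a list of (ligand, receptor) tuples and `loadings`
    maps a gene name to its loading on the selected factor.
    Genes missing from `loadings` are treated as having loading 0.
    """
    return [
        (ligand, receptor)
        for ligand, receptor in pairs
        if abs(loadings.get(ligand, 0.0)) > threshold
        and abs(loadings.get(receptor, 0.0)) > threshold
    ]


# Hypothetical toy values, for illustration only.
pairs = [("TGFB1", "TGFBR2"), ("IL6", "IL6R"), ("CCL2", "CCR2")]
loadings = {
    "TGFB1": 0.35, "TGFBR2": -0.22,  # both above 0.1 in absolute value
    "IL6": 0.05, "IL6R": 0.40,       # ligand below threshold: pair dropped
    "CCL2": -0.15, "CCR2": 0.12,     # both above 0.1 in absolute value
}
drivers = select_driver_pairs(pairs, loadings)
```

Using the absolute value keeps genes that contribute negatively to the factor, which matches the "absolute contribution" wording of the proposed sentence.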

      R1.42. gseapy Python - reference?

      Response:

      The gseapy package was indeed not cited. We now include the citation: "Zhuoqing Fang, Xinyuan Liu, Gary Peltz, GSEApy: a comprehensive package for performing gene set enrichment analysis in Python, Bioinformatics, 2022, btac757, https://doi.org/10.1093/bioinformatics/btac757"

      R1.43. and to calculate average for each spatial context the average cell type expression.

      Unclear

      Response:

      We propose to reformulate the sentence as: "These cell-type-spot profiles were later used, for each spatial context, to create a specific cell-cell communication network and to calculate average cell type expression."

      R1.44. We only used the loadings of all cell

      types but the fibroblasts to consider the effect of the sole environment.

      Unclear

      Response:

      We propose to use "APART from the fibroblasts" to clarify the sentence, and "to ONLY consider the environment effect".

      R1.45. We realised a downstream - performed

      Response:

      We fully agree with the reviewer's suggestion.

      R1.46. The profiles inferred by ReCoN were first very correlated in all three contexts. - unclear

      Response:

      The sentence lacked clarity and deserved rephrasing. We propose: "When looking at the absolute scores of ReCoN in all three contexts, results were initially highly correlated. To focus on context-specific differences, enrichments were performed using the log-ratio of each context profile over the mean of the other profiles."
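The log-ratio described in the proposed sentence can be illustrated with a short NumPy sketch. It is a minimal example under stated assumptions: the function name `context_log_ratios`, the natural logarithm, the small pseudocount `eps`, and the toy score values are all illustrative choices, not details taken from the manuscript.

```python
import numpy as np


def context_log_ratios(profiles, eps=1e-12):
    """Log-ratio of each context profile over the mean of the others.

    `profiles` has shape (n_contexts, n_genes) and holds positive scores.
    `eps` is a small pseudocount (an assumption) that avoids log(0).
    """
    profiles = np.asarray(profiles, dtype=float)
    n_contexts = profiles.shape[0]
    total = profiles.sum(axis=0, keepdims=True)
    # Mean over all contexts except the current one, for every row at once.
    mean_others = (total - profiles) / (n_contexts - 1)
    return np.log((profiles + eps) / (mean_others + eps))


# Toy scores for three contexts (rows) and three genes (columns).
profiles = np.array([
    [0.2, 0.1, 0.4],
    [0.2, 0.1, 0.1],
    [0.2, 0.1, 0.1],
])
ratios = context_log_ratios(profiles)
```

Genes with identical scores in every context get a log-ratio close to zero, so only the context-specific differences (here the third gene) stand out in a downstream enrichment.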

      R1.47. Potentially the closest results are models that can predict the effect of perturbations on cell line cultures. Several approaches in the literature employ either transformers or optimal transport to predict the effect of perturbations in single cell datasets. One of the main issues is an underlying necessary assumption that the perturbation effect will be larger than the heterogeneity (in cell lines for example), which becomes increasingly difficult when considering in-vivo experiments. ReCoN obviously goes beyond this by considering explicitly the presence of different cell types but distinctions of cell types are sometimes quite arbitrary and potentially application of ReCoN to some of the in-vitro culture datasets, even on cell lines, could be a way to test its performance and benchmark it against other methods.

      The main bottleneck in the application of this framework to 'personalisation' of therapies, mentioned even in the abstract as a potential future goal for such an approach, will be the lack of data. This approach requires single cell level descriptions of the system at hand, plus additional datasets to build the model structure. To a certain extent, public data of related tissues/contexts can be used, but it will be necessary to test the dependence of performance on coherence of the input data to develop sufficient trust to use it for new predictions, especially in a medical field.

      We thank the reviewer for these reflections, which raise several distinct points that we would like to address in the discussion.

      Cell line perturbation is indeed a close and active field of research, with notably numerous models based on optimal transport and VAE, and relevant benchmarks (Radig et al. 2025). In our view, ReCoN takes a complementary angle, by both focusing on the environment effect and using a network-driven approach that provides explainability.

      These perturbation methods are typically benchmarked on single cell line screenings, where cell-cell communication is highly limited or absent by design, while ReCoN is specifically designed to exploit interactions between multiple cell types. Furthermore, ReCoN relies on a network that aims to provide only explainable hypotheses and molecular cascades. The two approaches also typically learn from different data: ReCoN only uses single-cell data, whereas the best perturbation prediction methods learn from a subset of perturbation experiments.

      Exploring the performance of ReCoN in perturbation prediction would require designing extensive comparisons with the state of the art that take all these nuances into account, which we believe is outside the scope of the present study. It nevertheless raises a fundamental question for the development of future methods, namely the need to assess whether the perturbation effect is actually larger than the heterogeneity, and we propose to extend the discussion to cover these aspects.

      Secondly, this comment raised a point about cell type definition, which can be a hard task and sometimes gives an inaccurate description of cell heterogeneity. We note that even if ReCoN relies on grouping cells in some way, it does not impose any particular cell type ontology: users can define their own cell types or cell states, since the CCC layer is typically inferred from single-cell RNA-seq alone and does not require canonical cell-type annotations. This flexibility allows ReCoN to accommodate finer or coarser groupings depending on the biological question. We do not propose a framework that accounts for diversity in ways other than homogeneous clusters of cells, but we think that this constitutes an interesting future development of ReCoN or of new multicellular modelling methods.

      Lastly, we fully agree that an important limitation for ReCoN's use is data availability and generation, which was also a limitation when identifying datasets for the manuscript's applications. We hope that the development of open source atlases will make it easier to leverage tissue-specific prior knowledge and will increase the potential applications, prediction performance, and trust in ReCoN results.

      In conclusion, we propose to state two new points in the discussion:

      1) extending multicellular perturbations (including gene knock-out) to conditions where cell types cannot be defined prior to the analysis, or are better considered along a spectrum, will be an interesting future direction.

      2) there is now a need for broad benchmarks covering both multicellular and single-cell line tasks to evaluate the trade-off between accounting for cell heterogeneity and overall prediction accuracy.

      Radig, J., Droit, R., Doncevic, D. et al. scArchon: a scalable benchmarking framework for assessing single-cell perturbation models. Genome Biol 27, 162 (2026). https://doi.org/10.1186/s13059-026-04104-z

      R1.48. The authors could comment on how their method compares to others that do not require single cell level information. Despite clear differences, it might be important to show the advantage of using this more complex approach that requires data that is less available. Given the ease with which bulk profiles can be constructed from single cell data, it might be possible to compare the approaches directly. For example, see

      1. Wang, S. Patkar, J.S. Lee, E.M. Gertz, W. Robinson, F. Schischlik, D.R. Crawford, A.A. Schäffer, E. Ruppin Deconvolving Clinically Relevant Cellular Immune Cross-talk from Bulk Gene Expression Using CODEFACS and LIRICS Stratifies Patients with Melanoma to Anti-PD-1 Therapy

      Mike van Santvoort, Óscar Lapuente-Santana, Maria Zopoglou, Constantin Zackl, Francesca Finotello, Pim van der Hoorn, Federica Eduati,

      Mathematically mapping the network of cells in the tumor microenvironment,

      Cell Reports Methods 2025

      We propose to extend the discussion with additional methods, notably from before the development of single-cell technologies. We did not plan to include these two specific methods because, to our knowledge, they do not provide output directly comparable to ReCoN's purpose.

      • The first work proposes to deconvolute the bulk RNA-seq profile into cell-type-specific expression profiles. It is an interesting reference, as it could allow applying ReCoN even to bulk RNA-seq, but it does not provide comparable results, as its final task corresponds to inferring the ligand-receptor interactions, without providing downstream molecular mechanisms.
      • The second method, RaCInG, builds cell-to-cell networks for individual patients. It does not explore the molecular interactions inside the cells themselves. Its output could be used to build personalised ReCoN models, but the method seems closer to a precursor of recent CCC methods than to ReCoN itself.

      Reviewer #2

      R2.1. It is not clear how well it performs in independent validations. Authors showed that it can predict the effect of cytokine perturbations in the immune dictionary by selecting an optimal alpha. Authors should validate that using the same alpha value of 0.8, it is possible to accurately predict the effect of cytokine perturbations in independent datasets. This is particularly concerning for cytokine-cell type pairs where the optimal alpha is not known. Therefore, the potential utility of Recon to estimate the effect of multicellular perturbations is not well established.

      Response:

      The reviewer raised a very relevant point by noting that the alpha coefficient might vary between datasets.

      The value of 0.8 was chosen because it produced the best results in two independent datasets, the immune dictionary and the heart failure showcases. We could therefore already observe some cross-dataset reproducibility. To complete these findings, we will also verify that 0.8 provides the best performance in a new showcase: the Human Cytokine Dictionary (Oesinghaus et al. 2025).

      We tried to nuance this choice by pointing to the need to confirm the importance of the indirect effect. We propose to add a sentence explicitly commenting on the impact of these new findings on the alpha coefficient and its robustness.

      It is also accurate to say that ReCoN cannot currently estimate the alpha parameter autonomously. We proposed this default value as it worked on both datasets, but it is possible that no single default value fits all datasets. In the current implementation of ReCoN, users are completely free to modify this default value depending on their needs.

      If no single default proved suitable, one option would be to fit alpha using similar prior perturbations, when such data are available. For example, after perturbing one or a few cytokines, a user could choose the value that best explains the gene expression responses.
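The fitting idea above could look like the following grid search. This is a hedged sketch, not ReCoN's implementation: it assumes, purely for illustration, that the combined score is the linear mix `(1 - alpha) * direct + alpha * indirect` and that agreement with the measured response is scored by Pearson correlation. The function name `fit_alpha` and the simulated data are hypothetical.

```python
import numpy as np


def fit_alpha(direct, indirect, observed, grid=None):
    """Return the alpha on `grid` whose combined score correlates best
    with the observed response, together with that correlation.

    Assumption: combined = (1 - alpha) * direct + alpha * indirect.
    """
    direct = np.asarray(direct, dtype=float)
    indirect = np.asarray(indirect, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if grid is None:
        grid = np.linspace(0.0, 1.0, 11)  # 0.0, 0.1, ..., 1.0
    best_alpha, best_corr = None, -np.inf
    for alpha in grid:
        combined = (1.0 - alpha) * direct + alpha * indirect
        corr = np.corrcoef(combined, observed)[0, 1]
        if corr > best_corr:  # NaN correlations are never selected
            best_alpha, best_corr = float(alpha), float(corr)
    return best_alpha, best_corr


# Simulated gene-level scores: the "observed" response is built with
# alpha = 0.8 so that the search has a known answer to recover.
rng = np.random.default_rng(0)
direct = rng.random(200)
indirect = rng.random(200)
observed = 0.2 * direct + 0.8 * indirect
alpha_hat, corr_hat = fit_alpha(direct, indirect, observed)
```

In practice the observed vector would come from a prior perturbation experiment in a similar system, and a rank-based correlation could be substituted if the score scales are not comparable.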

      R2.2. Authors claimed that optimal alpha value of 0.8 implies the dominance of indirect effect. But in contrast to this claim, the performance across cytokine-celltype pair only increased from 0.72 to 0.76, which seem to imply that indirect effects do not add much.

      Response:

      The range of performance improvement is an interesting point to discuss, as including the indirect effect roughly doubles the computation time and therefore implies a trade-off between resource usage and this improvement.

      While the average improvement from combining the direct and indirect effects observed on the first showcase was around 5%, it reached more than 10% in some cell types. We consider that this still corresponds to an interesting improvement for the current task. Indeed, the indirect effect here "only" captures the coordination of immune cells in response to a cytokine stimulation, which should not necessarily change their profile drastically compared to isolated exposure.

      R2.3. How does the cell-type specific effects prediction perform by just considering the intracellular layers? The authors constructed multiple variants of ReCoN to estimate unicellular and multicellular effects. How is the variant ReCoN-grn different from full ReCoN where gamma is set to zero.

      Response:

      We are thankful for this comment, which will help us restructure section 2.2.

      ReCoN-GRN differs from the full ReCoN model, even with a gamma value of 0, as the latter includes ligand-to-receptor weights. However, ReCoN-GRN corresponds to ReCoN-generic with an alpha of 0, which does not weight ligand-to-receptor links.

      We propose to clarify this detail in the section 2.2.2 by adding after the introduction of the ReCoN-generic model the sentence: "Note that ReCoN-grn corresponds to the ReCoN-generic model with alpha set to zero, where no indirect effects are considered. It differs from the full ReCoN model with alpha set to zero, which still includes ligand-to-receptor weights through the receptor-gene bipartite network."

      R2.4. In section 2.2, authors assert that if matching datasets are not available, GRN layer can be extracted from other datasets. How well does the GRN layer from one system generalizes to the other system in terms of perturbation prediction?

      Response:

      It is, of course, a complex question, as the answer probably depends strongly on the studied system. However, we believe that, while it is important to consider similar systems, using the same samples for the cell-communication and GRN layers is not necessary.

      The first showcase that we propose explores exactly this case. We built the GRN from two unpaired datasets, and the cell communication from a third one. It provided convincing performance, justifying our earlier claim. It is additionally common practice in most methods contextualising prior knowledge, which usually comes from other samples and sometimes even other organs (Browaeys, Saelens, and Saeys 2020, Jin et al. 2021, Badia-i-Mompel et al. 2023).

      To provide additional insights, we will run the new Human Cytokine Dictionary showcase using both 1) multiomics methods on external PBMC datasets, and 2) a single-cell RNA-seq only method on the Human Cytokine Dictionary directly. We will then be able to show performance using both data types and their corresponding methods.

      To justify our claim more clearly in response to the reviewer's comment, we propose highlighting this justification in the showcase itself: ".... this showcase highlights the possibility to combine networks obtained from distinct datasets...".

      Related to combining datasets, we propose to clarify the reasons behind our choices for the Immune Dictionary showcase with the additional supplementary text proposed in response to the comment R1.4.

      Badia-i-Mompel P, Wessels L, Müller-Dott S et al. Gene regulatory network inference in the era of single-cell multi-omics. Nat Rev Genet 2023;24(11):739-54. https://doi.org/10.1038/s41576-023-00618-5.

      Browaeys R, Saelens W, Saeys Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods 2020;17(2):159-62. https://doi.org/10.1038/s41592-019-0667-5.

      Jin S, Guerrero-Juarez CF, Zhang L et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun 2021;12(1):1088. https://doi.org/10.1038/s41467-021-21246-9.

      R2.5. In the abstract, authors claimed that ReCoN can predict the effect of gene knockouts. But authors did not show any application or validation to support this claim.

      Response:

      We indeed had no showcase that could explicitly measure the performance of ReCoN for gene knockout, while this possible application was introduced in the abstract.

      We believe that ReCoN could be used in the future to infer such perturbations, but we fully agree that this claim cannot be presented without justification.

      We propose to remove the mention of gene knockout there, and to introduce it in the discussion instead, specifying that it will require dedicated experiments and constitutes a possible future extension of the work.

      R2.6. The communication between cells might be dependent on their spatial proximity. Is it possible to construct the CCC layer by incorporating the context-matched spatial data? How would that affect the performance of multicellular response prediction?

      Response:

      This is a very interesting comment, as numerous methods using spatial transcriptomics data have been published recently.

      In the current formulation, the beta coefficient Bi_j modulates the impact of cell type i on cell type j. If the spatial transcriptomics data can inform on the proximity between cell types, and on its overall impact on their communication, users could enforce more communication between some of them.

      However, as ReCoN is a cell-type-centric model, adding spatial information can only be done at a general scale, or by independently modelling spatial regions, as presented in the Microenvironments heart infarction showcase. This means that ReCoN cannot benefit from the potential of spatial transcriptomics as much as models that represent the tissue structure.

      R2.7. In the fibroblast application in Fig 4d, based on the cardiac cell types expression in region type, they are predicting fibroblast gene expression. Wouldn't the most direct benchmarking be comparison with observed fibroblast expression from the ST (after deconvolution perhaps)?

      Response:

      This was a helpful comment to guide the restructuring of the microenvironment heart infarction showcase, as we believe the overall objective of the showcase was not formulated clearly enough.

      We aim at modelling the impact of the environment on the transcriptome. As the complete transcriptome of a cell results from numerous interacting variables, we believe that the correlation between ReCoN's scores and the transcriptome would not evaluate the prediction of the environment impact.

      For this reason, we wanted to compare the results to the specific differences arising from the microenvironment. We focused on gene set enrichment, which seemed less noisy for such a comparative experiment, in particular with 10x Visium data, which has a particularly high dropout rate.

      We propose to strengthen the validation by providing molecular insights into the three groups of cells studied.

      The spatial data themselves are bulk, adding a layer of noise over the small number of genes captured by Visium. Instead of a correlation with the deconvoluted spots, we have equivalent single-cell RNA-seq fibroblast data annotated in the same study, which matches the three modelled niches. We propose to conduct a differential expression analysis here and to compute a correlation between these groups and ReCoN scores, providing a quantitative analysis.

      If the correlation is low because of the noise in the data (notably leading to the permutation of individual gene orders even if overall biological signals and gene set orders are conserved), we will additionally perform a pathway enrichment on these data, which will also enrich the qualitative validation.

      R2.8. Section 2.6 Besides the cytokine section, it is difficult to assess the added value of this approach. Likely there is a lot of valuable findings here but difficult to say because the assessment is very qualitative.

      Response:

      One of the challenges of this work was to find relevant datasets to evaluate ReCoN. We tried to complement the direct quantitative evaluation from the Immune Dictionary with another quantitative evaluation from the heart atlas multicellular programs, despite it being a much less direct validation.

      We hope that the production of new perturbation experiments over multicellular datasets, especially cell-type targeted perturbations, will provide more opportunities to validate the different findings and claims of our current manuscript.

      On a similar note, no existing method seemed to propose similar predictions to compare against. This led to the use of the NicheNet score and the current decomposition of the ReCoN model in section 2.2.1 to evaluate the contribution of the model.

      R2.9. The article is dense and writing should be reorganized for better readability.

      Minor issues -

      No p-values in figures.

      Response:

      We agree that integrating values directly in the panels would make the figure easier to read. We would like to introduce the p-values in panels 2d, 2e, 2f, and 2g. We had forgotten to indicate in the legend of panel 4d that all bold scores were associated with a p-value

      R2.10. Typo - ReCoN-genetic should be - ReCoN-generic.

      Response:

      We thank the reviewer for noticing the typo, which we corrected in the new version.

      R2.11. Authors may consider adding figures to describe their results on balance between direct and indirect effects in section 2.2.2.

      Response:

      Depending on the new findings on the indirect effect iterations, we propose adding either a panel on their combination or a supplementary figure.

      R2.12. Redundancy in the following two lines -

      o While these approaches effectively describe what tissue-wide programs are coordinated, they generally offer limited insight into the molecular mechanisms that establish or regulate these programs.

      o Despite their ability to identify coordinated tissue-wide programs, multicellular program analyses typically offer limited insight into the underlying molecular mechanisms that orchestrate these programs.

      Response:

      In the revised version of the manuscript, we propose to remove the first sentence. In our opinion, starting the next paragraph with this clarification is more helpful to guide the reader than having it at the end of the previous one.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      R2.13. The direct and indirect effects are treated in two separate steps. In reality of course these effects are operating simultaneously. I wonder if this could be better modelled by iterating through the two steps. It might be worthwhile

      trying to see if that improves the performance.

      We thank the reviewer for this interesting idea, and propose to add a supplementary text to present the result of this discussion to the readers.

      • *

      The direct effect is supposed to be measurable from the first iteration only, as we try to represent the effect of direct receptor binding. Regarding the indirect effect, iterations could be done to model the indirect effect, which could represent more distant effect in time.

      • *

      On an algorithmic note, the indirect effect already allow several "iterations" of this effect, as each random walk can loop between all cell types until restart. However, it does not allow to control the weight of the different successive transition. In practice, with a high restart probability, an extreme weight is given to the first "iteration" over the second, as there is three layers to cross to explore the next cell.

      • *

      First, we propose clarifying this section of the manuscript, to explain the depth of the indirect effect explorations.


      Biologically, it is highly possible that these iterations play an important role in explaining the complete reaction of the cells. However, we believe that this hits a major limitation of our modelling, and of RWR-based exploration in general, as it goes against the enforcement of restarts.


      We aim to represent pairwise measurements, i.e. the impact of one node on another. Random walks without restart are not well suited to this problem, as they naturally converge to a stationary distribution (Lovász 1996). In the case of ReCoN, it means that if we pushed the exploration indefinitely, every gene and receptor used as a seed would end up with the same probability of reaching each node of the system, regardless of where the walk started.


      The restart mitigates this effect and reinforces the influence of the seeds by ensuring that the walkers stay close to the seed (Tong, Faloutsos, and Pan 2006). By iterating successively from the new distribution obtained from the RWR, we would work against this restart probability and progressively converge toward the stationary distribution of classical random walks.
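
      The convergence argument above can be illustrated with a minimal sketch. It uses a hypothetical five-node toy graph and a plain power-iteration RWR, not ReCoN's multilayer implementation; the function names, the graph, and the restart value of 0.7 are assumptions chosen for illustration only. A single RWR keeps two seeds clearly distinguishable, whereas feeding each RWR output back in as the next restart distribution makes both seeds collapse onto the same stationary distribution.

```python
def column_normalize(adj):
    """Turn an adjacency matrix into a column-stochastic transition matrix."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def rwr(W, start, restart=0.7, tol=1e-12, max_iter=10_000):
    """Iterate p <- (1 - restart) * W p + restart * start until p stops changing."""
    n = len(start)
    p = list(start)
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * start[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def chained_rwr(W, seed, rounds, restart=0.7):
    """Use each RWR output as the restart distribution of the next RWR."""
    p = list(seed)
    for _ in range(rounds):
        p = rwr(W, p, restart)
    return p


def max_abs_diff(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical toy graph: a triangle (0, 1, 2) with a tail (2-3-4).
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
W = column_normalize(A)

# For a connected, non-bipartite undirected graph, the plain random walk
# converges to degree / (sum of degrees), whatever the starting node.
degrees = [sum(row) for row in A]
stationary = [d / sum(degrees) for d in degrees]

seed_a = [1.0, 0.0, 0.0, 0.0, 0.0]
seed_b = [0.0, 0.0, 0.0, 0.0, 1.0]

single_a, single_b = rwr(W, seed_a), rwr(W, seed_b)
chained_a = chained_rwr(W, seed_a, rounds=200)
chained_b = chained_rwr(W, seed_b, rounds=200)

print("single RWR, seed-to-seed difference :", max_abs_diff(single_a, single_b))
print("chained RWR, seed-to-seed difference:", max_abs_diff(chained_a, chained_b))
print("chained RWR vs stationary           :", max_abs_diff(chained_a, stationary))
```

      In this toy setting the seed-specific signal survives a single RWR but is lost after repeated chaining, which is the behaviour the argument relies on.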


      So we completely share the opinion of the reviewer that the iterative nature of the indirect effect should be explored too, but we don't believe that ReCoN can model them accurately. We hope that new exploration methods will be able to decipher the importance of these iterations, once additional arguments have been gathered to justify the global interest of considering the indirect effect.


      Bibliography:


      Lovász L. Random Walks on Graphs: A Survey. In: Combinatorics, Paul Erdős is Eighty, vol. 2. 1996:1-46.


      Tong H, Faloutsos C, Pan J yu. Fast Random Walk with Restart and Its Applications. Sixth Int Conf Data Min ICDM06 Dec. 2006:613-22. https://doi.org/10.1109/ICDM.2006.70.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary - This is an interesting paper where the authors predict the multicellular response to the molecular perturbations. The idea is somewhat novel and offers a conceptual enhancement by modelling the multicellular response as collective outcome of cell intrinsic gene regulatory changes coupled with cell-cell communication by using a simple network diffusion-based approach. We have a few comments to help strengthen the work.

      • It is not clear how well it performs in independent validations. The authors showed that it can predict the effect of cytokine perturbations in the Immune Dictionary by selecting an optimal alpha. The authors should validate that, using the same alpha value of 0.8, it is possible to accurately predict the effect of cytokine perturbations in independent datasets. This is particularly concerning for cytokine-cell type pairs where the optimal alpha is not known. Therefore, the potential utility of ReCoN to estimate the effect of multicellular perturbations is not well established.
      • The authors claimed that the optimal alpha value of 0.8 implies the dominance of the indirect effect. But in contrast to this claim, the performance across cytokine-cell type pairs only increased from 0.72 to 0.76, which seems to imply that indirect effects do not add much.
      • How does the cell-type specific effects prediction perform when considering just the intracellular layers? The authors constructed multiple variants of ReCoN to estimate unicellular and multicellular effects. How is the variant ReCoN-grn different from the full ReCoN where gamma is set to zero?
      • In section 2.2, the authors assert that if matching datasets are not available, the GRN layer can be extracted from other datasets. How well does the GRN layer from one system generalize to another system in terms of perturbation prediction?
      • In the abstract, authors claimed that ReCoN can predict the effect of gene knockouts. But authors did not show any application or validation to support this claim.
      • The communication between cells might be dependent on their spatial proximity. Is it possible to construct the CCC layer by incorporating the context matched spatial data? How would that affect the performance of multicellular response prediction?
      • The direct and indirect effects are treated in two separate steps. In reality of course these effects are operating simultaneously. I wonder if this could be better modelled by iterating through the two steps. It might be worthwhile trying to see if that improves the performance.
      • In the fibroblast application in Fig 4d, based on the cardiac cell types expression in region type, they are predicting fibroblast gene expression. Wouldn't the most direct benchmarking be comparison with observed fibroblast expression from the ST (after deconvolution perhaps)?
      • Section 2.6 Besides the cytokine section, it is difficult to assess the added value of this approach. Likely there is a lot of valuable findings here but difficult to say because the assessment is very qualitative.
      • The article is dense and writing should be reorganized for better readability.

      Minor issues

      • No p-values in figures.
      • Typo - ReCoN-genetic should be - ReCoN-generic.
      • Authors may consider adding figures to describe their results on balance between direct and indirect effects in section 2.2.2.
      • Redundancy in the following two lines -
        • While these approaches effectively describe what tissue-wide programs are coordinated, they generally offer limited insight into the molecular mechanisms that establish or regulate these programs.
        • Despite their ability to identify coordinated tissue-wide programs, multicellular program analyses typically offer limited insight into the underlying molecular mechanisms that orchestrate these programs.

      Significance

      This is an interesting paper where the authors predict the multicellular response to the molecular perturbations. The idea is somewhat novel and offers a conceptual enhancement by modelling the multicellular response as collective outcome of cell intrinsic gene regulatory changes coupled with cell-cell communication by using a simple network diffusion-based approach.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors propose an approach to model complex regulatory processes in tissue or cell collections in specific environments, taking into account intra-cellular regulatory processes at multiple levels and inter-cellular communication, importantly offering a chance to estimate the importance of indirect effects of perturbations on one cell type via processes in other cell types. Increasingly more complete models allow testing the impact of each component and of integrating data as context-specific information versus general prior knowledge. Three main use cases are provided, exploiting public datasets:

      • Prediction of the effect of specific in-vivo cytokine perturbations on mouse lymph node tissues
      • Healthy and diseased myocardium in a heart failure multiome dataset
      • Myocardial infarction spatial transcriptomics to identify how different cellular neighbourhoods are related to fibroblast phenotype and fibrosis

      The main framework is an extension of their previous HuMMuS framework, which investigates multilayer networks of regulation within a single cell type, to also consider inter-cellular interactions, thus including i) a TF-target GRN, ii) a receptor layer based on PPI, and iii) cell-cell communication based on LR interactions. These complex networks are then explored within the framework of Random Walk with Restart, which allows establishing 'interaction weights' between different nodes in the network, based on repeated simulations of spreading on the network that thus produce scores of proximity between network nodes, across possible paths. In this study, RWRs that only allow intra-cell type walks are first performed to calculate the direct interaction of a perturbation on node states; then RWRs across layers are also enabled, to calculate the importance of inter-cell interactions (via coeff gamma). The importance of each cell type is given by another coeff B that can correspond either to cell type proportions or to the spatial proximity of cell pairs, and finally the scores of within- and inter-cell interactions are weighted with a coefficient alpha.

      The central contribution that allows coupling of intra with inter-cellular interactions is the establishment of receptor-gene links. Instead of inferring it from data, they propose to express the receptor-gene matrix as: R = L ⋅ G taking ligand-receptor (L) and ligand-gene (G) adjacency matrices from NicheNet and using NNLS to compute R.

      Generally, for all these cases, comparisons of performance in inferring the effect of perturbations, the upstream regulators or the downstream targets are provided, assessed with AUROC/AUPRC values.

      • Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      This is a very well-written paper, the methods used are adequate and the use cases are relevant and broad, exploiting state of the art datasets and tools.

      The authors' claims are mostly justified. The authors could make an effort to more explicitly cite other efforts in similar directions. The claim 'We envision ReCoN as a extension to prior multicellular modelling, offering an interesting compromise between prediction of cell type responses and understanding of their molecular coordination.' is very general and could be better substantiated. In fact, the authors do not really give examples of alternative approaches to study systems of interacting cells, other than mechanistic agent-based models, which clearly are very different. Moreover, the exploration of the multilayer networks with RWR is a very reasonable choice, but could there be other approaches? I think the authors could discuss this issue to briefly support their choice of this method.

      Generally the discussion should provide the reader the context in the existing literature in which the work can be set, detailing its impact. I think this could be improved.

      Regarding the choice of datasets, it is clear that the method is quite demanding, requiring single cell and different omics to build the model, in addition to the expression dataset that is used as a use case. This inevitably leads to using a mix of datasets. For example in the mouse experiments the gene regulatory network was inferred from both a lymph node scRNA-seq dataset and a splenic scATAC-seq dataset, presumably due to the lack of multiome data in this setting. However the cell-cell communication network was inferred from the control case of the Immune Dictionary. Why can't the authors use the control data also for inferring GRNs? Is atac-seq really necessary in the inference of the GRN? What is the impact of the fact that lymph node and spleen samples might be different?


      • Please request additional experiments only if they are essential for the conclusions. Alternatively, ask the authors to qualify their claims as preliminary or speculative, or to remove them altogether.

      • If you have constructive further reaching suggestions that could significantly improve the study but would open new lines of investigations, please label them as "OPTIONAL".

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated time investment for substantial experiments.

      • Are the data and the methods presented in such a way that they can be reproduced? The code is very clear, we were able to install and run it and it is quite well-documented. However, a few more details should be given in the text regarding how the evaluation of the performance is carried out. For example: If I understand correctly, when predicting the impact of cytokine perturbations the ReCoN predictions of genes impacted are compared to differentially expressed genes identified through traditional DEG analysis. What is compared is the ranking of these genes from ReCoN with the ranking provided by DESeq2. There is no description of how this comparison of ranking gives rise to AUROC values. Also, is it just the ranking that is predicted or can they also estimate how well they can predict the effect size?
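
      One common way to obtain an AUROC from a predicted ranking, sketched here as an assumption rather than as a description of the authors' pipeline, is to binarize the reference (for example, calling a gene "positive" when it passes a DEG significance threshold) and then measure how often a positive gene receives a higher predicted score than a negative one (the Mann-Whitney formulation). The gene names, scores, adjusted p-values and the 0.05 threshold below are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive outranks a negative; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted scores (higher = more strongly predicted to respond)
predicted = {"GeneA": 0.90, "GeneB": 0.75, "GeneC": 0.40, "GeneD": 0.35, "GeneE": 0.10}
# Hypothetical adjusted p-values from a differential expression analysis
padj = {"GeneA": 0.001, "GeneB": 0.20, "GeneC": 0.03, "GeneD": 0.60, "GeneE": 0.90}

genes = sorted(predicted)
scores = [predicted[g] for g in genes]
labels = [padj[g] < 0.05 for g in genes]  # True = differentially expressed

result = auroc(scores, labels)
print(f"AUROC = {result:.3f}")
```

      With this formulation only the ordering of the predicted scores matters, so it does not by itself say anything about how well effect sizes are predicted.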

      When describing the use cases, I think a bit more detail would help. For example: 'To identify the cell-type-specific genes associated with HF, we used the MOFAcell scores of the multicellular factor 1 (MCP1) reported in ReHeat236'. I suppose the explanation is given with the dataset, but for the sake of clarity it would be good to expand this sentence to give at least an idea of the approach.

      Regarding the calculation of the R matrix from the NicheNet matrices L and G, I gather that the R matrix is calculated once and is thus fully data-independent and available just like the L and G matrices from NicheNet. This was not very clear in the tutorials.

      Also, this might just be a typo in the tutorial: 'The default α = 0.8 gives more weight to direct effects, which has been empirically validated. You can adjust this based on your biological question.' I believe the manuscript says alpha>0.5 refers to indirect effects dominating.

      Same for the pre-processing of the spatial data for the third use case, a little more details on how this was done would help the users and readers.

      • Are the experiments adequately replicated and statistical analysis adequate? I don't see issues with the statistical power of the analysis. Rather, I think the authors should provide some examination of the parameter space for their model. Whereas an analysis of the impact of the alpha parameter is provided, I believe there are several more parameters that have a crucial impact and choices for their values should be discussed.

      For example: 'In the GRN reconstruction only the links with a score above 1.5e-7 were retained in ReCoN's gene regulatory layer.' How was this chosen?

      We have identified the following parameters that are somehow justified but could be explored to have a better feel for how they impact the results:

      • Restart probability: How often the walker goes back to the starting seed/molecule
      • Layer transition probability: How often the walker stays in the same layer - different cell? - different layers? Gamma
      • Node transition within a layer: How often one jumps to a different layer
      • Weighting parameters: How much weight for direct or indirect effect to account for the combined effect - alpha - this is the only one that is explicitly explored.

      Finally, this might be considered OPTIONAL but would greatly improve the work in our opinion: The method crucially depends on the networks that are used in the different layers and to connect layers and cell types. As we know, biological data is noisy and incomplete (FP and FN) at each level and in each datatype. It would be really useful to estimate the robustness of the results to this noise. Particularly, from personal experience, we think the GRNs reconstructed from data are often almost fully connected and it is exceedingly difficult to validate them in specific contexts. This means that some 'errors' are likely to be present. Since several methods exist for inferring GRNs, one could simply compare the results using different methods for this part of the network. A related point involves the characteristics of the RWR algorithm, which will be quite impacted by the presence of hubs in these networks (either in single layers or across several), which is likely to affect the exploration. If proteins that are hubs are effectively important, that is not a problem, but in some layers, for example the receptor-receptor layer that presumably will contain PPIs, there might be biases in hubs being just better studied proteins, and these hubs might have an 'unjustified' weight in the walks. One potential approach to assess the robustness of the method to these issues could be an empirical one that just randomly perturbs the networks in ReCoN to see to what extent similar predictions are achieved.
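
      The empirical check suggested above could look roughly like the following sketch: keep a random subset of the edges, rerun a random walk with restart from the same seed, and compare the resulting node ranking with the unperturbed one using a Spearman correlation. It uses a hypothetical single-layer toy graph and a plain power-iteration RWR rather than ReCoN's multilayer implementation; all function names and parameter values are illustrative assumptions.

```python
import random


def rwr(edges, nodes, seed, restart=0.7, n_iter=100):
    """Power-iteration random walk with restart on an undirected edge list."""
    neighbours = {v: set() for v in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    p = {v: 1.0 if v == seed else 0.0 for v in nodes}
    for _ in range(n_iter):
        nxt = {v: (restart if v == seed else 0.0) for v in nodes}
        for u in nodes:
            if neighbours[u]:
                share = (1 - restart) * p[u] / len(neighbours[u])
                for v in neighbours[u]:
                    nxt[v] += share
            else:
                # Isolated node: the walker has nowhere to go, so it returns to the seed.
                nxt[seed] += (1 - restart) * p[u]
        p = nxt
    return p


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def robustness(edges, nodes, seed, drop_fraction, n_trials=20, rng_seed=0):
    """Spearman correlation between baseline and edge-dropped RWR scores, per trial."""
    rng = random.Random(rng_seed)
    base = rwr(edges, nodes, seed)
    n_keep = round(len(edges) * (1 - drop_fraction))
    correlations = []
    for _ in range(n_trials):
        kept = rng.sample(edges, n_keep)
        perturbed = rwr(kept, nodes, seed)
        correlations.append(
            spearman([base[v] for v in nodes], [perturbed[v] for v in nodes])
        )
    return correlations


# Hypothetical toy network with 8 nodes.
nodes = list(range(8))
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6), (6, 7), (1, 7)]

for fraction in (0.0, 0.1, 0.3):
    scores = robustness(edges, nodes, seed=0, drop_fraction=fraction)
    print(f"dropped {fraction:.0%} of edges: mean Spearman = {sum(scores) / len(scores):.3f}")
```

      The same loop could add spurious edges instead of removing existing ones, or drop edges preferentially around hubs, to probe the hub bias mentioned above.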

      Minor comments:

      • Specific experimental issues that are easily addressable.
      • Are prior studies referenced appropriately?
      • Are the text and figures clear and accurate?
      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Please add page numbers. Figures are nice and clear. Some specific minor points are listed here below.

      • Define hMLN on first appearance (Fig. 1 caption; no page numbers...); 2nd appearance: "heterogeneous multilayer structure (HMLN) ..."
      • Bi_j: not so clear what it refers to when first mentioned.
      • "personalized interaction specificity.": maybe a better word than personalised (contextualised?)
      • "ReCoN-genetic and ReCoN,": (generic?)
      • "responses. It is expected to observe common behaviors in-between cell-type, that the GRN and the generic CCC network already contribute captures.": not very clear

      • Figure 2b: the icon of cells with double arrows might suggest a phenotype shift, when instead this is just communication.
      • eTACs: explain the acronym and what they are.
      • "Due to very few genes being differentially expressed, only cDC1 was conserved and evaluated for IL22,": not so clear
      • "In this showcase": not very clear, use case?
      • "different fibroblast specializations": maybe phenotypes?

      Figure 4b b) Schematic view of the deconvolution process and cell type-specific count inference from the spatial niches. Not so clear what the heatmap shows, rows and columns Spots heatmap : label niche on rectangles in cols And each col is a spot Rows are cell types or cells? In the cell types x spot

      Cell2location. Add reference, maybe explain basic functionality?

      reconstructing different patients, tissues, and microenvironments to predict context-specific molecular treatments. Unclear fibrosis in different - at molecular levels

      • Figure 5d: myeloid and endothelial colour code inversed from 5 BC.
      • 5d: indicating important pathways in orange should not change the colour of the nodes (purple=common, blue or green specific). Use border colour maybe?
      • 5e is not a Venn diagram: 'e) Venn diagram showing the overlap between transcription factors (TFs) predicted by ReCoN (green) and those previously implicated in fibrosis (orange) or cardiac diseases (violet). Only the top 10 TFs were annotated from literature sources; full sizes of fibrosis- and cardiac disease-related receptor sets can therefore not be represented.'
      • f) also not a Venn diagram.
      • e/f now in supp: 'the "NABA ECM collagens" gene set. Nodes are grouped by molecular type (e.g., transcription factors, receptors, ligands), and links represent the weighted, direct regulatory interactions present in the ReCoN-constructed'

      Why a Sankey plot? Normally a Sankey plot represents flow (of regions changing from one state to another), but here this is just a weighted network? No communication from fibroblasts back to other cell types? No communication between ventricular/myeloid/lymphoid?

      • "as a extension to": an
      • "underrepresented in the current.": current framework?
      • "However, it can't represent more": cannot
      • "Borrowing representation from hypergraphs, which introduces"
      • "The network exploration implementation of ReCoN also present some limitations. limitations. While random walks with restarts offer a stable and fast exploration workflow for multilayer networks, it currently only considers positive weights to predict regulation strengths. It involves that the nature of the regulation, as activation or inhibition, has to be identified a posteriori.": check concordance/grammar

      • "Only the nodes that are included in one of the layers are present in the final results, ignoring the ones present only in bipartites.": unclear
      • "a scATAC": an
      • Barsi et al. is published: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013188
      • "effects, allowing for modulating in a second time their contribution.": word order

      others. However, it is possible to adjust the Beta coefficient to represent it based on the available information for each dataset. Represent- adjust?

      We use the latter to compare the different models. - what is the latter?

      • "It resulted in the scRNA-seq in 1,789 cells with 13,167 genes, and for the scATAC-seq in 3,759 cells with 254,545 regions.": check English
      • "GRETA pipeline.": reference

      • "We kept all the cells whose annotations through unsupervised clustering, followed by marker gene annotations, through scANVI were coherent.": word order
      • "In parallel, pairs of ligands and receptors with both associated with scores above an absolute gene loading of 0.1 were considered potential driver interactions in HF.": unclear
      • "gseapy Python": reference?

      and to calculate average for each spatial context the average cell type expression. Unclear

      • "We only used the loadings of all cell types but the fibroblasts to consider the effect of the sole environment.": unclear
      • "We realised a downstream": performed

      The profiles inferred by ReCoN were first very correlated in all three contexts. - unclear

      Significance

      Provide contextual information to readers (editors and researchers) about the novelty of the study, its value for the field and the communities that might be interested.

      This is a very timely paper, dealing with an important gap in the literature. It is not an entirely new framework, but it integrates different existing approaches to solve a complex issue in a creative way. To my knowledge, it is the first attempt to consider and formalise regulation processes involving both intra- and inter-cellular interactions. The results support the importance of distinguishing the different paths that can relate the impact of a perturbation to specific genes/functions in different cells and their overall ecosystem.

      General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed?

      The tool offers a combination of approaches, providing a coherent framework. The code is well documented and functional. The use cases are quite compelling. Sadly, the only type of validation possible involves confirmation of known facts from the literature, which makes it hard to evaluate the full impact of some of the predictions. I think the details of how the method works and especially how the performance was evaluated could be expanded, and an assessment of how different parameters and choices impact the results would also be very helpful. An effort to compare the presented variations of the method to some other approach would be very welcome, but I am finding it hard to identify what a comparable alternative approach could be.

      Advance: compare the study to the closest related results in the literature or highlight results reported for the first time to your knowledge; does the study extend the knowledge in the field and in which way? Describe the nature of the advance and the resulting insights (for example: conceptual, technical, clinical, mechanistic, functional,...).

      Potentially the closest results are models that can predict the effect of perturbations on cell line cultures. Several approaches in the literature employ either transformers or optimal transport to predict the effect of perturbations in single cell datasets. One of the main issues is an underlying necessary assumption that the perturbation effect will be larger than the heterogeneity (in cell lines for example), which becomes increasingly difficult when considering in-vivo experiments. ReCoN obviously goes beyond this by considering explicitly the presence of different cell types but distinctions of cell types are sometimes quite arbitrary and potentially application of ReCoN to some of the in-vitro culture datasets, even on cell lines, could be a way to test its performance and benchmark it against other methods. The main bottleneck in the application of this framework to 'personalisation' of therapies, mentioned even in the abstract as a potential future goal for such an approach, will be the lack of data. This approach requires single cell level descriptions of the system at hand, plus additional datasets to build the model structure. To a certain extent, public data of related tissues/contexts can be used, but it will be necessary to test the dependence of performance on coherence of the input data to develop sufficient trust to use it for new predictions, especially in a medical field.

      The authors could comment on how their method compares to others that do not require single cell level information. Despite clear differences, it might be important to show the advantage of using this more complex approach that requires data that is less available. Given the ease with which bulk profiles can be constructed from single cell data, it might be possible to compare the approaches directly. For example, see K. Wang, S. Patkar, J.S. Lee, E.M. Gertz, W. Robinson, F. Schischlik, D.R. Crawford, A.A. Schäffer, E. Ruppin Deconvolving Clinically Relevant Cellular Immune Cross-talk from Bulk Gene Expression Using CODEFACS and LIRICS Stratifies Patients with Melanoma to Anti-PD-1 Therapy

      Mike van Santvoort, Óscar Lapuente-Santana, Maria Zopoglou, Constantin Zackl, Francesca Finotello, Pim van der Hoorn, Federica Eduati, Mathematically mapping the network of cells in the tumor microenvironment, Cell Reports Methods 2025

      Audience: describe the type of audience ("specialized", "broad", "basic research", "translational/clinical", etc...) that will be interested or influenced by this research; how will this research be used by others; will it be of interest beyond the specific field?

      Broad interest to biomedical researchers and also biologists in other fields. While the method allows advances in basic research on biological process regulation, a clear clinical application can be envisaged in immuno-oncology, for example, as well as in immunology and even general molecular medicine.

      Please define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am a computational biologist with expertise in network models, regulatory networks, agent-based models and especially familiar with the tumour microenvironment and processes therein. I can more or less appreciate the meaningfulness of the biological findings related to the mouse lymph node example. I am much less of an expert on heart tissue modeling, heart failure, fibrosis etc., required to fully comprehend the impact of the second and third use cases.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this manuscript, the authors describe the discovery of a molecular regulator of the immune transcriptional program, which is activated by intestinal distension upon bacterial colonization of the C. elegans intestine. Taking advantage of the fact that inhibition of aex-5 is known to cause intestinal distension and a C-type lectin gene clec-60 as a marker for the immune response to intestinal distension (clec-60p::gfp), the authors performed a forward genetic screen for suppressors of the immune response activation. Of the two mutants isolated, they focused on the stronger suppressor, which corresponded to a cysteine-type DUB, the Ubiquitin Specific Peptidase-14 (usp-14). Through rescue experiments, phenocopy analyses, and quantitative RT-PCR, they validated usp-14 as the causal gene and initiated characterization of its role in immune response activation. To this end, the authors investigated the tissue of action, identifying the intestine as the tissue in which usp-14 mediates the regulation of the immune response. Through transcriptomic analyses, they found that the signalling pathway likely regulated by usp-14 in response to intestinal distension is the Wnt pathway, as they have observed reduction in the transcriptional level of some of the Wnt pathway components in usp-14(tm1481), in response to infection with S. aureus. Additionally, transcriptomic data indicate that usp-14 plays a role in immunity regulation even in the absence of infection. Based on these findings, the authors propose that usp-14 has a dual role in immune regulation: one in surveillance immunity, preventing overactivation of immune responses, and another as a mediator of pathogen-induced responses, such as those triggered by P. aeruginosa or S. aureus. The experiments are rigorous and the results robust; however, some points would benefit from further investigation or clarification.
      Response: We thank the reviewer for an excellent summary of our work and for the valuable feedback.

      Comment: The expression domain of usp-14 appears to be quite expanded based on single cell RNAseq data (e.g. PMID: 28818938) therefore it is likely that the transgenes used for expression analysis are lacking key regulatory information. Alternative methods like smFISH would be more appropriate to characterise the spatiotemporal pattern of usp-14 expression in more detail. Response: We thank the reviewer for this valuable suggestion. In the original version of the manuscript, we used a 714 bp region upstream of the usp-14 start codon to generate the transcriptional reporter. In the revised manuscript, we reconstructed the reporter using a longer 1924 bp upstream promoter region together with a portion of exon 1. Using this updated reporter, we observed substantially broader expression of usp-14, particularly during the early larval stages. These results are described on page 6, lines 147-152: “We next examined the spatiotemporal expression pattern of usp-14 in C. elegans. To this end, we generated transgenic worms expressing GFP under the control of the usp-14 promoter (usp-14p::gfp). During early larval development, usp-14 was broadly expressed across multiple tissues (Figure 3A). However, in L4 larvae and adult animals, expression became more restricted and was predominantly observed in the intestine and a subset of neuronal cells. Notably, both intestinal and neuronal expression persisted throughout development (Figure 3A).”

      Comment: The mutation mapped in usp-14(jsn19) is a missense mutation (E122K) that suppresses the immune response to a degree comparable to the usp-14(tm1481) deletion allele. However, the authors do not show the functional domains in Fig. 1E potentially affected by this missense mutation. Response: We have now updated Figure 1E to include the functional domains of USP-14 and mapped both the usp-14(jsn19) missense allele and the usp-14(tm1481) deletion allele onto the protein schematic.

      Comment: How USP-14 regulates Wnt and how Wnt signalling relates to activation of immune responses is not fully supported. Are the Wnt components mentioned in the study induced specifically in the intestine upon infection and does USP-14 act in the intestine in the context of this regulation? How do the authors interpret that both Wnt ligands and receptors are induced? Does Wnt signalling appear as a GO term in the transcriptomic analysis? The authors can include Wnt signalling components in the analysis of the transcriptomic results. Response: We thank the reviewer for these insightful comments. Previous studies have shown that the Wnt pathway components examined in our study are induced in the intestine upon infection and function within the intestine to regulate host defense against bacterial pathogens (PMID: 29768179; PMID: 36323254).

      We did not observe significant enrichment of Wnt signaling terms in the GO analysis of our transcriptomic dataset. We believe this is likely due to the stringent thresholds used for differential expression analysis (fold change > 2 and p At present, the precise mechanism by which USP-14 regulates Wnt pathway components remains unclear. One possibility is that USP-14 influences Wnt signaling indirectly through additional substrates or interacting proteins that regulate transcriptional outputs. We have now clarified this point in the Discussion (page 12, lines 340–345): “These observations raise the possibility that additional USP-14 substrates or interacting proteins modulate transcriptional outputs downstream of intestinal distension. Future studies aimed at identifying the direct substrates of USP-14 and defining how USP-14 interfaces with neuronal ACC-4 signaling and other distension-responsive pathways will provide important mechanistic insight into how intestinal distension is coupled to innate immune activation.”

      Regarding the simultaneous induction of Wnt ligands and receptors, we interpret this as a potential amplification or reinforcement mechanism that enhances Wnt/β-catenin signaling during infection-induced intestinal distension. However, further studies will be required to determine the mechanistic significance of this coordinated transcriptional regulation.

      Comment: Overall, in most of the figures, the micrographs are in general quite dark and exhibit poor contrast between signal and background, particularly in Fig. 1, panels B and J, and Fig. 2, panels B and F (upper rows). Even though these panels are intended to show absence of response, the outlines of the worms are difficult to discern. Response: We thank the reviewer for the feedback. We have now improved the image presentation throughout the manuscript by either increasing the intensity or adding dotted outlines to more clearly indicate worm positions.

      Comment: In Figure S3, panels A and B, the pmk-1(km25); usp-14(tm1481) animals subjected to aex-5 RNAi show some level of fluorescence/response induction comparable to pmk-1(km25) alone. This observation is not discussed in the text. Response: We have now discussed this observation in the text. These results are described on page 9, lines 240-244: “Although pmk-1(km25);usp-14(tm1481) worms displayed relatively higher GFP levels than usp-14(tm1481) single mutants upon aex-5 RNAi treatment, this effect likely reflects the elevated basal GFP expression observed in pmk-1(km25) mutants (Figure S4B). Importantly, pmk-1(km25);usp-14(tm1481) animals still exhibited significantly lower GFP levels than pmk-1(km25) single mutants.”

      Reviewer #1 (Significance (Required)): Comment: The work is interesting because it expands some previous work in the field demonstrating immune response induction as a consequence of intestinal distension even in the absence of bacterial infection. This is known to be mediated by the neuronal acetylcholine receptor ACC-4, which signals to the intestine where it regulates immune genes via the Wnt pathway. However, how USP-14 relates to ACC-4 is currently unclear and whether USP-14 function is really required in the intestine to control Wnt signalling is not demonstrated. The authors should include a model to describe how their findings relate to the previous literature and how USP-14 may link mechanistically to Wnt signalling pathway activation. Response: We thank the reviewer for this insightful comment. We agree that the relationship between USP-14, ACC-4, and Wnt signaling requires further clarification. As suggested by the reviewer, we have now included a model summarizing the current understanding of intestinal distension-induced immune activation and integrating our findings with previous literature (Figure 6H).

      Comment: It also remains unclear whether usp-14 is the only deubiquitinase involved in intestinal distension-induced signalling via the Wnt pathway, or whether other paralog usp genes might also contribute to regulation of immune-responsive transcription. Notably, several mammalian deubiquitinases have established roles in cancer suppression and inflammatory response and innate immunity in other systems so this would increase the potential significance of the work. Response: We thank the reviewer for this valuable suggestion. To systematically examine whether additional DUBs contribute to intestinal distension-induced immune activation, we performed an RNAi screen targeting all DUBs available in the Ahringer RNAi library using the aex-5(sa23);clec-60p::gfp reporter strain. Among the DUBs tested, knockdown of usp-14 produced the strongest suppression of clec-60p::gfp expression. Although knockdown of usp-5 also partially suppressed GFP induction, usp-5 RNAi did not affect survival during P. aeruginosa infection, suggesting that usp-5 is not required for host defense under these conditions. Together, these findings identify USP-14 as the major DUB required for intestinal distension-induced immune activation in our experimental system. These results are now included in Figure 1G, H, and Figure S2.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Summary: C. elegans are soil-dwelling nematodes that feed on bacteria and fungi and thus must be able to distinguish between innocuous and pathogenic species of microbes to survive. Though they lack adaptive immunity, these animals have an ancient version of an innate immune system that has no circulating sentinel or phagocytic cells yet can still mount a response to pathogen exposure. A consequence of the mode of infection of some ingested bacterial pathogens is intestinal distension which by itself, even in the absence of pathogens, is sufficient to trigger the expression of genes encoding immune effectors, including proteins that are bactericidal. The complete mechanistic scheme connecting intestinal distension to the expression of immunity genes has not been resolved, motivating the authors to perform a forward genetic screen for additional components of this pathway. One mutant that the authors isolated was usp-14, encoding an evolutionarily conserved deubiquitinating enzyme. Functional analysis revealed that usp-14 confers protection from microbial pathogens and that the intestine is its primary site of action for its role in host defense. The authors' data indicate that while USP-14 regulates the expression of innate immunity genes that are induced by intestinal distension, surprisingly it functions independently of several canonical innate immune signaling pathways, including the pmk-1/p38 MAPK pathway. Instead, USP-14 appears to act through Wnt signaling to regulate immune effectors by upregulating the expression of several components of that pathway, including the C. elegans β-catenin ortholog bar-1. This places usp-14 within a gut-brain axis previously shown to control the C. elegans innate immune response through acetylcholine-mediated activation of Wnt signaling. The authors' findings provide new mechanistic insight into this pathway and add to the understanding of ubiquitination as an immune regulatory module. Response: We thank the reviewer for providing an excellent summary of our work.

      Major comments: 1. There are three types of experiments in which the authors use the same set of controls across several different figure panels, as stated in the legend to Figure 2. First, when quantifying GFP levels of clec-60p::gfp in RNAi-treated animals, the authors use the same clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls for Fig. 1K, 2C, and 2G. For infection assays with S. aureus NCTC8325, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2E are the same as the ones used in Fig. 1M. Similarly, for infection assays with P. aeruginosa PA14, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2I are the same as those used for Fig. 1I. In each case, if the authors in fact collected all of the data for each strain that they studied at the same time but then chose to parse larger datasets into separate figure panels to make it clearer to the reader, then this approach is valid but the authors need to explicitly state that this is what they did. However, if the data pertaining to the control strains were collected at a different time or if they come from a separate biological replicate, then re-using data from the controls is not appropriate because it would not accurately reflect the specific conditions of the experiment to which the data are being compared. If this is indeed the scenario, then the authors will need to repeat these experiments and include the appropriate control in each iteration. Response: While preparing the manuscript, these experiments were performed simultaneously. Therefore, all panels that share controls have results from experiments performed simultaneously and represent the same biological replicate. We have added this additional information in the relevant figure legends.

      Comment: 2. From the legends describing figure panels that include data pertaining to clec-60p::gfp expression levels as assessed by fluorescence microscopy it seems that, in general, the authors measured GFP fluorescence in about 30 animals to produce quantitative data. How many biological replicates of these types of experiments were carried out? This is not explicitly stated in the section describing fluorescence imaging in the Methods section. Following the description of their methodology regarding statistical analysis of survival curves from microbial infection assays, however, the authors state that, "[a]ll experiments were performed independently at least three times unless otherwise noted." Does this statement apply to microscopy or only to experiments involving infection assays? If the data reporting quantitation of GFP signal is based on only 30 animals, then additional biological replicates are necessary, along with appropriate statistical analyses. Response: The quantified GFP fluorescence data are derived from three independent biological replicates. In each experiment, we typically imaged and quantified approximately 10 worms per condition, yielding a total of ~30 worms analyzed per genotype or treatment across all replicates (except Figure S1B, where we had two independent replicates). We have added the number of experiments in the figure legends for these data.

      Comment: 3. The authors have made all of the RNASeq data publicly available on the Sequence Read Archive, and they include data from several pairwise comparisons for differential gene expression analysis in their supplemental files. One of the most important facts to come out of the authors' Gene Ontology analyses of their RNASeq data is that the genes that are upregulated in a usp-14-dependent manner upon intestinal distension are enriched for those whose products play a role in innate immunity/host defense. The authors should say more about these genes. Are there any commonalities between them with regard to function? Are any of them targets of transcription factors that are known to function in C. elegans innate immunity? If so, this could provide clues as to what the substrates of USP-14 might be. Importantly, the specific identity of the genes assigned in the GO analyses to biological processes pertaining to innate immunity and host defense should be revealed in a supplemental file, and designated as being dependent on or independent of usp-14 for their expression during intestinal distension. Response: We thank the reviewer for this insightful suggestion. We have now expanded the Results section to describe the functional categories enriched among the USP-14-dependent intestinal distension-induced immune genes, including C-type lectins, ShK toxin domain-containing proteins, and lysozymes (page 7, lines 193-195).

      In addition, we compared our transcriptomic dataset with previously published transcription factor-regulated gene sets using WormExp analysis and identified a substantial overlap with genes regulated by the GATA transcription factor ELT-2. These new analyses are described on page 7, lines 196-206: “To identify transcription factors potentially involved in intestinal distension-induced immune activation, we performed transcription factor enrichment analysis using WormExp on genes upregulated in N2 worms following aex-5 RNAi treatment. This analysis revealed a substantial overlap between aex-5 RNAi-induced genes and genes regulated by the GATA transcription factor ELT-2 (Figure S3D). We next examined whether USP-14-dependent immune genes overlapped with ELT-2-dependent immunity genes induced by intestinal distension. To this end, we identified innate immune genes common to both ELT-2-regulated gene sets and aex-5 RNAi-induced genes. Strikingly, these ELT-2-dependent intestinal distension-induced immune genes showed substantial overlap with USP-14-dependent immune genes (Figure S3E and Table S5), suggesting that USP-14 may regulate distension-induced immunity, at least in part, through ELT-2-dependent transcriptional programs.”

      Finally, we have created a new table (Table S5) that specifies the identity of the genes assigned in the GO analyses to biological processes pertaining to innate immunity and host defense, for USP-14-dependent and independent genes.

      Comment: 4. The authors' data suggest that in response to bacterial infection USP-14 upregulates the expression of bar-1, along with other components of the Wnt signaling pathway, which in turn upregulates innate immunity genes. This could be further substantiated by directly demonstrating that there are USP-14-regulated innate immunity genes whose induced expression in the presence of microbial pathogens also requires bar-1. Along those lines, an initial test would be to assess clec-60p::gfp expression in bar-1 animals versus bar-1;usp-14 double mutants, similar to the experiment whose results are reported in Fig. S4. If generating the bar-1;usp-14 double mutant is not feasible, then RNAi could be used to knock down bar-1 expression in clec-60p::gfp;usp-14(tm1481) animals. To expand this analysis, the expression of the six innate immunity genes shown to be regulated upon intestinal distension in a usp-14-dependent manner could be measured in the presence and absence of intestinal distension or microbial infection in bar-1 and bar-1;usp-14 animals by qRT-PCR. At a minimum, the authors should conduct a bioinformatics analysis to compare the USP-14-regulated innate immunity genes identified in their RNAseq studies to lists of known BAR-1 transcriptional targets to look for potential overlap. Response: We agree that extending these analyses to qRT-PCR experiments examining additional immune genes would be informative. However, both bar-1 mutants and bar-1 RNAi-treated worms exhibited severe developmental and physiological defects, including sick and dead animals during development, likely reflecting the pleiotropic developmental roles of BAR-1. Although fluorescence imaging and survival assays could be performed by selectively transferring surviving adults, we were concerned that bulk collection of worms for qRT-PCR analyses would introduce confounding effects arising from developmental defects and reduced viability.

      To further address the reviewer’s suggestion, we carried out a comparative analysis between USP-14-dependent intestinal distension-induced immune genes and previously identified BAR-1-dependent immune genes. Although transcriptome-wide datasets for BAR-1-dependent pathogen-induced immune genes are not currently available, an earlier study identified seven immune response genes regulated by BAR-1 during infection (PMID: 18981407). We found that six of these genes overlap with the USP-14-dependent intestinal distension-induced immune genes identified in our study. These analyses have now been added to the Results section and included in Table S5.

      Comment: 5. While in their Discussion section the authors mention evolutionarily conserved roles for protein ubiquitination as a means of immunomodulation, there are few if any comments regarding ubiquitination as a regulatory scheme in C. elegans innate immunity or how their findings enhance our understanding of this phenomenon. Ubiquitination affects C. elegans immunity at multiple levels, from avoidance behavior to gene regulation, and it seems appropriate for the authors to address this in order to more fully contextualize their findings. Response: We thank the reviewer for the suggestion. We have now added a new paragraph to the Discussion that places our findings in the context of the existing literature on ubiquitination, deubiquitination, and innate immunity in C. elegans. The discussion is added on pages 10-11, lines 295-308: “Although ubiquitin-mediated signaling has emerged as a central regulator of innate immunity across metazoans (Jiang & Chen, 2011; Mello-Vieira & Dikic, 2026), the contribution of DUBs to host defense in C. elegans remains poorly understood. Previous studies in C. elegans have shown that ubiquitin-dependent processes regulate diverse aspects of immunity, including immune surveillance, xenophagy, and pathogen tolerance (Garcia-Sanchez et al, 2021). Perturbations in proteasome function have also been shown to activate surveillance immunity (Ghosh & Singh, 2026; Troemel et al, 2026), highlighting the importance of ubiquitin-associated pathways in sensing pathogen-induced cellular damage. However, most prior studies have focused on ubiquitin ligases, proteasome-associated pathways, or global ubiquitin signaling rather than on specific DUBs directly regulating antibacterial immune responses. To our knowledge, our study provides the first direct evidence that a specific DUB regulates antibacterial innate immunity in C. elegans. Thus, our findings establish USP-14 as a previously unrecognized regulator of host defense and identify deubiquitination as an important regulatory layer in intestinal distension-mediated immunity.”

      Minor comments: 1. In the Results section, the authors state that "[k]nockdown of cec-10 led to only a marginal decrease in survival during P. aeruginosa infection" (lines 92 and 93) and that cec-10 "has minimal impact on C. elegans survival during infection" (lines 93 and 94). However, as reported in Supplemental Table 5 the magnitude of the calculated difference in mean survival time between animals treated with RNAi targeting cec-10 and untreated control animals (-20% to -24% and statistically significant in 3/3 replicates) closely approximates the difference in mean survival between usp-14 mutants and controls (-19% to -28% and statistically significant in 3/3 replicates), which the authors clearly find to be significant. If by this metric usp-14 is important for host defense, then so too is cec-10. In light of this, the authors should use different language to describe the impact of cec-10 knockdown on the susceptibility of C. elegans to microbial infection and the potential role of cec-10 in immunity.

      Response: We chose not to pursue cec-10 further primarily because it lacks a clear human homolog and because the mutant exhibited reduced expression of the co-injection marker, raising the possibility of broader transgene-related effects. We have modified the text on page 4, lines 92-96: “Knockdown of cec-10 resulted in a significant reduction in survival during P. aeruginosa infection (Figure S1C). However, we did not pursue cec-10 further for two reasons: (i) cec-10(jsn20) mutants exhibited a modest but significant reduction in the myo-2p::mCherry co-injection marker (Figure 1D), raising the possibility of broader transgene-related defects, and (ii) cec-10 lacks a clear human homolog.”

      Comment: 2. All of the micrographs in Fig. 1B appear very dark. The GFP expression in the control animals appears dim, making it difficult for the reader to compare the signal in those animals to the GFP expression levels in the mutants. I recommend adjusting the brightness level in an equivalent manner across all of the micrographs to account for this. Response: We have increased the brightness of all the images, as suggested by the reviewer.

      Comment: 3. Fig. 1E depicts a gene structure diagram for usp-14 with the position of the point mutation in the jsn19 allele isolated in the authors' forward genetic screen indicated by the amino acid substitution symbol drawn over the second exon. Instead of mixing gene- and protein-level information about the jsn19 allele, I recommend replacing the gene structure diagram with a domain structure diagram of the USP-14 protein that depicts the conserved C19 peptidase and ubiquitin-like domains. The relative position of the E122K substitution should still be noted. Response: We have now updated Figure 1E to include the functional domains of USP-14 and mapped both the usp-14(jsn19) missense allele and the usp-14(tm1481) deletion allele onto the protein schematic.

      Comment: 4. Since all of the information in Fig. 1F appears elsewhere in the text, I recommend eliminating this panel. Response: We have removed it.

      Comment: 5. Regarding the RNAseq analysis, the authors state that 1241 genes are upregulated upon aex-5 knockdown (line 162). The authors then ask which of these genes are regulated by usp-14 in the context of intestinal distension and find that 633 are upregulated in a usp-14-dependent manner when aex-5 is targeted by RNAi and that 595 are upregulated even in the absence of usp-14 (Fig. 3D). This accounts for 1228 genes in total, not 1241. Can the authors explain this discrepancy? Response: We thank the reviewer for carefully noting this discrepancy. The difference arises from the criteria used to classify genes into the categories shown in Figure 5D (previously Figure 3D). Specifically, genes uniquely upregulated in usp-14(tm1481) worms were defined as genes that were either exclusively induced in usp-14(tm1481) worms or expressed at levels more than 2-fold higher in usp-14(tm1481) worms compared to N2 worms. During this classification, 13 genes that were initially identified as upregulated in N2 worms following aex-5 RNAi were found to be expressed at levels more than 2-fold higher in usp-14(tm1481) worms than in N2 worms (Table S4). These genes were therefore reassigned to the “usp-14(tm1481)-specific” category in the Venn diagram. Consequently, the total number of genes represented in the Venn diagram becomes 1228 instead of 1241. To clarify this point, we have now added an explanation to the figure legend.

      Comment: 6. For the sake of clarity, in the legend to Fig. 3D I recommend expanding the description of the categories of genes depicted in the Venn diagram by using the same language as in the first worksheet of Supplemental Table 4. Response: We thank the reviewer for the suggestion. We have now added these details to the legend of Figure 5D (previously Figure 3D). The legend reads: “(D) Venn diagram showing the overlap between genes upregulated upon aex-5 RNAi in N2 and usp-14(tm1481) worms. The GO analyses for the biological processes of unique and common genes are shown. USP-14-dependent genes were defined as genes that were either exclusively upregulated in N2 worms or expressed at levels greater than 2-fold higher in N2 worms than in usp-14(tm1481) worms. USP-14-independent genes were defined as genes upregulated in both N2 and usp-14(tm1481) worms with expression differences of less than 2-fold between the two strains. Genes uniquely upregulated in usp-14(tm1481) worms were defined as genes that were either exclusively induced in usp-14(tm1481) worms or expressed at levels greater than 2-fold higher in usp-14(tm1481) worms than in N2 worms. Thirteen genes classified as upregulated in N2 worms were more than 2-fold higher in usp-14(tm1481) worms than in N2 worms (Table S4) and were therefore included in the usp-14(tm1481)-specific category.”
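      The classification rule quoted in the legend can be sketched in a few lines of Python. This is only an illustrative reading of the legend, not the authors' pipeline: the function name `classify`, its arguments, and the use of a single per-gene expression level in each strain are assumptions made here, and the order in which the conditions are checked is chosen so that the thirteen reassigned genes fall into the usp-14(tm1481)-specific category, as the response describes. The final assertion only re-checks the gene counts reported in the response to Comment 5.

```python
def classify(up_in_n2: bool, up_in_mutant: bool,
             n2_level: float, mutant_level: float,
             fold: float = 2.0) -> str:
    """Assign one gene to a Venn category (hypothetical helper).

    up_in_n2 / up_in_mutant: whether aex-5 RNAi upregulates the gene in
    N2 or in usp-14(tm1481). n2_level / mutant_level: expression in each
    strain after aex-5 RNAi (what exactly is compared is an assumption).
    """
    # Mutant-specific is checked first, so a gene that is upregulated in N2
    # but more than 2-fold higher in the mutant is reassigned to this category.
    if up_in_mutant and (not up_in_n2 or mutant_level > fold * n2_level):
        return "usp-14(tm1481)-specific"
    if up_in_n2 and (not up_in_mutant or n2_level > fold * mutant_level):
        return "USP-14-dependent"
    if up_in_n2 and up_in_mutant:
        return "USP-14-independent"
    return "not induced"


# Gene counts taken from the response to Comment 5.
N2_UPREGULATED = 1241
REASSIGNED = 13
DEPENDENT = 633
INDEPENDENT = 595

# The Venn diagram shows 1228 N2-upregulated genes: 1241 minus the 13 reassigned.
assert DEPENDENT + INDEPENDENT == N2_UPREGULATED - REASSIGNED == 1228
```

      Under this reading, a gene induced in both strains at levels within 2-fold of each other is USP-14-independent, while a gene induced in N2 but more than 2-fold higher in usp-14(tm1481) is counted as mutant-specific rather than in either N2 category.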

      Comment: 7. In Fig. 4B, the authors' annotation indicates that there is a statistically significant difference (**, p

      Comment: 8. In Fig. S5, the shade of blue used to represent the data from the nhr-49(nr2041);usp-14(tm1481);clec-60p::gfp animals in panel E is different from that used to represent data from the same animals in panel B. This breaks the pattern of all of the other panels of this figure in which the data pertaining to a given phenotype are depicted in the same color. Also, in the symbol key in panel E there is an extra semi-colon before clec-60p::gfp that should be eliminated in the second genotype notation. Response: We thank the reviewer for carefully examining the figure and for bringing these issues to our attention. We have made the changes.

      Comment: 9. The authors' data show that USP-14 regulates bar-1 expression, and in the Discussion section they mention that in mammals beta-catenin is a substrate of USP14. Can the authors comment on the possibility of/evidence for BAR-1 autoregulation in C. elegans and the prospect of it being facilitated by USP-14? This could be a minor point to add to the Discussion. Response: In both contexts, USP-14 appears to stabilize BAR-1 by regulating it at either the transcriptional or post-translational level. However, it is currently unknown whether BAR-1 regulates USP-14 expression and thereby participates in an autoregulatory mechanism. Nevertheless, we have added to the Discussion that USP14 may regulate the Wnt pathway through both transcriptional and post-translational mechanisms, depending on the biological context.

      Reviewer #2 (Significance (Required)): The study described in this manuscript ties in to the findings from two prior genetic screens carried out in C. elegans that aimed to identify immune regulators (Ren et al., Cell Reports, 2022 and Labed et al., Immunity, 2018). Though their strategies differed, both of these previous studies uncovered a role for acetylcholine receptors in modulating the response to ingested microbial pathogens, especially when infection is associated with intestinal distension, indicating that a neuron-to-gut axis controls innate immunity in C. elegans. Labed and colleagues were the first to show that activation of this pathway results in the upregulation of genes encoding Wnt signaling pathway components, including the worm ortholog of beta-catenin called bar-1, which are necessary for the expression of immune effectors in the intestine. The Labed study also revealed that protein ubiquitination could contribute to regulating host defense gene induction because knockdown of lin-23, the substrate-binding subunit of a ubiquitin ligase complex that mediates BAR-1 degradation, results in constitutive expression of clec-60p::gfp, the same transcription reporter used by Ghosh and Singh as a readout for the expression of innate immunity genes. In their screen that revisits the Ren et al. approach, Ghosh and Singh find that another protein implicated in regulating protein stability via ubiquitination status, USP-14, also controls the expression of innate immunity genes in response to intestinal distension. Interestingly, their data indicate that it does so by upregulating bar-1. This discovery therefore adds an element of mechanistic detail regarding the regulation of Wnt signaling in immunity. While the Labed data suggest that ubiquitination may regulate BAR-1 at the post-translational level, Ghosh and Singh's results indicate a second layer of regulation of bar-1 at the transcriptional level that also appears to involve ubiquitination. In this case, USP-14 is predicted to modulate the ubiquitination status of a yet-to-be-identified substrate that directly or indirectly governs bar-1 expression. The authors' findings thus bring the field closer to having a complete picture of the Ach-Wnt pathway in C. elegans. As they point out in the Discussion section of their manuscript, ubiquitination is an evolutionarily conserved yet complex means of tuning the immune system. The work described here helps to shed light on this important immune regulatory mode and could have implications for aspects of epithelial immunity that are in common to both invertebrates and vertebrates.

      Response: We thank the reviewer for providing such a thoughtful overview of the field and for placing our findings in the context of previous studies on intestinal distension-induced immunity in C. elegans. We also sincerely appreciate the reviewer’s constructive feedback and insightful comments, which have helped us improve the quality and clarity of the manuscript.

      My research interest and specific area of expertise pertains to evolutionarily conserved genetic pathways that control healthspan through affecting cellular resilience later in life. Using C. elegans as a surrogate for aging humans, my group studies age-dependent changes in the activity of regulatory modules that protect older animals from the molecular damage associated with intrinsic and extrinsic sources of cellular stress, with a particular emphasis on microbial infection and oxidative stress.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      C. elegans are soil-dwelling nematodes that feed on bacteria and fungi and thus must be able to distinguish between innocuous and pathogenic species of microbes to survive. Though they lack adaptive immunity, these animals have an ancient version of an innate immune system that has no circulating sentinel or phagocytic cells yet can still mount a response to pathogen exposure. A consequence of the mode of infection of some ingested bacterial pathogens is intestinal distension which by itself, even in the absence of pathogens, is sufficient to trigger the expression of genes encoding immune effectors, including proteins that are bactericidal. The complete mechanistic scheme connecting intestinal distension to the expression of immunity genes has not been resolved, motivating the authors to perform a forward genetic screen for additional components of this pathway. One mutant that the authors isolated was usp-14, encoding an evolutionarily conserved deubiquitinating enzyme. Functional analysis revealed that usp-14 confers protection from microbial pathogens and that the intestine is its primary site of action for its role in host defense. The authors' data indicate that while USP-14 regulates the expression of innate immunity genes that are induced by intestinal distension, surprisingly it functions independently of several canonical innate immune signaling pathways, including the pmk-1/p38 MAPK pathway. Instead, USP-14 appears to act through Wnt signaling to regulate immune effectors by upregulating the expression of several components of that pathway, including the C. elegans β-catenin ortholog bar-1. This places usp-14 within a gut-brain axis previously shown to control the C. elegans innate immune response through acetylcholine-mediated activation of Wnt signaling. The authors' findings provide new mechanistic insight into this pathway and add to the understanding of ubiquitination as an immune regulatory module.

      Major comments

      1. There are three types of experiments in which the authors use the same set of controls across several different figure panels, as stated in the legend to Figure 2. First, when quantifying GFP levels of clec-60p::gfp in RNAi-treated animals, the authors use the same clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls for Fig. 1K, 2C, and 2G. For infection assays with S. aureus NCTC8325, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2E are the same as the ones used in Fig. 1M. Similarly, for infection assays with P. aeruginosa PA14, the survival plots for the clec-60p::gfp and usp-14(jsn19);clec-60p::gfp controls shown in Fig. 2I are the same as those used for Fig. 1I. In each case, if the authors in fact collected all of the data for each strain that they studied at the same time but then chose to parse larger datasets into separate figure panels to make it clearer to the reader, then this approach is valid but the authors need to explicitly state that this is what they did. However, if the data pertaining to the control strains were collected at a different time or if they come from a separate biological replicate, then re-using data from the controls is not appropriate because it would not accurately reflect the specific conditions of the experiment to which the data are being compared. If this is indeed the scenario, then the authors will need to repeat these experiments and include the appropriate control in each iteration.
      2. From the legends describing figure panels that include data pertaining to clec-60p::gfp expression levels as assessed by fluorescence microscopy it seems that, in general, the authors measured GFP fluorescence in about 30 animals to produce quantitative data. How many biological replicates of these types of experiments were carried out? This is not explicitly stated in the section describing fluorescence imaging in the Methods section. Following the description of their methodology regarding statistical analysis of survival curves from microbial infection assays, however, the authors state that, "[a]ll experiments were performed independently at least three times unless otherwise noted." Does this statement apply to microscopy or only to experiments involving infection assays? If the data reporting quantitation of GFP signal is based on only 30 animals, then additional biological replicates are necessary, along with appropriate statistical analyses.
      3. The authors have made all of the RNASeq data publicly available on the Sequence Read Archive, and they include data from several pairwise comparisons for differential gene expression analysis in their supplemental files. One of the most important facts to come out of the authors' Gene Ontology analyses of their RNASeq data is that the genes that are upregulated in a usp-14-dependent manner upon intestinal distension are enriched for those whose products play a role in innate immunity/host defense. The authors should say more about these genes. Are there any commonalities between them with regard to function? Are any of them targets of transcription factors that are known to function in C. elegans innate immunity? If so, this could provide clues as to what the substrates of USP-14 might be. Importantly, the specific identity of the genes assigned in the GO analyses to biological processes pertaining to innate immunity and host defense should be revealed in a supplemental file, and designated as being dependent on or independent of usp-14 for their expression during intestinal distension.
      4. The authors' data suggest that in response to bacterial infection USP-14 upregulates the expression of bar-1, along with other components of the Wnt signaling pathway, which in turn upregulates innate immunity genes. This could be further substantiated by directly demonstrating that there are USP-14-regulated innate immunity genes whose induced expression in the presence of microbial pathogens also requires bar-1. Along those lines, an initial test would be to assess clec-60p::gfp expression in bar-1 animals versus bar-1;usp-14 double mutants, similar to the experiment whose results are reported in Fig. S4. If generating the bar-1;usp-14 double mutant is not feasible, then RNAi could be used to knock down bar-1 expression in clec-60p::gfp;usp-14(tm1481) animals. To expand this analysis, the expression of the six innate immunity genes shown to be regulated upon intestinal distension in a usp-14-dependent manner could be measured in the presence and absence of intestinal distension or microbial infection in bar-1 and bar-1;usp-14 animals by qRT-PCR. At a minimum, the authors should conduct a bioinformatics analysis to compare the USP-14-regulated innate immunity genes identified in their RNAseq studies to lists of known BAR-1 transcriptional targets to look for potential overlap.
      5. While in their Discussion section the authors mention evolutionarily conserved roles for protein ubiquitination as a means of immunomodulation, there are few if any comments regarding ubiquitination as a regulatory scheme in C. elegans innate immunity or how their findings enhance our understanding of this phenomenon. Ubiquitination affects C. elegans immunity at multiple levels, from avoidance behavior to gene regulation, and it seems appropriate for the authors to address this in order to more fully contextualize their findings.
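
      The overlap analysis suggested in major comment 4 could be sketched roughly as follows. This is only an illustration under stated assumptions: the gene identifiers, the list contents, and the size of the gene universe are hypothetical placeholders rather than values from the manuscript, and a one-sided hypergeometric test is just one reasonable choice of statistic for asking whether two gene lists share more members than expected by chance.

```python
from math import comb


def overlap_p_value(universe_size, set_a, set_b):
    """Return (overlap size, one-sided hypergeometric p-value).

    The p-value is the probability of seeing at least the observed number
    of shared genes if set_b were drawn at random from a universe of
    universe_size genes that contains all of set_a.
    """
    n_a, n_b = len(set_a), len(set_b)
    k = len(set_a & set_b)
    total = comb(universe_size, n_b)
    # Sum the upper tail P(X >= k); comb() returns 0 for impossible draws.
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return k, tail / total


if __name__ == "__main__":
    # Hypothetical placeholder lists, NOT data from the manuscript.
    usp14_dependent = {"geneA", "geneB", "geneC", "geneD"}
    bar1_targets = {"geneC", "geneD", "geneE"}
    # Assumed number of genes tested in the RNA-seq experiment.
    universe = 20000
    shared, p = overlap_p_value(universe, usp14_dependent, bar1_targets)
    print(f"shared genes: {shared}, p = {p:.3g}")
```

      In practice the universe should be the set of genes actually detected in the RNA-seq experiment, since using the whole genome tends to overstate significance.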

      Minor comments

      1. In the Results section, the authors state that "[k]nockdown of cec-10 led to only a marginal decrease in survival during P. aeruginosa infection" (lines 92 and 93) and that cec-10 "has minimal impact on C. elegans survival during infection" (lines 93 and 94). However, as reported in Supplemental Table 5 the magnitude of the calculated difference in mean survival time between animals treated with RNAi targeting cec-10 and untreated control animals (-20% to -24% and statistically significant in 3/3 replicates) closely approximates the difference in mean survival between usp-14 mutants and controls (-19% to -28% and statistically significant in 3/3 replicates), which the authors clearly find to be significant. If by this metric usp-14 is important for host defense, then so too is cec-10. In light of this, the authors should use different language to describe the impact of cec-10 knockdown on the susceptibility of C. elegans to microbial infection and the potential role of cec-10 in immunity.
      2. All of the micrographs in Fig. 1B appear very dark. The GFP expression in the control animals appears dim, making it difficult for the reader to compare the signal in those animals to the GFP expression levels in the mutants. I recommend adjusting the brightness level in an equivalent manner across all of the micrographs to account for this.
      3. Fig. 1E depicts a gene structure diagram for usp-14 with the position of the point mutation in the jsn19 allele isolated in the authors' forward genetic screen indicated by the amino acid substitution symbol drawn over the second exon. Instead of mixing gene- and protein-level information about the jsn19 allele, I recommend replacing the gene structure diagram with a domain structure diagram of the USP-14 protein that depicts the conserved C19 peptidase and ubiquitin-like domains. The relative position of the E122K substitution should still be noted.
      4. Since all of the information in Fig. 1F appears elsewhere in the text, I recommend eliminating this panel.
      5. Regarding the RNAseq analysis, the authors state that 1241 genes are upregulated upon aex-5 knockdown (line 162). The authors then ask which of these genes are regulated by usp-14 in the context of intestinal distension and find that 633 are upregulated in a usp-14-dependent manner when aex-5 is targeted by RNAi and that 595 are upregulated even in the absence of usp-14 (Fig. 3D). This accounts for 1228 genes in total, not 1241. Can the authors explain this discrepancy?
      6. For the sake of clarity, in the legend to Fig. 3D I recommend expanding the description of the categories of genes depicted in the Venn diagram by using the same language as in the first worksheet of Supplemental Table 4.
      7. In Fig. 4B, the authors' annotation indicates that there is a statistically significant difference (**, p<0.01) in the fluorescence signal from clec-60p::gfp in usp-14(jsn19);aex-5(sa23);clec-60p::gfp_EV versus usp-14(jsn19);aex-5(sa23);clec-60p::gfp_bar-1 animals. This is likely a typographical error that should be changed to "ns" to indicate no significant difference in the fluorescence signal between these two groups, which is consistent with what the data show and with the authors' description of these data in the text (lines 211-214).
      8. In Fig. S5, the shade of blue used to represent the data from the nhr-49(nr2041);usp-14(tm1481);clec-60p::gfp animals in panel E is different from that used to represent data from the same animals in panel B. This breaks the pattern of all of the other panels of this figure, in which the data pertaining to a given phenotype are depicted in the same color. Also, in the symbol key in panel E there is an extra semi-colon before clec-60p::gfp in the second genotype notation that should be removed.
      9. The authors' data show that USP-14 regulates bar-1 expression, and in the Discussion section they mention that in mammals beta-catenin is a substrate of USP14. Can the authors comment on the possibility of/evidence for BAR-1 autoregulation in C. elegans and the prospect of it being facilitated by USP-14? This could be a minor point to add to the Discussion.

      Significance

      The study described in this manuscript ties in to the findings from two prior genetic screens carried out in C. elegans that aimed to identify immune regulators (Ren et al., Cell Reports, 2022 and Labed et al., Immunity, 2018). Though their strategies differed, both of these previous studies uncovered a role for acetylcholine receptors in modulating the response to ingested microbial pathogens, especially when infection is associated with intestinal distension, indicating that a neuron-to-gut axis controls innate immunity in C. elegans. Labed and colleagues were the first to show that activation of this pathway results in the upregulation of genes encoding Wnt signaling pathway components, including the worm ortholog of beta-catenin called bar-1, which are necessary for the expression of immune effectors in the intestine. The Labed study also revealed that protein ubiquitination could contribute to regulating host defense gene induction because knockdown of lin-23, the substrate-binding subunit of a ubiquitin ligase complex that mediates BAR-1 degradation, results in constitutive expression of clec-60p::gfp, the same transcription reporter used by Ghosh and Singh as a readout for the expression of innate immunity genes. In their screen that revisits the Ren et al. approach, Ghosh and Singh find that another protein implicated in regulating protein stability via ubiquitination status, USP-14, also controls the expression of innate immunity genes in response to intestinal distension. Interestingly, their data indicate that it does so by upregulating bar-1. This discovery therefore adds an element of mechanistic detail regarding the regulation of Wnt signaling in immunity. While the Labed data suggest that ubiquitination may regulate BAR-1 at the post-translational level, Ghosh and Singh's results indicate a second layer of regulation of bar-1 at the transcriptional level that also appears to involve ubiquitination. In this case, USP-14 is predicted to modulate the ubiquitination status of a yet-to-be-identified substrate that directly or indirectly governs bar-1 expression. The authors' findings thus bring the field closer to having a complete picture of the ACh-Wnt pathway in C. elegans. As they point out in the Discussion section of their manuscript, ubiquitination is an evolutionarily conserved yet complex means of tuning the immune system. The work described here helps to shed light on this important immune regulatory mode and could have implications for aspects of epithelial immunity that are common to both invertebrates and vertebrates.

      My research interest and specific area of expertise pertains to evolutionarily conserved genetic pathways that control healthspan through affecting cellular resilience later in life. Using C. elegans as a surrogate for aging humans, my group studies age-dependent changes in the activity of regulatory modules that protect older animals from the molecular damage associated with intrinsic and extrinsic sources of cellular stress, with a particular emphasis on microbial infection and oxidative stress.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors describe the discovery of a molecular regulator of the immune transcriptional program, which is activated by intestinal distension upon bacterial colonization of the C. elegans intestine. Taking advantage of the fact that inhibition of aex-5 is known to cause intestinal distension, and using the C-type lectin gene clec-60 as a marker for the immune response to intestinal distension (clec-60p::gfp), the authors performed a forward genetic screen for suppressors of the immune response activation. Of the two mutants isolated, they focused on the stronger suppressor, which corresponded to a cysteine-type DUB, the Ubiquitin Specific Peptidase-14 (usp-14). Through rescue experiments, phenocopy analyses, and quantitative RT-PCR, they validated usp-14 as the causal gene and initiated characterization of its role in immune response activation. To this end, the authors investigated the tissue of action, identifying the intestine as the tissue in which usp-14 mediates the regulation of the immune response. Through transcriptomic analyses, they found that the signalling pathway likely regulated by usp-14 in response to intestinal distension is the Wnt pathway, as they observed a reduction in the transcriptional level of some of the Wnt pathway components in usp-14(tm1481) in response to infection with S. aureus. Additionally, transcriptomic data indicate that usp-14 plays a role in immunity regulation even in the absence of infection. Based on these findings, the authors propose that usp-14 has a dual role in immune regulation: one in surveillance immunity, preventing overactivation of immune responses, and another as a mediator of pathogen-induced responses, such as those triggered by P. aeruginosa or S. aureus. The experiments are rigorous and the results robust; however, some points would benefit from further investigation or clarification.

      The expression domain of usp-14 appears to be quite broad based on single-cell RNAseq data (e.g. PMID: 28818938); it is therefore likely that the transgenes used for expression analysis lack key regulatory information. Alternative methods such as smFISH would be more appropriate to characterise the spatiotemporal pattern of usp-14 expression in more detail.

      The mutation mapped in usp-14(jsn19) is a missense mutation (E122K) that suppresses the immune response to a degree comparable to the usp-14(tm1481) deletion allele. However, the authors do not show the functional domains in Fig. 1E potentially affected by this missense mutation.

      The claims about how USP-14 regulates Wnt and how Wnt signalling relates to the activation of immune responses are not fully supported. Are the Wnt components mentioned in the study induced specifically in the intestine upon infection, and does USP-14 act in the intestine in the context of this regulation? How do the authors interpret the fact that both Wnt ligands and receptors are induced? Does Wnt signalling appear as a GO term in the transcriptomic analysis? The authors could include Wnt signalling components in the analysis of the transcriptomic results.

      In most of the figures, the micrographs are quite dark and exhibit poor contrast between signal and background, particularly in Fig. 1, panels B and J, and Fig. 2, panels B and F (upper rows). Even though these panels are intended to show absence of response, the outlines of the worms are difficult to discern.

      In Figure S3, panels A and B, the pmk-1(km25); usp-14(tm1481) animals subjected to aex-5 RNAi show some level of fluorescence/response induction comparable to pmk-1(km25) alone. This observation is not discussed in the text.

      Significance

      The work is interesting because it expands some previous work in the field demonstrating immune response induction as a consequence of intestinal distension even in the absence of bacterial infection. This is known to be mediated by the neuronal acetylcholine receptor ACC-4, which signals to the intestine where it regulates immune genes via the Wnt pathway. However, how USP-14 relates to ACC-4 is currently unclear and whether USP-14 function is really required in the intestine to control Wnt signalling is not demonstrated. The authors should include a model to describe how their findings relate to the previous literature and how USP-14 may link mechanistically to Wnt signalling pathway activation.

      It also remains unclear whether usp-14 is the only deubiquitinase involved in intestinal distension-induced signalling via the Wnt pathway, or whether paralogous usp genes might also contribute to the regulation of immune-responsive transcription. Notably, several mammalian deubiquitinases have established roles in cancer suppression, the inflammatory response, and innate immunity in other systems, so addressing this would increase the potential significance of the work.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In the manuscript titled "Multiple Molecular Pathways to Longevity: Opposing Gene Expression Programs Define Distinct Aging Strategies", the authors investigated diverse genetic pathways that contribute to lifespan extension in Caenorhabditis elegans and aimed to identify shared and distinct molecular mechanisms among various longevity mutants. Through comprehensive RNA sequencing of different longevity mutants representing seven distinct pathways, the authors showed that these mutants cluster into three primary groups based on their gene expression profiles. This transcriptomic analysis revealed that while some longevity genes are commonly regulated across multiple pathways, others exhibit opposing expression patterns, suggesting that distinct molecular strategies can lead to increased lifespan. Specifically, they identified a set of 196 genes that are consistently upregulated in most longevity mutants, many of which are involved in innate immunity and stress defense. By performing RNAi-based screening, the authors further validated the functional roles of several candidates, including C08F11.7, ugt-62, and K05C4.9, supporting their contributions to longevity and stress resistance. The authors conclude that longevity is mediated through multiple molecular pathways and provide a public online tool to study these complex transcriptomic landscapes.

      Major comments

      1. While the authors identified a set of 196 upregulated genes, the rationale for narrowing these down to the three final candidates (C08F11.7, ugt-62, and K05C4.9) is not clearly described. The authors show that genetic inhibition of several genes, including DC2.5, C05B5.5, T07C4.5, and W03B1.7, decreases lifespan in both nuo-6 mutants and wild-type animals. However, the authors did not describe why these additional validated candidates, which also showed significant effects on longevity, were not pursued for further characterization. The authors should explicitly state the criteria used to prioritize these three genes over the other validated genes.
      2. The authors conclude that longevity can be mediated by multiple molecular pathways. However, it remains unclear whether these distinct strategies can operate simultaneously or are mutually exclusive. The authors need to test whether lifespan extension in a Group 1 mutant is further enhanced or suppressed by the knockdown of key Group 2-specific genes. These experiments would help determine whether these pathways act additively, antagonistically, or as partially redundant survival programs.
      3. The authors provide interesting data on overexpression of the three candidate genes. However, whereas C08F11.7 clearly demonstrates both necessity and sufficiency for lifespan extension, overexpression of ugt-62 and K05C4.9 does not independently extend lifespan. To strengthen the manuscript, the authors should expand the discussion of these divergent results and clarify possible explanations.
      4. Key citations are missing, and the authors should add several, including the following. Please cite the following paper and discuss the authors' findings with respect to the related work (Lee et al., PMID: 40814218). Add citations in the sentence describing changes in the transcriptome of C. elegans associated with age (Lee et al., PMID: 38508494). Furthermore, please cite papers providing overviews of survival assays using C. elegans (Kwon et al., PMID: 40436148; Hwang et al., PMID: 40436147).

      Minor comments

      1. To improve readability, please provide the full names for all abbreviations at their first appearance in the manuscript.
      2. Please ensure that the labels in the figures match the text exactly. For instance, if different promoters are used for generating overexpression animals, it may be helpful to indicate the specific promoter in the figure panel or legend for clarity.
      3. For all lifespan and stress resistance assays, please include the total number of animals (n) and the number of independent biological replicates (N) in the figure legends to confirm statistical reliability.
      4. Please clearly specify the exact developmental stage of the animals used for the survival assays in the Materials and Methods section.

      Referees cross-commenting

      I also agree with reviewer #1's comments and recommend revision to further improve the manuscript.

      Significance

      This study provides a systematic, side-by-side transcriptomic comparison of nine genetically distinct long-lived C. elegans mutants, revealing that lifespan extension arises from both shared and opposing gene expression programs. By identifying three distinct longevity groups and demonstrating that key pathways can be modulated in opposite directions to achieve long life, the work challenges the notion of a single universal transcriptional signature of aging. Importantly, functional validation shows that select commonly regulated genes can directly modulate lifespan and stress resistance, highlighting actionable molecular targets for promoting healthy aging.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This manuscript by Rudich ZD et al. systematically profiled the transcriptomic changes in nine long-lived C. elegans mutants and presented a careful and informative comparative analysis of these aging-related changes. In addition to these valuable datasets and bioinformatics analyses, the authors performed a large-scale RNAi screen to assess the role of the differentially expressed genes (DEGs) in these mutants and identify several potential targets to promote healthy aging. Moreover, the authors have provided a user-friendly website to examine genes of interest in those longevity mutants from their datasets.

      Major comments

      The conclusions of this manuscript are generally well supported. The study is also technically sound. Yet, I still have a few concerns that should be carefully addressed.

      1. Although I myself believe that the datasets in this study should be more consistent and comprehensive, the authors should perform a data mining analysis of previously reported transcriptomic changes of these mutants or similar mutants in the same longevity pathway and compare the reported changes with their findings to highlight the necessity and advances of this study.
      2. This manuscript does not perform any regulon or transcription factor (TF) analyses. TFs are the drivers of the transcriptomic changes and multiple conserved TFs (e.g., daf-16) have already been identified in these pathways. Therefore, it is necessary to examine and compare the regulons/TFs in these new datasets by bioinformatics. Such analyses can: a) provide more information of the driving force of these transcriptomic changes; b) show the role of these known longevity TFs; c) propose new TFs driving longevity; d) support the findings of 'longevity strategies' and 'longevity groups' from the perspective of TFs.
      3. osm-5 and daf-2 are categorized into two different groups in this study. Since the longevity of cilia (-) mutants is through daf-16, the same master TF driving daf-2 longevity, please perform further analyses or discussion to clarify this issue.
      4. This manuscript focused on genes whose RNAi suppressed the mutants' longevity. Please also use bioinformatics to analyze the functions of those whose RNAi extends the mutants' longevity, because these genes could reveal the health price these mutants pay and help improve ageing interventions by reducing side effects.
      5. (OPTIONAL) I strongly suggest a comprehensive comparison of these transcriptomic changes in long-lived mutants with published age-related transcriptomic changes in wild type worms. This comparison will significantly All the suggested analyses are pure bioinformatics and should be realistic to finish in several months.

      Minor comments

      I also have a few minor comments on data presentation:

      1. Please further clarify the analysis of DEGs correlated with lifespan extension in Fig. 2 by a depiction. In Fig. 2C and D, please label data dots from different strains with different colors.
      2. In Fig. 3 and S20, please label the percentage of overlapping genes on top of each bar.

      Referees cross-commenting

      I agree with Reviewer #2's comments and would suggest giving the authors enough time to revise their manuscript.

      Significance

      Compared to previous transcriptomic analyses of these mutants in different reports, this study minimized the technical variations and benefitted from the advances in RNA-Seq technology and bioinformatics tools. Therefore, it should provide a more consistent and comprehensive view of the molecular mechanisms underlying the longevity of these mutants. The datasets in this manuscript are valuable to other researchers in the biology of aging.

      Meanwhile, since these mutants have been extensively studied, the advance of this study in unknown mechanisms remains limited. Therefore, I would recommend its publication as a 'Resource' article after addressing my concerns.

      (I am an expert in the biology of ageing, using C. elegans and mouse as major models.)

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their constructive evaluation of our manuscript. We are pleased by the overwhelmingly positive consensus regarding the quality and significance of our data. In particular, the reviewers highlighted that this is a "nice, clean study with interesting data" and noted that our in vivo functional genetic findings in the Drosophila wing are "clearly a strength" that "moves the paper beyond cell-culture correlations" to provide a "simple, straightforward take-home message".

      The principal critique across the reports concerns the extent of direct mechanistic evidence linking Groucho (Gro) to regulation of the early elongation checkpoint. Several reviewers suggested additional genomic experiments, including RNA-seq, PRO-seq, or Pol II ChIP approaches, to further examine transcription and pausing behaviour. However, we would like to point out that genomic datasets addressing these questions across multiple Drosophila cell lines have already been published, including work from our own group and others.

      The primary objective of the current study is therefore not to replicate these existing genomic analyses, but rather to build directly upon them. We identify a consistent genomic association between Gro and pausing/elongation factors across cell types. Importantly, we extend these findings beyond genomic correlations through in vivo genetic analysis in the developing Drosophila wing.

      1. Description of the planned revisions

      Reviewer 1

      The figures and text could lay out the logic of the genetic interactions for non-Drosophila readers. For example, the comparison of single and double copies of Gro-RNAi to combinatorial knockdowns, when it is additive, and when it is interpreted as synergistic.

      The statistical analyses presented in Figure 5C, including Fisher’s exact tests comparing phenotype distributions between genotypes, were intended to address the distinction between additive and synergistic genetic interactions. However, we agree that the presentation of these comparisons could potentially be made clearer for readers less familiar with Drosophila genetic interaction assays. We would therefore be open to revising the presentation of Figure 5 and the accompanying explanatory text following editorial guidance and with consideration of the intended readership of the eventual journal.

      The statistical analysis of the phenotype distributions should be shown more clearly (Fig. 5B).

      Figure 5B is intended to present the distribution of observed phenotypic classes and does not include statistical comparisons. A similar analysis has been published for experiments examining the phenotypes of moderate Groucho overexpression in the wing in the presence of HDAC inhibitors (Winkler et al., 2010, doi:10.1371/journal.pone.0010166). Statistical analyses of the genetic interaction experiments are presented separately in Figure 5C. We therefore believe the current presentation of Figure 5B is appropriate for illustrating phenotype frequencies rather than statistical inference, but we will consider moving this panel to the Supplementary material.

      Minor comments

      - Figure 5 would gain clarity if the phenotype classes/panel letters were shown more clearly on the images.
      - The legends of the wing figures should be expanded, especially for readers outside the Drosophila field.
      - "in vivo" should be italicised consistently.

      We agree that clearer labelling of phenotype classes, panel annotations and expanded figure legends could improve the accessibility of Figure 5, particularly for readers less familiar with Drosophila wing phenotypes and genetic interaction assays. We would therefore be open to revising the presentation of this figure and its accompanying legends in a future revised version.

      We thank the reviewer for noting the typographical inconsistency of italics for in vivo. This will be corrected during manuscript revision and proofing.

      Reviewer #2

      Reviewer #2 (Significance (Required)):

      I think this is nice little paper providing a simple, straightforward take-home message. It does not conceptually shake the world, and the evidence consists of (nice) correlations, with no direct proof put forward for the conclusions. I am not a Drosophila geneticist but probably rather an 'expert' on basic transcription mechanisms. I think the data in the paper are of high quality, if limited in scope, and that the conclusions are supported by the results, but I do not think the results or conclusions will have a big audience. Having said that, I found it interesting to learn about this group of repressors and their likely mode of action.

      On the other hand, it is worth emphasizing that proteins such as NELF and CDK9 would arguably be expected to be found at very many genes, as promoter-proximal pausing does exist at a plethora of genes, including house-keeping genes, i.e. genes not regulated by cell type or stimuli. So, lots of genes with pausing are not regulated by modulation of pausing. So, basically, the fact that knockdown of the repressor Groucho and loss of pausing is additive does not in my opinion necessarily mean that Groucho works by stabilizing pausing. Although it is admittedly a reasonable assumption, Groucho could also work by repressing transcription initiation; the genetic outcomes of 'double relief' would be the same, i.e. higher transcription levels. I think a brief comment to this effect might be appropriate, especially in the absence of (difficult to obtain) direct evidence that the transcription initiation step is not affected by Groucho.

      While we agree that the current study does not directly exclude possible effects of Groucho on transcription initiation, previously published work has already provided evidence arguing against repression by Groucho occurring primarily through inhibition of transcription initiation or prevention of pre-initiation complex assembly. Groucho-bound transcriptional start sites were previously shown to retain RNAP II occupancy, active chromatin features, and detectable basal transcriptional activity despite repression (Kaul et al., 2014).

      To acknowledge this possibility and explain why it is unlikely, we will add the sentence “While effects on transcription initiation cannot be completely excluded, previous work argues against Gro repressing transcription primarily through inhibition of transcription initiation. Gro-bound promoters remain accessible, overlap RNAP II occupancy, and retain active chromatin features and basal transcriptional activity” to the start of the third paragraph of the Discussion.

      Reviewer #3

      The methods section is lacking details on how ChIP-seq was performed in the BG3 cell line. The methods section does a good job of indicating how the data were processed. Information on the antibodies and conditions used is critical, as is whether spike-in controls were used.

      The generation of the ChIP-seq data from BG3 cells has already been published. We will add the line “The production of ChIP-seq datasets for Gro binding in Kc167, S2R+ and BG3 cells has been described elsewhere (Kaul, Schuster and Jennings, 2014; Bar-Cohen et al., 2023)” in the Analysis of ChIP-seq data subsection of the Methods.

      1. Description of analyses that authors prefer not to carry out

      Reviewer #1

      Major comments

      1. The main weakness is the lack of a mechanistic link between Gro and the early elongation checkpoint. This is really the main point for this reviewer. The manuscript builds an interesting model, and the data support a functional connection between Gro and pausing-related factors, but the mechanistic link is absent. At present, the paper relies on co-localisation of ChIP peaks and genetic interaction in vivo. This is interesting and supportive, but with several possible interpretations. The title and some parts of the text are thus a bit stronger than what is directly demonstrated. Two possibilities could be proposed: either tone down the mechanistic claim or strengthen it experimentally. A more direct assay of pause release or productive elongation after Gro depletion at endogenous targets would be highly valuable. For example, Gro-KD followed by Pol II Ser2-P ChIP, or promoter vs. gene body analysis on Gro-bound genes, ideally comparing genes with Gro at TSS vs. not-TSS, would greatly support the proposed model. If the assay is established, this seems feasible in about 4 months.

      We thank the reviewer for this thoughtful comment. We agree that the current study does not directly measure genome-wide RNAP II pause release following Gro depletion. However, several key observations linking Gro with promoter-proximal pausing have already been published and are summarised in the Introduction. Previous work demonstrated that Gro occupancy correlates with paused genes and that depletion of Gro reduces RNAP II pausing and increases elongating RNAP II at the endogenous E(spl)mbeta-HLH locus, an established target gene of Groucho-mediated repression (Kaul et al., 2014; doi.10.1371/journal.pgen.1004595). We also note that several of the experiments proposed by the reviewer have already been addressed in previous work. Specifically, Kaul et al. (2014) demonstrated that Gro depletion increases elongating RNAP II (Ser2-P) at the endogenous E(spl)mbeta-HLH locus while total promoter-associated RNAP II occupancy remains largely unchanged. Promoter versus gene body analyses in that study further supported a role for Gro in regulating progression through the early elongation checkpoint rather than transcription initiation.

      The aim of the current manuscript was therefore to build upon these earlier mechanistic and genomic observations by asking whether the relationship between Gro and pausing-associated factors extends across multiple cell types and whether it has functional significance in vivo. By integrating comparative genomic analyses with sensitised developmental genetic assays in the wing, we provide evidence that Gro functionally interacts with multiple regulators of the early elongation checkpoint during development.

      2. The bioinformatic part could be strengthened on "distinct TF repertoires" between cell types. The authors interpret the cell type-specific Gro recruitment as reflecting distinct transcription factor repertoires in BG3, Kc167 and S2R+ cells. This is interesting, but not really shown. To make this point more strongly, the author could provide a map of TF expression across different cell types, especially for the TFs corresponding to the enriched motifs they discuss. Otherwise, this remains speculative. In line, the manuscript discusses enriched motifs in BG3 and compares them to Kc167 and S2R+ cells, but this remains a bit descriptive. A clearer side-by-side comparison would strengthen the paper. This is particularly relevant to the motifs used in interpreting cell type-specific recruitment.


      The interpretation that cell type-specific Gro recruitment reflects differences in transcription factor repertoires is based on several previously established observations already described in the manuscript. BG3 cells are derived from the larval CNS, whereas Kc167 and S2R+ cells are embryonic haemocyte-like lines (Cherbas et al., 2011; doi.10.1101/gr.112961.110). Transcriptomic analyses have further shown that these Drosophila cell lines maintain stable and distinct lineage-associated transcriptional identities, including differences in transcription factor expression (Cherbas et al., 2011). Given the diversity of transcription factors known to recruit Gro, the observed cell-type-specific binding patterns and motif enrichments are consistent with the distinct lineage-associated transcriptional programmes previously described for these cell lines.

      3. Several overlap analyses could be discussed more in depth. A few statements feel too strong for the actual percentages. For example, the GAF overlap in BG3 is around 51% genome-wide and 56% at TSS, which is meaningful, but not especially high. The text already states that it is not universal, and this point could be discussed more clearly.

      We note that the manuscript already explicitly states that overlap between Gro and GAF is not universal. Given the diversity of factors known to recruit Gro and the broad genomic distribution of GAF, we consider overlap frequencies of approximately 50% to represent a substantial association, particularly at transcription start sites. Importantly, the interpretation does not rely on complete co-occupancy between these factors, but rather on the observation that Gro-bound regions show significant enrichment for multiple factors associated with promoter-proximal pausing across different cell types.

      Similarly, for the UpSet plot, the wording around the "most frequent" combination could be toned down, because this is not a dominant pattern.

      The statement that the overlap between Gro, Nelf-E, GAF, Cdk9 and RNAP II represents the “most frequent” combination refers specifically to the relative frequency of the intersection categories within the UpSet analysis. In this context, the overlap between all five factors represents the largest intersection category identified (306 of 649 Gro peaks), with the next most frequent category containing substantially fewer peaks (90 of 649). We therefore feel that the current wording accurately describes the distribution observed in the analysis.
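
      As a rough aid to reading these counts, the share of Gro peaks in each of the two quoted intersection categories can be worked out directly. The sketch below uses only the numbers given in the response above; the dictionary labels and the helper name `share` are illustrative and do not come from the manuscript.

```python
# Counts quoted in the response above; the labels are illustrative only.
total_gro_peaks = 649
intersections = {
    "Gro + Nelf-E + GAF + Cdk9 + RNAP II": 306,
    "next most frequent category": 90,
}


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of all Gro peaks that falls into each intersection category.
shares = {label: share(n, total_gro_peaks) for label, n in intersections.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}% of Gro peaks")
```

      On these figures the five-factor intersection is the largest single category while still accounting for just under half of all Gro peaks.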

      More generally, I think the manuscript needs a clearer quantitative breakdown of TSS versus non-TSS peaks for the overlap analyses with NELF, GAF, Cdk9 and CycT. Several interpretations depend on this distinction, and right now, this is not always clear enough.

      The overlap analyses presented in Figure 3 explicitly distinguish between TSS and non-TSS peaks, and the corresponding quantitative overlap frequencies are described in the Results section. We do not consider that additional breakdowns are required for interpretation of the current data as this distinction is already incorporated into both the analyses and figure presentation.


      4. The "enhancer chromatin" interpretation is interesting, but not fully integrated with the genomic distribution. The observation that Gro is enriched in open enhancer-type chromatin is interesting and supports the idea that Gro does not act mainly through classical repressed chromatin. However, Gro peaks are also enriched at promoters and introns, and this reviewer feels that the manuscript does not fully connect these observations. Where are these enhancer-type peaks located exactly? Are they often intronic? Can this be correlated with the distribution of Gro peaks? This would help the reader and also strengthen the discussion because intronic Gro peaks are present in the data, but are not well integrated into the model.

      In the current manuscript, “enhancer chromatin” refers to chromatin states defined by combinations of enhancer-associated histone modifications, including H3K4me1, H3K27ac and H3K56ac as defined by Skalska et al., 2015 (doi.10.15252/embj.201489923), rather than exclusively to distal intergenic regulatory regions. As described in the chromatin-state analysis, these enhancer-associated chromatin signatures do occur at intronic regulatory regions, including regions classified as active intron chromatin. We therefore do not consider the enrichment of Gro peaks at promoters, enhancers and intronic regions to be mutually exclusive observations within this framework.

      Intronic enhancer localisation is common in Drosophila, where the compact organisation of the genome results in many developmental regulatory elements residing within introns (Arnold et al., 2013; doi.10.1126/science.1232542). We therefore consider the presence of Gro peaks within intronic regions to be fully consistent with the observed enrichment of Gro binding within enhancer-associated chromatin states.

      5. The in vivo part is a strength, but some important points need clarification. The in vivo section is a clear highlight of the manuscript. It gives functional relevance to the model and moves the paper beyond cell-culture correlations. That said, a few points need to be clearer:
        • RNAi efficiency is not clear for the tested genes, especially the pausing factors. This is important because the differential effects between NELF subunits could simply reflect differences in knockdown efficiency.

      While differences in RNAi efficiency could potentially contribute to variation in phenotype strength between individual knockdowns, multiple biological explanations could also account for the differing effects observed between NELF subunits, including differences in protein stability, residual complex activity, or subunit-specific functions. Importantly, the central conclusion of the manuscript does not depend on quantitative comparison of phenotype strength between individual NELF components, but rather on the observation that perturbation of multiple pausing-associated factors genetically interacts with Gro in vivo.

      If RNAi validation is possible with existing reagents, this seems realistic within 3 months.

      The manuscript focuses on the genetic interactions observed between Gro and pausing-associated factors in vivo rather than on quantitative comparison between individual RNAi lines. As no specific validation experiments were proposed, we are not currently planning additional RNAi validation analyses for the present study.

      6. The discussion could be expanded, especially because the mechanism is not fully shown. Since the direct mechanism is still missing, the discussion could compensate. Right now, the proposed model is interesting, but it still leaves many open questions. For example:
        • Is Gro affecting the recruitment or activity of elongation factors?
        • Could looping or enhancer-promoter communication contribute?
        • How should the intronic Gro peaks be interpreted in the model?
        • In the wing, could the phenotype be discussed more mechanistically, in light of what is already known about Gro and derepression of vein-promoting genes? For example, a model figure could help here.


      We thank the reviewer for these thoughtful suggestions.

      Several of the points raised by the reviewer are discussed in the manuscript already. For example, we discuss the possibility that Gro influences the activity or recruitment of elongation-associated factors. We agree that enhancer-promoter communication and chromatin looping are a plausible component of this mechanism. As the Drosophila genome is compact and intronic enhancers are highly prevalent, topological looping provides a clear physical framework for how Gro molecules distributed at non-TSS sites regulate promoter-proximal machinery. Indeed, we have previously published this model (Kaul, Schuster, and Jennings, 2015; see Figure 1C; doi.10.1080/21541264.2014.1000709). Our current in vivo and genomic findings build directly upon this model, suggesting that within these established looped configurations, Gro acts locally to interface with and stabilize the pausing machinery.

      With respect to the wing phenotypes, the Discussion focuses primarily on the interpretation of the observed genetic interactions between Gro and pausing-associated factors rather than on defining the precise downstream target genes contributing to vein phenotypes. We agree that additional mechanistic dissection of these developmental phenotypes would be interesting. However, this would require a substantial expansion of the study into the detailed developmental and signalling mechanisms underlying vein specification, which lies beyond the primary focus of the current manuscript.

      OPTIONAL: It would be interesting to know whether the same peak distribution / functional logic is observed in mammalian TLE orthologs. This is not essential for the current conclusions, but it would broaden the impact.

      Determining whether similar genomic distributions and functional relationships are conserved for mammalian TLE orthologues will be an important future project. However, relatively little comparable genome-wide TLE occupancy data are currently available, meaning that such analyses would require a substantial independent undertaking beyond the scope of the present study.

      Minor comments

      • Please explain why promoters were defined as ±250 bp from the TSS. This seems rather narrow.

      Promoters were defined as ±250 bp from annotated transcription start sites. This window size is commonly used in Drosophila genomic studies, where the compact organisation of the genome means that broader windows frequently overlap adjacent genes.
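
      To make the window definition concrete, a minimal sketch of how a peak might be classed as TSS or non-TSS is given below. It assumes that any overlap of at least 1 bp with the ±250 bp window counts as a TSS peak; the overlap criterion actually used in the manuscript is not stated here, and the function names and coordinates are hypothetical.

```python
from typing import Tuple

PROMOTER_HALF_WIDTH = 250  # bp on either side of the TSS, as stated above


def promoter_window(tss: int, half_width: int = PROMOTER_HALF_WIDTH) -> Tuple[int, int]:
    """Return the inclusive (start, end) promoter interval around a TSS."""
    return max(0, tss - half_width), tss + half_width


def is_tss_peak(peak_start: int, peak_end: int, tss: int,
                half_width: int = PROMOTER_HALF_WIDTH) -> bool:
    """Assumed criterion: the peak overlaps the promoter window by at least 1 bp."""
    start, end = promoter_window(tss, half_width)
    return peak_start <= end and peak_end >= start


# Hypothetical coordinates, for illustration only.
tss = 10_000
print(promoter_window(tss))            # window around the TSS
print(is_tss_peak(10_100, 10_400, tss))  # peak reaching into the window
print(is_tss_peak(10_300, 10_600, tss))  # peak lying entirely outside it
```

      Widening `half_width` in this sketch makes more peaks count as TSS peaks, which is the trade-off against overlapping adjacent genes described in the response above.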

      • Please clarify why S2R+ cells are included in the comparative part but are not followed in the same way in some downstream analyses.

      S2R+ cells were included in the comparative analyses to determine which aspects of Gro recruitment were shared across multiple cell types and which were cell-type specific. Some downstream analyses focused on BG3 and Kc167 cells because these lines had the most extensive corresponding datasets available for the chromatin and pausing-factor analyses performed in the current study.

      Reviewer #3

      Here Martínez Quiles and Jennings investigate the role of the Groucho repressor in BG3 cells. This extends a previous study that used S2R+ cells, published previously by one of the authors, as well as Kc167 cells. They find that Gro is recruited to gene promoters in a cell-type-specific manner. Gro associates with open chromatin, is mostly associated with enhancer regions, and is primarily excluded from regions of the genome that are repressed by Polycomb. After studying its function in cell culture, the authors investigate the role of Gro in a wing-specific background. The findings here are mostly correlative, showing that loss of Gro results in stronger phenotypic defects when combined with loss of factors including NELF-B or NELF-D, LARP7, and bin3. They propose that Gro acts to attenuate gene expression during early gene expression. This claim would be greatly strengthened if the authors provided RNA-seq data in addition to the ChIP-seq data shown in this manuscript, especially to examine gene expression patterns among the different cell lines studied here. At present, this is a correlative study that does not illuminate the mechanism of Gro in directly regulating promoter-proximal pausing or RNA polymerase behavior.

      We thank the reviewer for this suggestion. However, extensive transcriptomic analyses of Drosophila cell lines, including Kc167, S2R+ and BG3-derived lines, have already been published (Cherbas et al., 2011), together with RNA-seq analyses following Gro depletion (Kaul et al., 2014). In addition, the association between Gro occupancy and paused genes has also been reported previously (Kaul et al., 2014; Chambers et al., 2017; doi.10.1186/s12864-017-3589-6).

      While additional RNA-seq analyses could further characterise transcriptional differences between cell lines, RNA-seq alone would not directly determine whether altered transcript levels arise specifically through changes in promoter-proximal pausing, as opposed to effects on transcription initiation, transcript stability, or indirect downstream regulatory effects. We therefore do not consider additional RNA-seq analyses necessary to support the central conclusions of the present study.

      Figure 2-3: For the ChIP-seq data, scale the y-axis in the same manner to better understand enrichment between the samples.

      Because these ChIP-seq datasets were generated independently using different antibodies and experimental conditions, direct comparison of enrichment magnitudes across datasets would not be biologically meaningful. Accordingly, our analyses focus on significant peak calls and overlap relationships rather than relative signal intensity. Applying identical y-axis scaling across all tracks would obscure significant enrichment in several datasets and could therefore be misleading.

      RNA-seq data from the different cell lines, or PRO-seq data, would greatly enhance the authors' findings by showing a relationship between Gro binding and promoter-proximal pausing.

      We note that RNA-seq datasets for Gro depletion in Kc167 and S2R+ cells have already been published previously (Kaul et al., 2014), together with evidence linking Gro occupancy to paused genes (Kaul et al., 2014; Chambers et al., 2017). We therefore do not consider that additional RNA-seq analysis would substantially strengthen the central conclusions of the current manuscript.

      Moreover, RNA-seq alone cannot distinguish whether altered transcript abundance reflects changes in promoter-proximal pausing or other mechanisms that influence transcript abundance. While PRO-seq approaches could provide further mechanistic information regarding RNAPII dynamics, such experiments are beyond the scope of the present study.

      This study helps to further clarify how Gro binds DNA in different cell types and indicates that it may intersect with factors involved in promoter-proximal pausing. The study is highly correlative and would require additional work to show a mechanistic link between Gro and transcription attenuation due to promoter-proximal pausing.

      While we agree that PRO-seq approaches could provide additional mechanistic information regarding RNAPII dynamics, establishing an appropriate experimental and analytical framework for these analyses would require a substantial extension beyond the scope of the present study. In addition, several aspects of the relationship between Gro occupancy, transcriptional repression, and promoter-proximal pausing that underpin these suggestions have already been addressed in previously published work, including RNA-seq analyses following Gro depletion (Kaul et al., 2014), evidence linking Gro occupancy with paused genes (Kaul et al., 2014; Chambers et al., 2017), and studies demonstrating that Gro-mediated repression does not occur through inhibition of pre-initiation complex assembly. The current manuscript is therefore intended to build upon these existing findings by integrating comparative genomic analyses with new in vivo genetic interaction data.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Here Martínez Quiles and Jennings investigate the role of the Groucho repressor in BG3 cells. This extends a previous study that used S2R+ cells, published previously by one of the authors, as well as Kc167 cells. They find that Gro is recruited to gene promoters in a cell-type-specific manner. Gro associates with open chromatin, is mostly associated with enhancer regions, and is primarily excluded from regions of the genome that are repressed by Polycomb. After studying its function in cell culture, the authors investigate the role of Gro in a wing-specific background. The findings here are mostly correlative, showing that loss of Gro results in stronger phenotypic defects when combined with loss of factors including NELF-B or NELF-D, LARP7, and bin3. They propose that Gro acts to attenuate gene expression during early gene expression. This claim would be greatly strengthened if the authors provided RNA-seq data in addition to the ChIP-seq data shown in this manuscript, especially to examine gene expression patterns among the different cell lines studied here. At present, this is a correlative study that does not illuminate the mechanism of Gro in directly regulating promoter-proximal pausing or RNA polymerase behavior.

      Major comments:

      Figure 2-3: For the ChIP-seq data, scale the y-axis in the same manner to better understand enrichment between the samples.

      The methods section is lacking details on how ChIP-seq was performed in the BG3 cell line. The methods section does a good job of indicating how the data were processed. Information on the antibodies and conditions used is critical, as is whether spike-in controls were used.

      RNA-seq data from the different cell lines, or PRO-seq data, would greatly enhance the authors' findings by showing a relationship between Gro binding and promoter-proximal pausing.

      Significance

      This study helps to further clarify how Gro binds DNA in different cell types and indicates that it may intersect with factors involved in promoter-proximal pausing. The study is highly correlative and would require additional work to show a mechanistic link between Gro and transcription attenuation due to promoter-proximal pausing.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This paper describes experiments designed to determine the mechanism of repression by the Groucho co-repressor in flies. The authors first characterize DNA binding by Groucho by ChIP-Seq analysis. This turns out to be consistent with recruitment driven by cell-type specific transcription factors. Nevertheless, its distribution across genomic features is similar across cell types, with enrichment in promoters and introns. It appears to bind in regions otherwise transcriptionally active (ie 'open chromatin'), rather than chromatin that is compacted and repressed. This suggests that Groucho regulates transcription through promoters or promoter-proximal pausing rather than by reducing chromatin accessibility. Groucho binding overlaps with NELF and GAF binding, seemingly consistent with a role in regulating pausing. However, Gro binding was also observed at promoters where P-TEFb components are detected, arguing against Gro repressing transcription through P-TEFb exclusion from pausing sites. The authors next switched to investigating the consequences of Groucho kd and tested the idea that co-depletion of pausing factors might inform about the manner of gene repression, the idea being that if Groucho attenuates transcription by promoting or stabilizing promoter proximal pausing, then partial reduction of the pausing factors it affects should enhance the Groucho knock-down phenotype. Interestingly, simultaneous knock-down of Groucho and GAF resulted in enhanced patterning defects relative to Groucho knock-down alone, with the severity of the phenotypes resembling that observed upon increasing Groucho knock-down. Similarly, the knock-down of either Nelf-B or Nelf-D significantly enhanced the Groucho phenotype. Finally, kd of regulators of the pausing regulator CDK9 was tested. The 7SK snRNA complex inhibits CDK9, so any treatment leading to less 7SK will free CDK9 to positively affect pausing release. Larp kd fits that category as it directly leads to less 7SK and thus more CDK9 activity, while Bin3 kd results in less 5'-methyl capping, and thus more 7SK destabilization (less 7SK), again freeing CDK9 from inhibition - so, increasing pause release (like Nelf kd). Gratifyingly, this separate way of de-regulating/decreasing pausing again had an additive effect to Groucho depletion. Together, these genetic data thus overall support the idea that the (non-chromatin regulating) repressor Groucho works by stabilizing pausing complexes at specific genes.

      Significance

      I think this is nice little paper providing a simple, straightforward take-home message. It does not conceptually shake the world, and the evidence consists of (nice) correlations, with no direct proof put forward for the conclusions. I am not a Drosophila geneticist but probably rather an 'expert' on basic transcription mechanisms. I think the data in the paper are of high quality, if limited in scope, and that the conclusions are supported by the results, but I do not think the results or conclusions will have a big audience. Having said that, I found it interesting to learn about this group of repressors and their likely mode of action.

      On the other hand, it is worth emphasizing that proteins such as NELF and CDK9 would arguably be expected to be found at very many genes, as promoter-proximal pausing does exist at a plethora of genes, also genes that are house-keeping genes, ie not regulated by cell type or stimuli. So, lots of genes with pausing are not regulated by modulation of pausing. So, basically, the fact that knockdown of the repressor Groucho and loss of pausing is additive does not in my opinion necessarily mean that Groucho works by stabilizing pausing. Although it is admittedly a reasonable assumption, Groucho could also work by repressing transcription initiation; the genetic outcomes of 'double relief' would be the same, ie higher transcription levels. I think a brief comment to this effect might be appropriate, especially in the absence of (difficult to obtain) direct evidence that the transcription initiation step is not affected by Groucho.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      In this manuscript entitled "the co-repressor Groucho limits progression through the early transcription elongation checkpoint in vivo", the authors study how the co-repressor Groucho (Gro) may repress transcription in Drosophila. They combine Gro ChIP-seq analysis in BG3 cells with published data from Kc167 and S2R+ cells, chromatin-state and overlap analyses with pausing/elongation factors, and functionally link these interactions in vivo by genetic interaction assays in the wing. The manuscript shows that Gro recruitment is largely cell type-specific, while Gro binding is detected as discrete peaks with similar genomic distribution across cell types. Gro peaks are enriched in open enhancer-type chromatin and overlap with factors linked to promoter-proximal pausing. In vivo, knock-down (KD) of several pausing-related factors enhances the gro RNAi phenotype in the wing. Overall, this is a nice, clean study with interesting data, and the in vivo findings are clearly a strength. However, the mechanistic link between Gro and the early elongation checkpoint remains unclear, and several bioinformatics and presentation points could be strengthened.

      Major comments

      1. The main weakness is the lack of a mechanistic link between Gro and the early elongation checkpoint. This is really the main point for this reviewer. The manuscript builds an interesting model, and the data support a functional connection between Gro and pausing-related factors, but the mechanistic link is absent. At present, the paper relies on co-localisation of ChIP peaks and genetic interaction in vivo. This is interesting and supportive, but with several possible interpretations. The title and some parts of the text are thus a bit stronger than what is directly demonstrated. Two possibilities could be proposed: either tone down the mechanistic claim or strengthen it experimentally. A more direct assay of pause release or productive elongation after Gro depletion at endogenous targets would be highly valuable. For example, Gro-KD followed by Pol II Ser2-P ChIP, or promoter vs. gene body analysis on Gro-bound genes, ideally comparing genes with Gro at TSS vs. not-TSS, would greatly support the proposed model. If the assay is established, this seems feasible in about 4 months.
      2. The bioinformatic part could be strengthened on "distinct TF repertoires" between cell types. The authors interpret the cell type-specific Gro recruitment as reflecting distinct transcription factor repertoires in BG3, Kc167 and S2R+ cells. This is interesting, but not really shown. To make this point more strongly, the author could provide a map of TF expression across different cell types, especially for the TFs corresponding to the enriched motifs they discuss. Otherwise, this remains speculative. In line, the manuscript discusses enriched motifs in BG3 and compares them to Kc167 and S2R+ cells, but this remains a bit descriptive. A clearer side-by-side comparison would strengthen the paper. This is particularly relevant to the motifs used in interpreting cell type-specific recruitment.
      3. Several overlap analyses could be discussed more in depth. A few statements feel too strong for the actual percentages. For example, the GAF overlap in BG3 is around 51% genome-wide and 56% at TSS, which is meaningful, but not especially high. The text already states that it is not universal, and this point could be discussed more clearly. Similarly, for the UpSet plot, the wording around the "most frequent" combination could be toned down, because this is not a dominant pattern. More generally, I think the manuscript needs a clearer quantitative breakdown of TSS versus non-TSS peaks for the overlap analyses with NELF, GAF, Cdk9 and CycT. Several interpretations depend on this distinction, and right now, this is not always clear enough.
      4. The "enhancer chromatin" interpretation is interesting, but not fully integrated with the genomic distribution. The observation that Gro is enriched in open enhancer-type chromatin is interesting and supports the idea that Gro does not act mainly through classical repressed chromatin. However, Gro peaks are also enriched at promoters and introns, and this reviewer feels that the manuscript does not fully connect these observations. Where are these enhancer-type peaks located exactly? Are they often intronic? Can this be correlated with the distribution of Gro peaks? This would help the reader and also strengthen the discussion because intronic Gro peaks are present in the data, but are not well integrated into the model.
      5. The in vivo part is a strength, but some important points need clarification. The in vivo section is a clear highlight of the manuscript. It gives functional relevance to the model and moves the paper beyond cell-culture correlations. That said, a few points need to be clearer:
        • RNAi efficiency is not clear for the tested genes, especially the pausing factors. This is important because the differential effects between NELF subunits could simply reflect differences in knockdown efficiency.
        • The figures and text could lay out the logic of the genetic interactions for non-Drosophila readers. For example, they could explain how single and double copies of Gro-RNAi are compared to combinatorial knockdowns, and when an effect is interpreted as additive versus synergistic.
        • The statistical analysis of the phenotype distributions should be shown more clearly (Fig. 5B). If RNAi validation is possible with existing reagents, this seems realistic within 3 months.
      6. The discussion could be expanded, especially because the mechanism is not fully shown. Since the direct mechanism is still missing, the discussion could compensate. Right now, the proposed model is interesting, but it still leaves many open questions. For example:
        • Is Gro affecting the recruitment or activity of elongation factors?
        • Could looping or enhancer-promoter communication contribute?
        • How should the intronic Gro peaks be interpreted in the model?
        • In the wing, could the phenotype be discussed more mechanistically, in light of what is already known about Gro and derepression of vein-promoting genes? For example, a model figure could help here.

      OPTIONAL:

      It would be interesting to know whether the same peak distribution / functional logic is observed in mammalian TLE orthologs. This is not essential for the current conclusions, but it would broaden the impact.

      Minor comments

      • Please explain why promoters were defined as ±250 bp from the TSS. This seems rather narrow.
      • Please clarify why S2R+ cells are included in the comparative part but are not followed in the same way in some downstream analyses.
      • Figure 5 would gain clarity if the phenotype classes/panel letters were shown more clearly on the images.
      • The legends of the wing figures should be expanded, especially for readers outside the Drosophila field.
      • "in vivo" should be italicised consistently.

      Referee cross-commenting

      My main concerns are broadly echoed by Reviewer 2, notably regarding the need to clarify the level of mechanistic support for the proposed model. Reviewer 3 also raises related points about the correlative nature of the evidence. Overall, I think the reports converge on the need to better align the conclusions with the current data, while recognising the value of the functional in vivo results, though with different levels of requested additional analysis.

      Significance

      General assessment

      This is a nice paper, with clean data and an interesting model. The strongest point is the attempt to connect the Gro genomic localisation with functional interaction in a developmental context. The observation that Gro is found in open enhancer-type chromatin, together with the in vivo genetic interactions, makes the study significant. The main limitation is that the mechanistic link is still missing. Overall, this reviewer finds the study convincing as a functional and descriptive paper but less convincing as a mechanistic one.

      Advance

      The study extends previous work on Gro by comparing several cell types and by adding in vivo genetic data in the wing. The main advance is thus conceptual and functional: it supports the idea that Gro acts in concert with the pausing/elongation machinery rather than simply through repressed chromatin. However, the mechanistic advance remains limited because a direct link to the early elongation checkpoint has not yet been demonstrated. This is the main thing preventing the paper from being stronger.

      Audience

      This reviewer feels that the manuscript will mainly interest a specialised basic research audience: scientists working on transcriptional regulation, co-repressors, RNA polymerase II pausing, chromatin regulation, and Drosophila developmental genetics. It can also be relevant to those broadly interested in Gro/TLE biology.

      Expertise

      This reviewer's expertise includes gene regulation and its nuclear organisation, transcriptional, co-transcriptional and post-transcriptional regulation, transcription factor biology, and Drosophila genetics. This reviewer is comfortable evaluating the developmental genetics, the conceptual aspect, and the interpretation of genomic analyses, but has less competence in evaluating bioinformatic ChIP-seq processing pipelines.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2025-03227R

      Corresponding author(s): Dr. David Skerrett-Byrne & Prof. Brett Nixon

      1. General Statements

      We are grateful to the reviewers and editorial team for their thoughtful and constructive evaluation of our manuscript. The comments provided were insightful and have substantially strengthened the rigor, clarity, and presentation of the study. In response, we have carefully revised the manuscript throughout, including clarification of conceptual interpretations, expansion of methodological detail, refinement and condensation of the Discussion, as well as addition of new supplementary analyses and figures. Collectively, we believe these revisions have improved both the transparency and accessibility of the work while reinforcing the central conclusions of the study.

      At its core, this study sought to address a major unresolved question in reproductive biology: how spermatozoa, which are transcriptionally and translationally inert, achieve functional competence during post-testicular maturation. Using deep, stage-resolved phosphoproteomics integrated with functional validation approaches, we demonstrate that the majority of sperm phosphoproteomic remodelling occurs during epididymal maturation rather than during capacitation, challenging long-standing paradigms in the field. Beyond generating one of the deepest sperm phosphoproteomic resources currently available (>14,000 phosphosites), the study also provides functional and physiological context through kinase inhibition studies, in vivo knockout phenotypes, and the development of the ShinySpermPhospho online resource to facilitate community access and future discovery.

      Importantly, through the review process we have worked carefully to ensure that the manuscript more clearly distinguishes data-driven conclusions from hypothesis-generating interpretations, particularly in areas relating to kinase prediction, metabolic regulation, and phosphoproteomic remodelling. We believe the revised manuscript now presents a more balanced and rigorous framework while preserving the significance of the central findings.

      Overall, we hope the revised manuscript now provides a valuable resource and conceptual advance for the reproductive biology community, with implications extending from fundamental sperm cell biology to translational opportunities in male infertility and contraceptive development.

      2. Point-by-point description of the revisions

      REVIEWER #1

      The manuscript by Skerrett-Byrne and collaborators represents a comprehensive and technically sophisticated phosphoproteomics study. Using high-resolution mass spectrometry on mouse sperm obtained from the caput and cauda regions of the epididymis, both before and after capacitation, the authors generated a more complete database of phosphorylation changes in these cells. One of the most interesting outcomes is that most of these changes occur during sperm maturation, rather than sperm capacitation. The work is important and relevant, and the information obtained could be valuable for reproductive biologists working in basic science, as well as for the identification of novel contraceptive targets.

      __Answer: __We thank the reviewer for their positive assessment of our work and for recognising the value of the datasets we have generated for supporting future innovations in both fundamental reproductive biology and the identification of novel contraceptive targets. We are also delighted that the reviewer has recognised the significance of our finding that, contrary to what was previously thought, the majority of the phosphorylation changes we report occur during epididymal maturation rather than subsequently during capacitation.

      • The title should include a reference to sperm capacitation, as most of the study focuses on comparisons between epididymal maturation and capacitation, and the functional experiments are based on the latter. __Answer: __We thank the reviewer for this suggestion and have revised the title to reflect the importance of our focus on both phases of post-testicular sperm maturation, namely epididymal sperm maturation and sperm capacitation (please see line 1).

      • Considering the newly reported changes in phosphosites, it would be desirable to include validation at the individual protein level for at least a few examples, using an independent technique such as western blotting. __Answer: __We thank the reviewer for this thoughtful suggestion and fully appreciate the motivation to seek orthogonal validation of phosphoproteomic findings. However, we respectfully wish to express our reservations regarding the use of antibody-based validation of site-specific phosphorylation events, a technique that is increasingly being recognised as problematic and, in many cases, less reliable than modern MS-based approaches (Nature, PMID: 39506148). Indeed, high-resolution mass spectrometry provides direct, site-resolved identification and quantification of phosphorylation events with substantially greater specificity, accuracy, and proteoform resolution than antibody-based methods. For this reason, MS-based phosphoproteomics is now widely regarded as the gold standard for mapping phosphorylation dynamics.

      With regard to the use of antibodies, many commercially available phospho-specific antibodies lack sufficient site specificity, often have poorly defined or undocumented epitope recognition, and frequently fail to discriminate between closely related proteoforms or neighbouring phosphorylation sites. Indeed, recent large-scale evaluations have demonstrated that many widely used antibodies do not reliably bind their intended targets, raising concerns about reproducibility and interpretability across the biomedical sciences (PMID: 37995198). In one study, testing the utility of >600 antibodies, two thirds failed to work as described (PMID: 37995198), while the literature also features other studies (e.g. PMID: 31612854) reporting that certain antibodies (SC-138763) do not bind their stated target despite having been "used in 15 published manuscripts to ascribe specific properties to the protein in normal and disease states", collectively cited >3,000 times.

      Accordingly, while we recognise the importance of independent validation, we contend that antibody-based validation may not be the most appropriate strategy to improve the robustness of the conclusions in this study. It is for this reason that we elected to strengthen confidence in our findings through multiple complementary approaches, including rigorous statistical filtering, extensive in-silico pathway and kinase analyses, selective pharmacological inhibition of target proteins, and in vivo functional interrogation using knockout mouse models. Together, these orthogonal strategies provide additional biological validation linking the reported phosphorylation changes to aspects of sperm function.

      We have clarified this rationale in the revised manuscript and briefly expanded the discussion to touch on these methodological strengths and limitations (please see lines 754 - 759).

      • In the knockout models, it is not possible to distinguish between defects in spermatogenesis and those arising during maturation or capacitation. A parameter directly related to spermatogenesis should therefore be included, for example, testicular weight or histology, sperm number, and sperm morphology. __Answer: __We thank the reviewer for raising this important point. We agree that systemic knockout models do not allow definitive discrimination between defects arising during spermatogenesis versus those occurring downstream during post-testicular sperm maturation or capacitation. Unfortunately, the additional parameters suggested by the reviewer do not form part of the standardised phenotyping pipeline implemented by the International Mouse Phenotyping Consortium (IMPC) and the European Mouse Mutant Archive (EMMA). As such, these data are not available for the knockout lines examined in this study and cannot be retrospectively generated. We have therefore clarified this limitation more explicitly in the revised manuscript and have framed the knockout data as physiological validation concerning the functional relevance of the parent protein rather than as definitive evidence of stage-specific or phosphorylation-dependent mechanisms of action. Importantly, the consistency of impaired sperm motility and fertilisation outcomes across multiple independent knockout lines supports the biological importance of the parent proteins identified, while acknowledging that the precise developmental window of their action remains to be resolved. Although this is regrettably beyond the scope of the present study, we acknowledge that future studies will be required to dissect these mechanisms with greater resolution, ideally using germ cell-specific or temporally controlled knockout models, or targeted manipulation of key phosphoproteins and/or their phosphorylation motifs. Such approaches will be essential to distinguish the roles of target proteins in spermatogenesis from those that occur downstream during epididymal maturation and capacitation (please see lines 725 - 733).

      • Error values, sample size, and statistical analyses are missing from Figure 7 and should be provided for clarity. __Answer: __We apologise for this omission and have now updated Figure 7 and its legend to include sample sizes, error values, and details of the statistical analyses used, thereby improving clarity and reproducibility of these data.

      In addition, we have clarified that sperm functional data derived from EMMA knockout lines are generated from cryopreserved samples comprising pooled cauda epididymal spermatozoa collected from 10 heterozygous males per line (PMID: 17709347, 38839949). As such, each data point represents a pooled biological sample, consistent with standardised EMMA/INFRAFRONTIER protocols (PMID: 25414328, 27262858, 38839949). Where appropriate, we have also included additional reproductive metrics at the level of IVF cycle (where available) and individual litters, including average litter size and fetal sex distribution (with exceptions for specific lines where such data are not available). These details are now captured in both the Methods and relevant figure legends (please see lines 453 - 457, 1228 - 1234, 1431 - 1434, 1533 - 1536, 1572 - 1575, Figure 7 & S6).


      REVIEWER #2

      In this manuscript, the authors examine dynamic modifications of the sperm phosphoproteome during epididymal transit and capacitation. They compare three distinct populations differing in anatomical localization and activation status: caput sperm, non-capacitated cauda sperm, and capacitated cauda sperm. Using high-resolution tandem mass spectrometry, they reveal that phosphorylation changes during epididymal passage are far more extensive than previously appreciated. These findings are further validated in genetically modified animal models, where disruption of selected genes encoding phosphoproteins results in marked defects in sperm motility and fertilization capacity.

      __Answer: __We thank the reviewer for their positive and thoughtful evaluation of our study and for recognising both the depth of the phosphoproteomic dataset and the importance of the functional validation experiments, sentiments with which we wholeheartedly agree.

      • Throughout the text, and particularly in the paragraph entitled 'Epididymal maturation accounts for the majority of maturation associated sperm cell signaling,' it seems that phosphorylation is interpreted as inherently activatory and dephosphorylation as inhibitory (lines 248-252). Since this relationship is not universally applicable, it would be valuable to address this issue at the outset of the paragraph and to discuss how phosphorylation events are context dependent in their effects on protein function. __Answer: __We thank the reviewer for highlighting this important conceptual point. We fully agree that phosphorylation is not inherently activatory, nor is dephosphorylation necessarily inhibitory, and that the functional consequences of phosphorylation are highly context dependent. We have revised the indicated paragraph to explicitly acknowledge this at the outset to ensure that phosphorylation changes are interpreted as regulatory rather than intrinsically directional (please see lines 219 - 222).

      • Lines 392-393: the claim that "the introduction of each inhibitor to populations of capacitating spermatozoa led to a significant reduction..." is not fully supported by data and should be toned down. In fact, two out of three inhibitors do not significantly affect the acrosome reaction. __Answer: __We thank the reviewer for this careful assessment and agree that the original wording overstated the effects of individual inhibitors. We have revised the text to explicitly report the corresponding p-values and to distinguish between statistically significant and non-significant trends. Specifically, inhibition of PAK1 produced a statistically significant reduction in the acrosome reaction, whereas inhibition of STK33 (p = 0.0574) and HIPK4 (p = 0.0911) resulted in consistent, but non-significant, reductions. Importantly, combined inhibition of all three kinases yielded a robust and statistically significant suppression of acrosomal exocytosis. The revised wording now accurately reflects the quantitative data (please see lines 419 - 424, 700 - 703).

      • The discussion section, spanning 11 pages, is overly long and contains considerable repetition. I recommend transferring the detailed description of experiments to the 'Results' section and using the discussion primarily to synthesize and highlight the novel findings while limiting speculative content. For example, the content in lines 509-530 could be condensed and relocated to the Results. Likewise, other detailed examples would be more appropriately presented within their respective result paragraphs. __Answer: __We thank the reviewer for this constructive feedback. We agree that the Discussion was overly long and, on reflection, contained some unnecessary repetition. In response, we have substantially condensed the Discussion (shortened by 641 words), shortened and relocated detailed descriptions of experimental observations to the Results section where appropriate (including the suggestion made), and focused the revised Discussion on synthesis of the key findings and their broader implications. We should note that addressing certain review comments required further additions to the Discussion, but we have endeavoured to keep these brief (please see lines 309 - 326 and throughout the Discussion).

      • Minor points:

      • To improve reproducibility, the suppliers of all reagents should be specified together with their catalogue numbers
      • Figure 7: it is unclear which data are statistically significant
      • Figure 7B: fertilization capacity should be assessed at an earlier stage, as the cleavage rate to 2-cell embryo may be affected by factors unrelated to the sperm ability to fertilize

      __Answer: __We thank the reviewer for these suggestions. We have now added supplier information and catalogue numbers for all reagents to the Methods section to improve reproducibility (please see lines 1256 - 1257, 1295, 1299, 1310, 1333 - 1334, 1343 - 1344, 1388 - 1389, 1401, 1406). We have revised Figure 7 and its legend to clearly indicate statistically significant differences, including sample sizes and statistical tests used. Lastly, we agree that assessment at earlier fertilisation stages would complement our featured assessment of sperm fertilisation competence. Regrettably, all IVF data were generated via standardised and unbiased IMPC/EMMA pipelines. As such, cleavage rate to the 2-cell stage represents the earliest uniformly available endpoint across all knockout lines. We have clarified this limitation in the revised manuscript (please see lines 723 - 724).


      REVIEWER #3

      This is a technically sophisticated phosphoproteomic study of mouse sperm maturation across the epididymis and during capacitation. The dataset is deep (>14,000 phosphosites) and the analyses integrate high-resolution MS, immunofluorescence, IPA, kinase mapping, pharmacological inhibition, and knockout mouse models. The manuscript represents a nice resource for the field. However, several issues limit clarity, mechanistic interpretation, and robustness of the conclusions. In particular, the manuscript's scope is extremely large, making some conclusions insufficiently supported, and some analyses require better control, methodological transparency, or deeper mechanistic connection. It gives the impression that some mechanistic data were added to descriptive data in order to increase the manuscript's impact, although the current mechanistic data are not convincing.

      Major concerns

      1. Conceptual Overreach - "Epididymal maturation accounts for 86% of phosphorylation changes" The manuscript repeatedly emphasizes that epididymal maturation causes the majority of phosphoproteomic remodeling. While the data indeed show large quantitative differences, several conceptual issues remain:

        • The caput vs. cauda comparison includes differences in protein abundance, not only phosphorylation.
        • Many phosphosites lost in the cauda may reflect protein loss, not dephosphorylation (the authors acknowledge this, but quantitative controls are insufficient).
        • A description of the normalization method for phosphopeptide abundance vs total protein abundance is needed.
        • It is unclear whether the same amount of starting material and equal protein loading were used across stages. I would suggest performing (or explicitly describing) normalization using matched proteome intensities. Provide supplementary plots showing phosphosite/parent-protein normalization to avoid overinterpreting phosphosite loss as dephosphorylation.

      __Answer: __We thank the reviewer for this important and constructive critique and agree that interpretation of phosphoproteomic changes during epididymal maturation must carefully consider concurrent remodelling of the underlying sperm proteome.

      To directly address the concern that phosphosites lost in the cauda may reflect protein loss, not dephosphorylation, we have now explicitly compared the phosphoproteins lost during caput-to-cauda transit with proteins shown to be lost or reduced over the same maturation window in a previously published matched proteomic analysis of the same sperm populations. This comparison revealed that 527 phosphoproteins, out of a total of 966 phosphoproteins lost, overlapped with proteins lost during epididymal maturation, while a further 88 phosphoproteins aligned with proteins exhibiting reduced abundance during transit. While these data indicate that a subset of phosphosite loss can be attributed to complete loss of the parent protein, the remaining phosphoproteomic changes (45.4%) cannot be fully explained by protein disappearance alone and are therefore consistent with extensive phosphoproteomic remodelling. We have documented this information in a new panel of Supplementary Figure 1 (Figure S1B) and the corresponding text has been revised accordingly (please see lines 182 - 188).
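
      As a reading aid, the percentages implied by the counts quoted above can be checked with a short sketch. It assumes (this is an interpretation, not something the reply states) that the 45.4% figure is the share of the 966 lost phosphoproteins that do not overlap with the 527 completely lost parent proteins. The variable and function names are hypothetical and are not taken from the manuscript.

```python
# Counts quoted in the reply; how they map onto the 45.4% figure is an assumption.
TOTAL_LOST_PHOSPHOPROTEINS = 966   # phosphoproteins lost during caput-to-cauda transit
PARENT_PROTEIN_LOST = 527          # overlap with proteins lost during maturation
PARENT_PROTEIN_REDUCED = 88        # overlap with proteins of reduced abundance


def percent_unexplained(total: int, explained: int) -> float:
    """Percentage of lost phosphoproteins not matched to a parent-protein change."""
    return 100.0 * (total - explained) / total


# Share not accounted for by complete loss of the parent protein.
not_lost = percent_unexplained(TOTAL_LOST_PHOSPHOPROTEINS, PARENT_PROTEIN_LOST)

# Share not accounted for by either loss or reduced abundance of the parent protein.
not_lost_or_reduced = percent_unexplained(
    TOTAL_LOST_PHOSPHOPROTEINS, PARENT_PROTEIN_LOST + PARENT_PROTEIN_REDUCED
)

print(f"Not explained by complete protein loss: {not_lost:.1f}%")
print(f"Not explained by loss or reduced abundance: {not_lost_or_reduced:.1f}%")
```

      Under this reading, the first value reproduces the quoted 45.4%, and the second shows how the remainder would shrink if the 88 phosphoproteins with reduced parent-protein abundance were also counted as explained.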

      With respect to normalisation strategies, we respectfully note that normalisation of phosphopeptide intensities to total protein abundance is not universally accepted in large-scale phosphoproteomic analyses (PMID: 30190555, 34857927, 38576152), particularly in systems undergoing extensive proteome remodelling such as maturing spermatozoa. In many contexts, including our own previous work, phosphoproteomic analyses are performed on equal protein input and interpreted at the level of phosphopeptide abundance, with functional relevance established through orthogonal biological validation rather than ratio-based correction to total protein levels.

      Lastly, all samples in this study were diluted to equal total protein amounts prior to phosphopeptide enrichment, ensuring consistent input material across all sperm populations (originally noted in the manuscript, please see line 1339). We have now clarified this explicitly in the Results section to ensure this is not missed (please see lines 146 - 147). Moreover, our conclusions are supported by independent in-silico analyses, pharmacological inhibition studies, and in vivo knockout models, collectively providing functional validation that extends beyond phosphosite quantification alone.

      Finally, to address concerns regarding potential conceptual overreach, we have revised the language surrounding the statement that epididymal maturation accounts for ~86% of phosphorylation changes to ensure precise interpretation. Specifically, we have clarified that this value refers to the proportion of statistically significant differences in phosphopeptide abundance detected across maturation stages, to avoid implying direct measurement of net enzymatic dephosphorylation (please see lines 519 - 520).

      Importantly, having addressed the reviewer's concerns detailed above, we believe the data do support the conclusion that the majority of sperm phosphoproteomic remodelling occurs during epididymal maturation rather than during capacitation. While we have tempered our language to improve clarity, the central quantitative observation that epididymal transit represents the dominant phase of phosphoproteomic remodelling remains supported by the revised analyses.

      • Capacitation analysis is underpowered and oversimplified

      The authors state that capacitation leads to "modest" changes. However:

        • The capacitation protocol uses dibutyryl-cAMP + pentoxifylline, which may bypass early physiological signaling. This is an important red flag. __Answer: __We thank the reviewer for this important point and agree that the choice of capacitation conditions influences the nature and magnitude of signalling events detected. The use of dibutyryl-cAMP and pentoxifylline represents a well-established and widely adopted experimental model to induce robust and synchronised capacitation-associated signalling in mouse spermatozoa, acting specifically via the activation of the canonical cAMP/PKA signalling axis (PMID: 36384108, 22458710, 16221991). While we acknowledge that this approach bypasses some upstream physiological signalling events that initiate capacitation during sperm transit of the female reproductive tract, it is intentionally employed to provide a reproducible capacitation stimulus, specifically enabling us to discriminate phosphorylation changes associated with the attainment of sperm fertilization competence. This strategy also directly addresses a limitation of working with mouse spermatozoa, namely that these cells rapidly succumb to cell senescence/death within ~1-3 hours in an in vitro setting. In our previous studies, we have noted that this time period is insufficient to achieve high levels of capacitation among populations of mouse spermatozoa, unless pharmacological agents (i.e. dibutyryl-cAMP + pentoxifylline) are supplemented to accelerate capacitation (PMID: 15252132). This is now a widely accepted paradigm in the field and one that enables us to deliver on our stated objective of assessing the phosphorylation status of fertilization-competent spermatozoa, as opposed to those that are captured during early phases of the capacitation cascade.

      Importantly, our conclusion that capacitation is associated with comparatively fewer phosphoproteomic changes is based on direct quantitative comparison with epididymal maturation under identical analytical conditions, and is not intended to downplay the biological importance of this critical maturation event. Even under the capacitation-inducing conditions employed herein, the scale of phosphoproteomic remodelling observed was substantially smaller than that occurring during epididymal transit, underscoring the influence of epididymal maturation over the status of the sperm phosphoproteome.

      To address this concern, we have revised the manuscript to clarify that the capacitation-associated phosphoproteomic changes reported here are specific to the experimental model used and likely represent a conservative estimate of signalling complexity under physiological conditions. We have also tempered language implying generalisation beyond this context (please see lines 329 - 332, 488 - 491, 673 - 677).

      • Kinase prediction and functional validation require more rigor

      The identification of 343 kinases that may regulate phospho-changes is extremely broad. Issues:

        • The kinase-substrate assignments rely heavily on in silico predictions (IPA, PhosphoSitePlus), which often contain non-sperm data. __Answer: __We thank the reviewer for this important observation and fully agree that kinase-substrate assignments inferred from in-silico resources such as IPA and PhosphoSitePlus are largely derived from non-sperm systems and therefore must be interpreted cautiously.

      Importantly, this limitation reflects a broader and well-recognised gap in the field: regrettably, comprehensive, experimentally validated kinase-substrate networks do not currently exist for mammalian spermatozoa on this scale, particularly in the context of epididymal maturation and capacitation. The primary objective of the present study was therefore not to define definitive kinase-substrate relationships, but to generate a high-depth, sperm-specific phosphoproteomic resource that can serve as a foundation for hypothesis generation and future mechanistic interrogation.

      Accordingly, in-silico kinase prediction tools were employed to contextualise the phosphoproteomic data and to prioritise candidate kinases for functional testing, rather than to assert sperm-specific kinase-substrate specificity. We have revised the manuscript to clarify that these predictions represent informed starting points in a system where such information is currently lacking, and that functional relevance was subsequently assessed using complementary pharmacological and genetic approaches (please see lines 383 - 388, 682 - 686).

      By providing a deep, stage-resolved phosphoproteomic dataset encompassing more than 14,000 phosphosites, this study establishes a much-needed reference framework for the reproductive biology community, enabling future targeted validation of kinase-substrate relationships in sperm. We believe this resource-based contribution represents a major strength of the work and addresses a critical knowledge gap in the field.

        • Please explain the rationale by which, from 343 candidate kinases, 3 (STK33, HIPK4, PAK1) are selected.
        • The pharmacological inhibitors used have off-target effects (ML281 inhibits multiple CMGC kinases; Foretinib inhibits MET/VEGFR; NVS-PAK1-1 inhibits PAK1/2/3).
        • No control experiments are included to confirm kinase inhibition in sperm (e.g., phosphosite-specific Western blots). __Answer: __We split this comment from the above to best address this important critique. We agree that kinase-substrate relationships inferred from phosphoproteomic data must be interpreted with caution. The identification of 343 kinases in this study was not intended to represent a definitive catalogue of all sperm-specific kinase-substrate interactions, but rather to provide insights into kinases that potentially contribute to phosphoproteomic remodelling of mouse spermatozoa during the different phases of their post-testicular maturation. These kinases were identified through integration of multiple complementary approaches, including direct detection within the phosphoproteome, upstream regulator prediction using IPA, curated kinase-substrate databases, and comparison with previously published epididymal sperm proteomes.

      From this broader resource, we deliberately restricted functional interrogation to a small subset of kinases putatively associated with capacitation-induced phosphoproteomic changes. STK33, HIPK4, and PAK1 were selected based on their predicted association with capacitation-specific phosphorylation events, representation across distinct kinase families, lack of prior functional characterisation in terms of either sperm maturation or function, and availability of well-characterised pharmacological inhibitors suitable for functional perturbation. We fully acknowledge that the inhibitors employed are not absolutely kinase-specific and may exhibit off-target effects. Accordingly, we have revised the manuscript to clarify that these experiments are intended to test functional dependence on kinase activity rather than to establish direct kinase-substrate relationships. The observation that combined inhibition of three mechanistically distinct kinases produced a robust and additive suppression of the acrosome reaction supports the conclusion that kinase activity is required for this process, while avoiding overinterpretation of individual kinase specificity.

      We have revised the language throughout the manuscript to more clearly reflect these limitations and to frame the kinase inhibition experiments as functional validation of phosphoproteomic predictions rather than definitive mechanistic proof (please see lines 383 - 388, 405 - 406, 682 - 686, 705 - 710).

      • The knockout-mouse validation section is underdeveloped

      The linkage of KO phenotypes to phosphorylation changes is potentially powerful but currently weak.

      Issues:

        • Most KOs are systemic deletions, not sperm-specific; phenotypes could stem from developmental defects.
        • Some proteins validated (e.g., ACO2, CMPK1) regulate core metabolism; their phenotypes may not reflect phosphoregulation but loss of essential protein function.
        • No evidence is provided that the KO affects the specific phosphosites detected in the MS dataset.

      __Answer: __We thank the reviewer for this important clarification. We agree that since the knockout models employed represent systemic deletions, they cannot directly resolve sperm-specific or phosphosite-specific mechanisms, and it was not our intention to suggest otherwise. We have revised the manuscript to explicitly frame the knockout phenotypes as evidence of physiological relevance of the identified phosphoproteins, rather than as direct validation of individual phosphorylation events (please see lines 723 - 734).

      We further clarify that for proteins with central metabolic roles, the observed phenotypes likely reflect loss of essential protein function rather than isolated disruption of phosphoregulation. Accordingly, we have tempered our language and emphasise that these data support functional importance while highlighting the need for future studies employing germ cell-specific or phosphosite-targeted models (please see discussion).

      • Immunofluorescence and Western blots need improved quantification

      Figures showing PKA substrate, pY, pT, pS changes are visually compelling but lack:

        • quantification across biological replicates
        • explanation of antibody specificity (e.g., pan-PKA sites include RRXS/T motifs; cross-reactivity possible).

      __Answer: __We thank the reviewer for this comment and appreciate the emphasis on rigor in antibody-based analyses. We would like to clarify that the immunofluorescence and immunoblot data presented in this study do include densitometry-based quantification of data generated from three independent biological replicates. This is indicated by the inclusion of error bars and is stated in the relevant figure captions (please see lines 1164 - 1168: "All immunoblotting experiments were repeated with at least three biological replicates. Densitometric data normalization was performed against the loading control protein GAPDH, and each value subsequently expressed as a fold change relative to the caput sperm. Data were analyzed by one way ANOVA with GraphPad Prism.").
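      The normalisation described in the quoted caption (band intensity divided by the GAPDH loading control, then expressed as a fold change relative to caput sperm) can be illustrated with a minimal sketch. This is not the authors' analysis code; the function name and the intensity values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def fold_change_vs_caput(band, gapdh, reference="caput"):
    """Normalise band intensities to GAPDH, then express each sample
    as a fold change relative to the reference (caput) sample."""
    # Step 1: correct each band for loading using the GAPDH signal.
    normalised = {sample: band[sample] / gapdh[sample] for sample in band}
    # Step 2: divide by the normalised reference sample.
    ref_value = normalised[reference]
    return {sample: value / ref_value for sample, value in normalised.items()}


# Hypothetical densitometry readings (arbitrary units), one replicate.
band = {"caput": 100.0, "cauda": 150.0}
gapdh = {"caput": 50.0, "cauda": 50.0}

fold = fold_change_vs_caput(band, gapdh)
print(fold)
```

      In this toy case the caput sample is 1.0 by construction and the cauda sample is 1.5. Repeating the calculation per biological replicate gives the values that would enter a test such as the one-way ANOVA named in the caption.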

      With respect to antibody specificity, we fully agree that phospho-specific and motif-based antibodies have inherent limitations, including epitope ambiguity and the inability to resolve site-specific phosphorylation with amino acid precision (please see our answer to Comment #2 from Reviewer #1 above). For this reason, the antibodies employed here represent well-established, widely used markers in sperm biology and were included to illustrate global phosphorylation trends rather than to validate individual phosphosites. Importantly, quantitative and site-resolved interpretation of phosphorylation dynamics throughout the manuscript is derived from mass spectrometry based phosphoproteomics, which provides substantially greater specificity and resolution than antibody-based approaches.

      • Many interpretations about metabolism, storage, oxidative stress, and quiescence are speculative

      The discussion provides attractive models linking phosphorylation to:

        • suppression of glycolysis
        • quiescent metabolic state in cauda epididymis
        • activation of antioxidant pathways
        • UPR and proteostasis modifications

      However, no direct functional evidence is provided for any of these pathways.

      __Answer: __We thank the reviewer for this thoughtful observation and agree that several interpretations linking phosphorylation changes to metabolic regulation, oxidative stress, proteostasis, and cellular quiescence are necessarily inferential in the absence of direct functional assays. With this in mind, we have revised the Discussion to more clearly distinguish data-driven observations from hypothesis-generating interpretations and have tempered language accordingly. These models are now explicitly framed as conceptual frameworks arising from large-scale phosphoproteomic analysis, intended to guide future targeted investigation rather than to assert definitive mechanistic conclusions (please see lines 588, 590, 596 - 599, 609, 613, 621 - 625).

      Given the breadth and depth of the dataset, only a limited number of functional pathways could be explored experimentally within the scope of the current study. We anticipate that the phosphoproteomic resource generated here, supported by the accompanying ShinySpermPhospho application, will enable the wider community to interrogate additional pathways and to design focused mechanistic studies building on these findings.

      • Acrosome reaction evaluation

      I have encountered significant deficiencies in this approach. On one side, testing the effect of a single dose of inhibitors on a specific readout is too preliminary, as stated above. In addition, and due to the presence of possible off target effects, more than one inhibitor is expected to be tested, or a direct biochemical assay to confirm at least targeted action. Even KO models, as proposed for other proteins in Figure 7.

      Acrosome reaction values are expected to be presented, as regularly done, by indicating acrosome reacted percentages, without normalizations that complicate understanding. In addition, consider Pg as a more physiological stimulus instead of A23187 for triggering AR.

      __Answer: __We thank the reviewer for these constructive comments and agree that careful assessment of inhibitor effects on sperm viability and motility is essential. We would like to clarify, in case this was overlooked, that these controls were performed and are presented in Supplementary Figure S4. Specifically, sperm were exposed to four concentrations of each inhibitor and assessed over time (0 and 60 minutes) for viability, total motility, and progressive motility. Across all concentrations, time points, and treatment conditions, including combined inhibitor treatments, no significant reductions in sperm viability or motility parameters were observed. These data support the conclusion that the effects on acrosome reaction are not secondary to general sperm toxicity.

      With respect to data presentation, acrosome reaction values were expressed relative to matched capacitated vehicle-treated controls to account for biological variability in absolute acrosome reaction rates observed between independent sperm preparations and experimental days. This normalisation strategy was used to facilitate direct comparison between treatment groups, and respectfully, we have elected to retain this presentation format in Figure 6. Nonetheless, in the interest of transparency, we have now included the raw acrosome reaction values/ranges in the supplementary material (Table S6) and have provided these as a figure to the reviewer for reference.
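      To make the normalisation concrete, the following minimal sketch expresses each treatment relative to the vehicle control from the same sperm preparation. It is an illustration under stated assumptions, not the authors' analysis code: the function name, the dictionary keys, and the percentages are all hypothetical.

```python
def normalise_to_vehicle(preparations, vehicle_key="vehicle"):
    """Express each treatment's acrosome-reacted percentage relative to the
    vehicle control measured in the same sperm preparation."""
    normalised = []
    for prep in preparations:
        baseline = prep[vehicle_key]
        if baseline <= 0:
            raise ValueError("vehicle control must be greater than zero")
        normalised.append({name: value / baseline for name, value in prep.items()})
    return normalised


# Hypothetical raw percentages of acrosome-reacted sperm from two
# independent preparations with different absolute response rates.
raw = [
    {"vehicle": 40.0, "inhibitor": 20.0},
    {"vehicle": 60.0, "inhibitor": 30.0},
]

relative = normalise_to_vehicle(raw)
print(relative)
```

      In this toy example the two preparations differ in absolute acrosome reaction rate (40% versus 60% in the vehicle control), yet both show the same relative value of 0.5 under inhibitor treatment. Reporting the raw percentages alongside, as in Table S6, lets readers see both views.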

      We agree that the use of progesterone represents a more physiological stimulus for inducing the acrosome reaction. However, there is no single universally accepted approach for acrosome reaction induction, and calcium ionophore-based assays remain widely used to assess the capacity for acrosomal exocytosis under defined experimental conditions. In the present study, this approach was selected to provide a robust and reproducible functional readout suitable for comparative analysis.

      We have revised the manuscript to more clearly describe the dose-response and viability control experiments, to acknowledge potential off-target effects of kinase inhibitors, and to frame the acrosome reaction assays as functional screening experiments rather than definitive mechanistic dissection (please see lines 415, 705 - 710, Table S7).

      • Figure legend to Figure 7. How many oocytes, how many replicates were performed. How many transfers. Please add important data to the legend.

      __Answer: __We thank the reviewer for highlighting this omission. In line with comment 4 from Reviewer #1 (please see above), we have revised the Figure 7 legend and Methods section to provide detailed information regarding sample size and experimental design, including the number of oocytes used per IVF experiment performed and the number of biological replicates.

      Specifically, IVF and oocyte isolation procedures were conducted according to standardised INFRAFRONTIER protocols (PMID: 25414328, 27262858, 38839949). Across knockout lines, IVF experiments were performed over 1 - 6 independent cycles per line, with an average of 24.2 oocytes used per cycle. To provide transparency regarding this variability, we have included a new supplementary figure (Figure S6) summarising the average number of oocytes used per IVF cycle alongside the corresponding cleavage rates (%CR).

      Sperm samples used in these assays were derived from cryopreserved cauda epididymal spermatozoa pooled from 10 heterozygous males per knockout line, as per EMMA guidelines (PMID: 17709347). Additionally, where available, we have incorporated reproductive outcome measures at the level of individual litters (e.g. average pup number and sex distribution) to provide further biological context. These additions improve transparency and ensure that the experimental design and data interpretation are clearly defined (please see lines 453 - 457, 1228 - 1234, 1431 - 1444, 1533 - 1536, 1572 - 1575, Figure 7 & S6).


      This is a technically sophisticated phosphoproteomic study of mouse sperm maturation across the epididymis and during capacitation. The dataset is deep (>14,000 phosphosites) and the analyses integrate high-resolution MS, immunofluorescence, IPA, kinase mapping, pharmacological inhibition, and knockout mouse models.

      __Answer: __We thank the reviewer for this positive assessment and for recognising the technical sophistication and integrative nature of the study. We are also grateful for the constructive feedback provided, which has helped us to substantially strengthen the clarity, rigor, and presentation of the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This is a technically sophisticated phosphoproteomic study of mouse sperm maturation across the epididymis and during capacitation. The dataset is deep (>14,000 phosphosites) and the analyses integrate high-resolution MS, immunofluorescence, IPA, kinase mapping, pharmacological inhibition, and knockout mouse models. The manuscript represents a nice resource for the field. However, several issues limit clarity, mechanistic interpretation, and robustness of the conclusions. In particular, the manuscript's scope is extremely large, making some conclusions insufficiently supported, and some analyses require better control, methodological transparency, or deeper mechanistic connection. It gives the impression that some mechanistic data were added to descriptive data in order to increase the manuscript's impact, although the current mechanistic data are not convincing.

      Major concerns

      1. Conceptual Overreach - "Epididymal maturation accounts for 86% of phosphorylation changes" The manuscript repeatedly emphasizes that epididymal maturation causes the majority of phosphoproteomic remodeling. While the data indeed show large quantitative differences, several conceptual issues remain:
        • The caput vs. cauda comparison includes differences in protein abundance, not only phosphorylation.
        • Many phosphosites lost in the cauda may reflect protein loss, not dephosphorylation (the authors acknowledge this, but quantitative controls are insufficient).
        • A description of the normalization method for phosphopeptide abundance vs total protein abundance is needed.
        • It is unclear whether the same amount of starting material and equal protein loading were used across stages.

      I would suggest performing (or explicitly describing) normalization using matched proteome intensities. Provide supplementary plots showing phosphosite/parent-protein normalization to avoid overinterpreting phosphosite loss as dephosphorylation.

      2. Capacitation analysis is underpowered and oversimplified. The authors state that capacitation leads to "modest" changes. However:
        • The capacitation protocol uses dibutyryl-cAMP + pentoxifylline, which may bypass early physiological signaling. This is an important red flag.

      3. Kinase prediction and functional validation require more rigor. The identification of 343 kinases that may regulate phospho-changes is extremely broad. Issues:
        • The kinase-substrate assignments rely heavily on in silico predictions (IPA, PhosphoSitePlus), which often contain non-sperm data.
        • Please explain the rationale by which, from 343 candidate kinases, 3 (STK33, HIPK4, PAK1) are selected.
        • The pharmacological inhibitors used have off-target effects (ML281 inhibits multiple CMGC kinases; Foretinib inhibits MET/VEGFR; NVS-PAK1-1 inhibits PAK1/2/3).
        • No control experiments are included to confirm kinase inhibition in sperm (e.g., phosphosite-specific Western blots).

      4. The knockout-mouse validation section is underdeveloped. The linkage of KO phenotypes to phosphorylation changes is potentially powerful but currently weak. Issues:
        • Most KOs are systemic deletions, not sperm-specific; phenotypes could stem from developmental defects.
        • Some proteins validated (e.g., ACO2, CMPK1) regulate core metabolism; their phenotypes may not reflect phosphoregulation but loss of essential protein function.
        • No evidence is provided that the KO affects the specific phosphosites detected in the MS dataset.

      5. Immunofluorescence and Western blots need improved quantification. Figures showing PKA substrate, pY, pT, pS changes are visually compelling but lack:
        • quantification across biological replicates
        • explanation of antibody specificity (e.g., pan-PKA sites include RRXS/T motifs; cross-reactivity possible).

      6. Many interpretations about metabolism, storage, oxidative stress, and quiescence are speculative. The discussion provides attractive models linking phosphorylation to:
        • suppression of glycolysis
        • quiescent metabolic state in cauda epididymis
        • activation of antioxidant pathways
        • UPR and proteostasis modifications
      However, no direct functional evidence is provided for any of these pathways.

      7. Acrosome reaction evaluation. I have encountered significant deficiencies in this approach. On one side, testing the effect of a single dose of inhibitors on a specific readout is too preliminary, as stated above. In addition, and due to the presence of possible off target effects, more than one inhibitor is expected to be tested, or a direct biochemical assay to confirm at least targeted action. Even KO models, as proposed for other proteins in Figure 7. Acrosome reaction values are expected to be presented, as regularly done, by indicating acrosome reacted percentages, without normalizations that complicate understanding. In addition, consider Pg as a more physiological stimulus instead of A23187 for triggering AR.

      8. Figure legend to Figure 7. How many oocytes, how many replicates were performed. How many transfers. Please add important data to the legend.
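      The parent-protein normalization suggested under concern 1 can be sketched as follows: the log2 fold change of the phosphosite is corrected by subtracting the log2 fold change of its parent protein, so that a phosphosite "loss" that merely tracks protein loss comes out near zero. This is an illustrative sketch only, one common way to do such a correction rather than the procedure used in the manuscript; the function name and intensity values are hypothetical.

```python
import math


def protein_corrected_log2fc(phospho_a, phospho_b, protein_a, protein_b):
    """Return the phosphosite log2 fold change (b vs a) after subtracting
    the log2 fold change of the parent protein over the same comparison."""
    phospho_log2fc = math.log2(phospho_b / phospho_a)
    protein_log2fc = math.log2(protein_b / protein_a)
    return phospho_log2fc - protein_log2fc


# Hypothetical intensities: a = caput, b = cauda.
# The phosphosite drops 4-fold, but so does the parent protein.
corrected = protein_corrected_log2fc(
    phospho_a=8.0, phospho_b=2.0, protein_a=4.0, protein_b=1.0
)
print(corrected)
```

      Here the uncorrected phosphosite change is -2 (a 4-fold decrease), while the corrected value is 0, indicating that the apparent dephosphorylation is fully explained by protein loss in this toy case.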

      Significance

      This is a technically sophisticated phosphoproteomic study of mouse sperm maturation across the epididymis and during capacitation. The dataset is deep (>14,000 phosphosites) and the analyses integrate high-resolution MS, immunofluorescence, IPA, kinase mapping, pharmacological inhibition, and knockout mouse models.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, the authors examine dynamic modifications of the sperm phosphoproteome during epididymal transit and capacitation. They compare three distinct populations differing in anatomical localization and activation status: caput sperm, non‑capacitated cauda sperm, and capacitated cauda sperm. Using high‑resolution tandem mass spectrometry, they reveal that phosphorylation changes during epididymal passage are far more extensive than previously appreciated. These findings are further validated in genetically modified animal models, where disruption of selected genes encoding for phosphoproteins results in marked defects in sperm motility and fertilization capacity.

      Major points:

      Throughout the text, and particularly in the paragraph entitled 'Epididymal maturation accounts for the majority of maturation‑associated sperm cell signaling,' it seems that phosphorylation is interpreted as inherently activatory and dephosphorylation as inhibitory (lines 248-252). Since this relationship is not universally applicable, it would be valuable to address this issue at the outset of the paragraph and to discuss how phosphorylation events are context‑dependent in their effects on protein function.

      Lines 392-393: the claim that "the introduction of each inhibitor to populations of capacitating spermatozoa led to a significant reduction..." is not fully supported by the data and should be toned down. In fact, two out of three inhibitors do not significantly affect the acrosome reaction.

      The discussion section, spanning 11 pages, is overly long and contains considerable repetition. I recommend transferring the detailed description of experiments to the 'Results' section and using the discussion primarily to synthesize and highlight the novel findings while limiting speculative content. For example, the content in lines 509-530 could be condensed and relocated to the Results. Likewise, other detailed examples would be more appropriately presented within their respective result paragraphs.

      Minor points:

      • To improve reproducibility, the suppliers of all reagents should be specified together with their catalogue numbers.
      • Figure 7: it is unclear which data are statistically significant
      • Figure 7B: fertilization capacity should be assessed at an earlier stage, as the cleavage rate to 2-cell embryo may be affected by factors unrelated to the sperm ability to fertilize

      Significance

      A key novel finding of this work is that extensive changes in the sperm phosphoproteome occur during epididymal maturation, whereas capacitation is associated with comparatively modest modifications. This research provides a finely resolved description of phosphorylation events associated with the signaling pathways underlying functional sperm maturation. The methodological innovation (high‑resolution MS‑based phosphoproteomics) unlocks a level of detail and comprehensiveness in phosphorylation analysis that was previously unattainable. Moreover, the identification of previously unrecognized phosphoproteins in sperm cells, together with the development of a dedicated application hosting the complete dataset, represents a valuable resource for researchers in reproductive biology and particularly for experts in sperm development and maturation.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Skerrett-Byrne and collaborators represents a comprehensive and technically sophisticated phosphoproteomics study. Using high-resolution mass spectrometry on mouse sperm obtained from the caput and cauda regions of the epididymis, both before and after capacitation, the authors generated a more complete database of phosphorylation changes in these cells. One of the most interesting outcomes is that most of these changes occur during sperm maturation, rather than sperm capacitation. The work is important and relevant, and the information obtained could be valuable for reproductive biologists working in basic science, as well as for the identification of novel contraceptive targets.

      Minor comments:

      1) The title should include a reference to sperm capacitation, as most of the study focuses on comparisons between epididymal maturation and capacitation, and the functional experiments are based on the latter.

      2) Considering the newly reported changes in phosphosites, it would be desirable to include validation at the individual protein level for at least a few examples, using an independent technique such as western blotting.

      3) In the knockout models, it is not possible to distinguish between defects in spermatogenesis and those arising during maturation or capacitation. A parameter directly related to spermatogenesis should therefore be included, for example, testicular weight or histology, sperm number, and sperm morphology.

      4) Error values, sample size, and statistical analyses are missing from Figure 7 and should be provided for clarity.

      Significance

      This study shows by high-resolution phosphoproteomics that most phosphorylation changes occur during epididymal transit rather than capacitation, challenging long-standing assumptions. The integration of the new datasets with functional validation of key kinases and knockout models strengthens the study; however, the work lacks single-protein validation of phosphorylation events, and the use of systemic knockouts does not allow confirmation of sperm-specific effects. The open ShinySpermPhospho dataset will be worthwhile to a broad audience of reproductive biologists and cell signaling specialists, and may be of value for future studies on male fertility and the development of novel male contraceptives.

      Field of expertise: reproductive physiology, sperm biology and capacitation, gamete interaction.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers and are glad that they acknowledge this work to be a timely contribution to a quickly moving field and a valuable tool to generate testable hypotheses. We are pleased that reviewer #2 highlights that “a major strength is the combination of orthogonal evidence types” and that the tool serves to generate novel hypotheses. The revised manuscript will sharpen the positioning of the study within this context. Additional experimental evidence will be provided to address the points raised by reviewers #1 and #3.

      Reviewer #1

      1. The authors do not co-IP ARF1. This does not surprise me as small GTPases often hydrolyse their GTP during lysis.

      We agree that this is likely due to transient association and GTP hydrolysis during lysis and will add a section to the manuscript.

      2. There have been a number of ARF1 BioID screens done; have the authors checked whether their complex has turned up there?

      We will include this in the revised manuscript.

      3. I am a bit confused by some of the interpretation about KO and loss of JTB staining. They interpret: "The SYS1 acts as a Golgi recruitment factor for both ARFRP1 and JTB". The ARFRP1 has been published and is a cytosolic protein, so that makes sense. However, the JTB is not cytosolic but a membrane protein, so cannot be "recruited". Now maybe it is retained in the Golgi by this interaction, but if that is the case you would still expect signal on another organelle or the plasma membrane (and we see it isn't degraded in the lysosome due to the western blot). I am confused by the authors' model here.

      We will clarify the phrasing and will provide a clearer interpretation, also considering the other improved imaging experiments that will be included in the revised manuscript.

      4. The authors validate their JTB antibody and confirm the fact that there are not reduced SYS1 levels in the JTB KO; this is very clear (albeit unquantified). What I do not see validated is the SYS1 KO. I think this is quite important.

      We will validate SYS1 KO using TIDE and/or western blotting.

      5. The colocalisation in panel 3D is weak and unclear to me. It is not quantified. It is not clear if there have been 3 repeats.

      The revised manuscript will include improved imaging data. We will repeat relevant experiments, include appropriate controls and quantify where necessary.

      6. The imaging in figure 3 is not clear in places, and it stands out in a very clear manuscript. I cannot see the JTB in panel F. There are no scale bars. The dynamic range of the image is not utilised. I do not see the JTB stain in either of the SYS1 KOs, I do not see the SYS1-FLAG staining in the complement, and it is not quantified at all. It may all seem trivial, but (to me) this is an absolutely critical bit of biology data to support the informatics.

      The revised manuscript will include improved imaging data. We will repeat relevant experiments, include appropriate controls and quantify where necessary.

      7. I am a bit unconvinced by the interpretation of it being a retrograde trafficking complex. This is for 2 key reasons: 1) the VSV-G is anterograde (although, unusually, they interpret a "severe defect in retrograde transport"). 2) Even if it was only having an effect in the retrograde direction I would still remain a little open minded about it, as you can easily mistake trafficking of a protein in one direction for another if an unknown protein (a SNARE, for example) has defective trafficking.

      We used VSVG-KDEL in this assay. This setup specifically measures retrograde trafficking, and we will clarify this in the revised manuscript. We will also clarify in the Discussion that we confirmed a role in retrograde trafficking but cannot exclude a role in anterograde trafficking.

      Reviewer #2

      Major comment: scope and interpretation of DepMap-derived functional evidence

      The manuscript could benefit from more clearly defining the scope of the functional evidence used to nominate complexes. The central co-dependency signal is derived from DepMap 24Q2 CRISPR gene-effect profiles, which are primarily cancer cell-line fitness/proliferation data. This is an important limitation because the resulting correlations may preferentially capture complexes or pathways that influence viability in proliferating cancer cells, while missing complexes active in differentiated, tissue-specific, stimulus-dependent, or non-proliferative contexts. Conversely, some correlations may reflect shared cancer-lineage or fitness dependencies rather than direct participation in a stable complex. The authors are appropriately cautious in stating that DepCom is not a complete inventory of human protein complexes, but the title, framing, and resource description could still be read as implying a more general catalogue of functional protein complexes. The authors might consider adding a clearer introduction to DepMap and explicitly discuss how the cancer-cell-line origin of the data affects interpretation of the 518 predicted complexes. This could be addressed without new experiments, for example by adding text early in the Results section explaining what the CRISPR gene-effect scores measure, and by expanding the Discussion to clarify that DepCom represents structurally plausible complexes prioritized by co-dependency across cancer cell lines, rather than an unbiased or context-independent map of human protein complexes.

      The selection of highlighted examples would also benefit from clearer justification. The peroxisome, actin, WNK/TSC22D2, and Golgi/JASS examples are biologically interesting, but the rationale for choosing them is not always explicit. Were they selected because they were novel, high-confidence, disease-associated, experimentally tractable, or representative of different resource categories? Briefly stating the selection criteria would help readers understand whether these examples are illustrative case studies or representative outcomes of the pipeline.

      We agree with the reviewers' assessment that this resource should be viewed as hypothesis-generating and that the overall framing should be improved. We will revise the manuscript at the appropriate sections, according to the more detailed comments of all reviewers.

      Minor comments

      1. Clarify post-clustering removal of large/problematic protein families and complexes. In the Methods, the authors state that "clusters of histones and keratin clusters, as well as the mito-ribosome, complexes of the electron transport chain and the mediator complex" were removed because of their large sizes. This filtering step would benefit from additional detail. Please specify the criteria used to define these removed clusters, how many clusters/proteins were removed at this stage, and whether removal was based only on size or also on biological/manual curation. It would also be helpful to explain why these proteins or clusters were removed after clustering rather than excluded before graph construction and clustering, since highly connected or compositionally biased protein families could potentially influence neighboring cluster assignments. If available, a brief robustness check showing that pre-removal of these proteins gives similar candidate complexes would strengthen confidence in the clustering procedure.

      We will add the requested information to the relevant section. Alongside the manuscript, we will also provide lists of the complexes before and after every filtering step.

      1. Clarify the rationale for excluding complexes larger than 5000 residues. The 5000-residue cutoff is understandable for AF3 computational cost, but the manuscript should briefly state how many candidate complexes were excluded by this cutoff and whether this preferentially removes known large assemblies. This would help readers understand the scope of complexes that DepCom is expected to miss.

      Alongside the manuscript we will now also provide lists of the complexes before and after every filtering step.

      1. Improve wording in the CAP1/CFL1/WDR1/ACTB example. The sentence "Additionally, CAP1 works in concert with CFL1 to accelerate depolymerisation, though if a four-protein complex consisting of actin, WDR1, CAP1 and CFL1 is relevant is not clear" is difficult to parse. Possible revision might be something like: "Additionally, CAP1 works in concert with CFL1 to accelerate depolymerisation, although it remains unclear whether actin, WDR1, CAP1 and CFL1 form a stable four-protein complex in cells." This more clearly separates known biology from the speculative interpretation of the DepCom prediction.

      Wording will be improved.

      1. Improve reproducibility details for AF3 predictions. The Methods state that predictions were run using a local AF3 installation, but reproducibility would be improved by reporting relevant AF3 settings, number of seeds/models per complex, whether templates were used, how disordered regions were handled, and whether predictions were repeated for all complexes or only selected examples. This is especially important because the manuscript notes that multiple predictions can yield different subunit arrangements.

      We will provide detailed settings in the Methods section. Regarding disordered regions: all predictions used full-length sequences (canonical UniProt ID) for each protein, so disordered residues are included. If disordered regions have low pLDDT and poor PAE, they will simply not score as interfaces in AlphaBridge. The one exception where we did crop structures is Figure 2D, but purely for visualization purposes; the full-length (uncropped) complex did score in the pipeline.

      Reviewer #3

      Co-essentiality is not the same as physical complex membership. This is the biggest conceptual concern. Genes in the same pathway are co-essential whether or not their products bind. The authors lean on the structural prediction step to filter this out, but that means the entire pipeline rests on AF3+AlphaBridge being correct about who interacts with whom. There is no independent benchmarking shown of how often AlphaBridge calls a true positive vs a false positive at the chosen 0.5 cutoff. Why 0.5? Where does that number come from? A short benchmarking section using known complexes (CORUM 5.0, hu.MAP 2.0, the PDB) would make the choice defensible. Right now it reads as arbitrary.

      We thank the reviewer for bringing up the need for such an important clarification. We fully agree that co-essentiality does not equal physical interaction and structure predictions are imperfect. This is precisely the logic underlying our pipeline design, not a limitation we overlooked. The two data sources are used sequentially and serve distinct roles: first, we construct protein sets that are connected through networks of predicted binary physical interactions; then we cluster these based on DepMap correlations, selecting likely physical complexes that display co-essentiality between their components.

      In other words, clustering on DepMap data alone would certainly return many spurious correlations: as the referee points out “Co-essentiality is not the same as physical complex membership”. Anchoring the search space with structural predictions substantially reduces this noise. Neither data source alone is sufficient, nor do we claim either is definitively "correct": the value lies in their combination. We hope improved phrasing in the revised manuscript will highlight this better.

      Regarding benchmarking AlphaBridge score: we have benchmarked AlphaBridge, in response to reviewer feedback on the original AlphaBridge paper (Structure, Cell Press). In the figure here it is clear that in our benchmark of PDB structures (with

      Comparison to existing resources is incomplete. I can't help but wonder what was found here that would not have been possible by analysing existing resources. CORUM 5.0 (7,193 mammalian complexes, ~71% human-derived; Tsitsiridis et al. 2024 NAR), hu.MAP 2.0 (Drew et al. 2021, ~6,965 complexes from >15,000 MS experiments), BioPlex 3.0 (Huttlin et al. 2021, 118,162 interactions in HEK293T), and the Complex Portal already cover a large fraction of the human complexome. The authors compare to PDB, the original interactome paper, and Complex Portal, but they explicitly skip CORUM and hu.MAP, both of which are central reference resources in this space. Without including these, the "60 complexes unique to DepCom" number is not really meaningful. This needs to be redone properly.

      We will add the comparison with CORUM and hu.MAP in the revision.

      Validation rate is one out of 518. The JASS work is solid, but a single experimentally validated complex out of 518 gives the reader essentially no estimate of how often the rest of the predictions are correct. Even a smaller systematic effort, say IP-MS on five to ten predicted novel complexes in the same cell line, would do an enormous amount to establish how trustworthy the resource is. The authors already have the V5/IP-MS pipeline running. Right now the manuscript implicitly asks the reader to trust 517 predictions on the strength of one validation.

      In this paper we validated one out of the 60 complexes we claim are new. Notably, we provide new biological data and demonstrate how consulting our resource, or following the same logic of combining functional and structural information, can lead to exciting new discoveries. We note that out of the 518 complexes we list, 69 complexes are exactly mirrored in the PDB and/or Complex Portal, while for another 389 there is partial evidence. Thus, our dataset is amply validated, and at the same time contains data to enable new discoveries. We also note that, following the release of our resource eight months ago, a new high-impact publication “validated” a complex we have independently picked in DepMap (Oosterheert et al, Choreography of rapid actin filament by coronin, cofilin and AIP1, Cell, 2025). We will rephrase relevant sections (also in response to reviewer 2) to increase clarity about validation.

      The functional and disease clustering is potentially circular. GO terms and STRING associations are themselves derived in large part from the published literature on protein function, including text mining channels in STRING, much of which is downstream of complex membership. Of course complexes cluster into "DNA repair" and "vesicle trafficking" if you cluster on GO and STRING. The same applies to Open Targets, which integrates GWAS Catalog, ClinVar, literature mining, and other sources. The clustering is fine as a navigation aid for the website, but it is not, as currently presented, an independent validation of anything. I would tone the discussion down accordingly.

      We did not mean to present the clustering as an independent validation. We will tone down the discussion accordingly.

      AF3 limitations on this class of problem. AF3 itself acknowledges limitations (Abramson et al. 2024, including the December 2024 addendum), and subsequent benchmarking has flagged disordered regions, dynamic/large assemblies, and certain transmembrane systems as known weak points. The JASS complex is largely transmembrane, the WNK1-TSC22D2 example involves disorder-to-order transitions, and several flagship examples involve large multi-domain proteins. The authors acknowledge some of this in passing but should state explicitly which complexes were trimmed, how the trimming choices were made, and whether predictions were repeated with different seeds to check stability. Figure S4 is a good start, but for a resource paper a more systematic seed-stability analysis is warranted.

      No complexes were trimmed for the initial AF3 predictions. The WNK1-TSC22D2 example was trimmed and re-predicted only for visualization purposes. We apologize for the misunderstanding and will state this more clearly.

      AF3 certainly has limitations. Regarding disordered regions, these will almost always be assigned a poor pLDDT (also if AF3 wrongly folds them into helices). AlphaBridge will not pick up these low-pLDDT regions as interfaces. Dynamic assemblies might likewise lead to poor confidence scores, and consequently these will not be picked up as interfaces by AlphaBridge. If AF3 confidence metrics are analyzed properly, the main concern for both disordered regions and dynamic assemblies is missing true positive interactions, rather than finding false positives. As we did not aim to identify all possible human complexes, we consider focusing on the most confidently predicted interactions to be a fair trade-off.

      While the JASS complex is indeed a membrane protein complex, the predictions are exceptionally confident across multiple seeds (we can provide predictions from multiple seeds for revision), and the complex validates experimentally. Of course, structure predictions are no substitute for experimental structures, as cautioned multiple times throughout the manuscript.

      Figure S4 shows that, despite the overall geometry of the complex being flexible, the interaction sites are predicted with high confidence across different poses. Since the aim of this study was to identify proteins interacting with each other, not accurate structures (which need to be solved experimentally), we argue that recomputing all structures with multiple seeds is disproportionately expensive computationally and would delay publication of a timely study while adding little.

      Statistics are thin in several places. On the Fisher exact test for Golgi/ER enrichment in V5-JTB IP-MS (Supplemental Table 1), an odds ratio of 2.77 is modest, and there is no comparison to a matched control IP. Is this more than expected by chance against an appropriate background? The IP-MS volcano plots show many significant proteins, but how was the background controlled? On the LLM section, no quantitative evaluation is presented at all and the assessment is admitted to be subjective.

      We will qualify the conclusions drawn from the IP-MS experiments. We maintain that together with the additional cell biology data, we build a compelling and convincing picture for this JASS complex.

      Experimentally, the background is controlled by measuring enrichment over WT cell lines that have undergone the same IP procedure as the V5-SYS1/JTB expressing cells (lysis, incubation with the anti-V5 conjugated beads, same wash procedure and sample processing), as is the standard in the field. We will clarify this in the Methods section. Regarding identification, the FDR was set to 1% at the protein and peptide level, and peptide spectrum matches (PSMs) were additionally filtered for SequestHT Xcorr score >1.
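      For readers less familiar with the test under discussion, the sketch below shows how an odds ratio and a one-sided Fisher exact p-value follow from a 2×2 contingency table. The counts, the table layout and the function names are hypothetical illustrations and are not the values behind Supplemental Table 1; the one-sided alternative is likewise an assumption.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null with all margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, k_max + 1)
    )
    return favourable / comb(n, col1)


# Hypothetical counts (NOT from the manuscript).
# Rows: enriched in the IP / not enriched.
# Columns: Golgi/ER annotated / other annotation.
table = (12, 8, 30, 50)
print(odds_ratio(*table))        # 2.5
print(fisher_one_sided(*table))  # one-sided p-value for enrichment
```

      The choice of the "not enriched" row is what defines the background: the same odds ratio can look more or less convincing depending on whether that row is the whole proteome or only the proteins detected in the control IP.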

      We agree with the referee that the LLM interpretation is subjective and cannot be benchmarked. We suggest revising the resource and the paper, only providing structured LLM prompts to facilitate users asking the right questions, but we will not provide the LLM answers as part of the resource.

      The 4×ACTB speculation. The authors themselves note the AlphaBridge score declines from 0.9 (1×ACTB) to 0.78 (4×ACTB), yet they speculate about functional implications. This is exactly the kind of post-hoc rationalisation around weak evidence that should either be supported with experiment or removed. Either remove or qualify as speculative.

      We will qualify this as speculative.

      The LLM-assisted analysis. I am genuinely uncomfortable with releasing 76 LLM-generated complex annotations as part of a published resource when the authors openly state these have "not been systematically validated". Putting these summaries on a website with the imprimatur of a peer-reviewed paper will lead to them being cited and reused. At minimum, the website needs prominent warnings on every page where an LLM summary appears, the prompts must be fully reproducible (not just downloadable as JSON), and a small validation table, say 10 complexes scored by a domain expert for accuracy of each claim, should be included as a supplemental figure. As it stands this section reads like an enthusiastic add-on that has not been thought through with the same care as the rest of the work.

      We thank the referee for bringing forward this consideration. We agree to remove the LLM answers for the 78 complexes from the manuscript and from the website, to ensure that the outputs cannot be cited. We will provide two different objective structure prompts for download to encourage variety in responses for curious users who want to explore. We will add a prominent disclaimer noting that responses resulting from these prompts cannot be interpreted as facts without validation.

      We cannot guarantee reproducibility with modern LLM inference architectures. Even if seeds are kept the same and temperature=0, floating-point non-determinism in GPU operations, distributed inference, and batch effects may lead to different results. Furthermore, models go through many different iterations rapidly. As a consequence, it is impossible for us to guarantee reproducibility.
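      The floating-point point can be illustrated with a minimal sketch. It only demonstrates that floating-point addition is not associative, so a change in summation order (as can happen in parallel reductions) may change a result in the last bits; the "token score" framing and the variable names are hypothetical and do not model any particular LLM or inference stack.

```python
# The same three numbers summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)   # 0.6000000000000001
print(right_to_left)   # 0.6

# Hypothetical greedy decision: pick token A only if its score beats token B.
score_b = 0.6
picks_a_first_order = left_to_right > score_b
picks_a_second_order = right_to_left > score_b
print(picks_a_first_order, picks_a_second_order)  # True False
```

      A difference of one unit in the last place is enough to flip a comparison between two near-tied candidates, after which the generated text can diverge.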

      Cutoffs and cluster numbers need stability analysis. The cutoff for the 75th-percentile DepMap correlation (mean of random + 3 SD = 0.147) is reasonable but should be accompanied by an FDR or precision/recall estimate against a labelled reference set. The choice of 20 final clusters in functional clustering (because that gave a peak in silhouette score) and 14 for disease clustering should also be supported by stability analysis, e.g. resampling.

      The 75th percentile cutoff is, in our opinion, well justified and sufficient for our purposes. FDR and precision/recall estimates need a set of true and false positives. The DepMap correlation clusters are an intermediate step in our pipeline and do not necessarily hold the final complexes. How can intermediate reference DepMap clusters be constructed and defined as true or false positives? Even if we scored clusters that contain a known complex as true positives, how would false positives be defined? If clusters do not contain a known complex, that does not necessarily mean that these proteins don’t interact, just that they have not been shown to interact yet.

      We will run resampling to improve confidence in the choice of cluster number.
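      As a rough illustration of a cutoff of the form "mean of random + 3 SD" and of what a resampling check on it could look like, consider the sketch below. The synthetic null distribution, the sample SD, the number of bootstrap rounds and all function names are assumptions made for the example; how the manuscript draws its random correlations is not specified here.

```python
import random
import statistics


def null_cutoff(null_correlations, n_sd=3.0):
    """Cutoff = mean + n_sd * SD of correlations between random pairs."""
    mean = statistics.mean(null_correlations)
    sd = statistics.stdev(null_correlations)
    return mean + n_sd * sd


def bootstrap_cutoffs(null_correlations, n_boot=200, seed=0):
    """Recompute the cutoff on null sets resampled with replacement."""
    rng = random.Random(seed)
    n = len(null_correlations)
    return [
        null_cutoff(rng.choices(null_correlations, k=n)) for _ in range(n_boot)
    ]


# Synthetic stand-in for random-pair correlations (hypothetical values).
rng = random.Random(42)
null = [rng.gauss(0.0, 0.05) for _ in range(5000)]

cutoff = null_cutoff(null)
boot = sorted(bootstrap_cutoffs(null))
low, high = boot[4], boot[-5]  # roughly the central 95% of 200 resamples
print(f"cutoff={cutoff:.3f}, bootstrap range=({low:.3f}, {high:.3f})")
```

      A narrow bootstrap range would indicate that the cutoff does not depend strongly on which random pairs happened to be drawn; the same resampling idea extends to the choice of cluster number.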

      Internal numerical consistency. The bioRxiv preprint abstract refers to 354 high-confidence multi-protein complexes, while the body of the manuscript discusses 518 (224 dimers + 294 multimers). The relationship between these numbers should be stated explicitly. Likewise, the breakdown of "60 unique to DepCom" into 41 heterodimers + 19 multimeric should be reconcilable in the figures and tables. The number "9,764 unique seed proteins" should also be clarified to confirm it is the DepCom-internal seed set and not inherited from the Zhang et al. coverage or hu.MAP 2.0 (9,963 proteins). These are easy fixes but matter for a resource paper.

      BioRxiv preprint: The preprint that the reviewer read is an older version, which will be updated.

      The 9,764 unique seed proteins are taken from the Zhang et al. paper; they are the human proteins identified to confidently interact with at least one other human protein. We will make this clearer.

      Mander's overlap coefficient. The VSV-G(ts045)-KDELR retrograde-transport assay is well established and the experiment is clean, but MOC has been increasingly criticised in the colocalisation literature (Adler & Parmryd 2010, 2021). Best practice is to also report Manders' M1/M2 coefficients or Pearson's correlation alongside MOC. Adding these would be straightforward and would strengthen Fig 4B.

      We will improve co-localization measures where appropriate.
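      To make the measures named by the reviewer concrete, the sketch below computes Pearson's correlation, the Manders overlap coefficient (MOC) and the Manders M1/M2 coefficients on two toy intensity lists. The pixel values, the zero thresholds and the function names are hypothetical; real analyses work on background-corrected images with thresholds chosen per channel.

```python
from math import sqrt


def pearson(red, green):
    """Pearson correlation between two channels (mean-subtracted)."""
    n = len(red)
    mr, mg = sum(red) / n, sum(green) / n
    cov = sum((r - mr) * (g - mg) for r, g in zip(red, green))
    var_r = sum((r - mr) ** 2 for r in red)
    var_g = sum((g - mg) ** 2 for g in green)
    return cov / sqrt(var_r * var_g)


def manders_overlap(red, green):
    """MOC: like Pearson, but without subtracting the channel means."""
    num = sum(r * g for r, g in zip(red, green))
    return num / sqrt(sum(r * r for r in red) * sum(g * g for g in green))


def manders_m1_m2(red, green, thr_red=0, thr_green=0):
    """M1: fraction of red signal in green-positive pixels; M2: the converse."""
    m1 = sum(r for r, g in zip(red, green) if g > thr_green) / sum(red)
    m2 = sum(g for r, g in zip(red, green) if r > thr_red) / sum(green)
    return m1, m2


# Toy pixel intensities (hypothetical).
red = [0, 10, 20, 30]
green = [5, 0, 20, 30]
m1, m2 = manders_m1_m2(red, green)

# Adding a constant background to both channels leaves Pearson unchanged
# but pushes MOC towards 1.
red_bg = [r + 100 for r in red]
green_bg = [g + 100 for g in green]
print(pearson(red, green), pearson(red_bg, green_bg))
print(manders_overlap(red, green), manders_overlap(red_bg, green_bg))
print(m1, m2)
```

      Because MOC does not subtract the channel means, an uncorrected background inflates it, which is one reason to report it alongside Pearson's correlation or M1/M2 rather than alone.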

      Minor comments 1. Page 4: "candidate sets of potential multi-protein complex members". Pick one, they are either candidates or potential, not both.

      Will be addressed.

      Page 7: "Complex 294... mechanistic basis for CFL1 and WDR1 cooperation has only recently been described". Please update the reference list and language given how recent this is.

      Will be addressed.

      Page 7: JTB is described as "poorly characterised". This is a bit too strong. JTB has been studied in the context of TGF-β-induced mitochondrial regulation (Kanome et al. 2007), cytokinesis and chromosomal passenger complex association (Platica et al. 2011), the structural characterisation of its extracellular domain (Rousseau et al. 2012), and breast cancer biomarker work (Jayathirtha et al. 2022). A more accurate framing would be "incompletely characterised, with previously reported but functionally unresolved roles". The novelty here is the Golgi connection, which is genuine.

      We will rephrase.

      Page 8: the citation of Blomen et al. 2015 Science for "Golgi-related synthetic lethality" should be checked against the actual supplementary data of that paper to confirm the JTB attribution is correct.

      Will be checked.

      Figure 1: as in many omics papers, please think of us colourblind readers. The pink-green DepMap correlation scale will be hard for some of us.

      The color scheme in use, alongside others, was tested with two colleagues who have different variants of colour blindness and was judged to be the best compromise.

      Figure 5A and 5B: 21 and 14 colour-coded clusters respectively in a single UMAP is too much. Consider splitting into separate panels by broad theme or providing an interactive version only.

      We will focus on a subsection and provide the full interactive version on the homepage.

      Page 11: "manually evaluated the quality of outputs". By whom, blinded to which model produced which output? Methods are silent on this.

      As stated above, we will remove the LLM part.

      Some figures show "hairballs" with very limited informative content. Fig. 1B left panel and the AlphaBridge wheel plots in particular convey relatively little at the size shown.

      We will try to find a way to draw the AlphaBridge circular plots in better resolution; we do note, however, that the reviewer’s observation might be an artefact of the PDF file distributed to reviewers.

      The reference list looks a bit thin on prior systematic complexome efforts. BioPlex 3.0 (Huttlin et al. 2021 Cell), hu.MAP 2.0 (Drew et al. 2021 MSB) and CORUM 5.0 (Tsitsiridis et al. 2024 NAR) should all be cited and discussed.

      We will include the additional references where appropriate.

      The discussion section drifts into general comments about AI in science that don't add much. I would cut about a third of it and use the space for a more careful framing of the actual contribution.

      We will shorten the discussion section and phrase more carefully.

      General assessment Reviewer #3: The strongest aspect of this study is the JASS complex story. The IP-MS, the SYS1-KO rescue experiment, the VSV-G(ts045)-KDELR transport assay, and the orthogonal CRISPR screens with diphtheria and Pseudomonas exotoxins together build a convincing case for JTB as a regulator of Golgi-to-ER retrograde trafficking. This part of the paper is genuinely nice work and would stand on its own. The pipeline itself, combining structural predictions with functional dependency data and filtering with AlphaBridge, is sensible and timely. It is a reasonable demonstration of how confidence filtering should be done at this kind of scale. The main limitations concern the resource framing. After reading the manuscript several times I am still trying to identify the central novel contribution beyond the JASS validation. The interactome predictions are taken from Zhang et al., DepMap is public, AF3 is public, AlphaBridge is the authors' own previously published tool, and GO/STRING/Open Targets/dbPTM are all public. The manuscript is essentially an integrative pipeline plus a website plus one experimentally followed-up complex. The framing oversells what is genuinely new. The authors' own comparison (Fig. S3) shows 60 complexes "unique to DepCom" out of 518, of which 41 are heterodimers and only 19 are multimeric. Nineteen genuinely novel multi-protein complexes is still a contribution but it is a long way from the 354/518 that the abstract and discussion implicitly emphasise. The validation rate (one of 518) and the missing comparisons to CORUM 5.0 and hu.MAP 2.0 are the two issues that most need addressing.

      We will rephrase the relevant passages to adjust the framing. We would put forward that the main contribution of this manuscript is to present an integrative framework that combines data from orthogonal sources to highlight the potential of structure prediction models to serve as a discovery tool. The reviewer identifies correctly (albeit derogatorily) that this is “essentially” an integrative pipeline. But it is an integrative pipeline that combines genetics and computational structure predictions in a novel (to the best of our knowledge) way and surfaces interesting new biology. The biology of the JASS complex goes well beyond simple validation experiments, and we believe its discovery (based on our data) carries more value than the reviewer attributes to it.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Uckelmann and colleagues combine the recently published binary human interactome predictions from Zhang et al. (2025) with co-essentiality data from DepMap CRISPR screens to nominate sets of proteins that may form higher-order complexes. They cluster proteins around each "seed" using Leiden community detection on the DepMap correlation matrix, run AlphaFold3 on each candidate set, and apply AlphaBridge to retain only those interfaces predicted with confidence. After filtering they arrive at 518 complexes, of which 224 are dimers and 294 are larger assemblies (note: the abstract of the bioRxiv preprint refers to 354 high-confidence complexes, so the relationship between these numbers should be made explicit). They illustrate the resource with a few worked examples (PEX3/16/19/ACBD5, an actin-CFL1-WDR1-CAP1 assembly, a WNK1-NRBP1-TSC22D2 complex), and they experimentally validate one previously uncharacterised assembly that they name JASS (JTB-ARFRP1-SYS1), placing JTB at the Golgi and showing a role for it in Golgi-to-ER retrograde transport. They also provide a web portal (depcom.eu) with PTM mapping, GO/STRING-based functional clustering, Open Targets disease clustering, and LLM-generated executive summaries.

      Major comments:

      I am supportive of integrating orthogonal datasets in this kind of framework, but I am much less enthusiastic about how the analyses are carried through, and I think there are several issues that need addressing before this work is publishable.

      1. Co-essentiality is not the same as physical complex membership. This is the biggest conceptual concern. Genes in the same pathway are co-essential whether or not their products bind. The authors lean on the structural prediction step to filter this out, but that means the entire pipeline rests on AF3+AlphaBridge being correct about who interacts with whom. There is no independent benchmarking shown of how often AlphaBridge calls a true positive vs a false positive at the chosen 0.5 cutoff. Why 0.5? Where does that number come from? A short benchmarking section using known complexes (CORUM 5.0, hu.MAP 2.0, the PDB) would make the choice defensible. Right now it reads as arbitrary.
      2. Comparison to existing resources is incomplete. I can't help but wonder what was found here that would not have been possible by analysing existing resources. CORUM 5.0 (7,193 mammalian complexes, ~71% human-derived; Tsitsiridis et al. 2024 NAR), hu.MAP 2.0 (Drew et al. 2021, ~6,965 complexes from >15,000 MS experiments), BioPlex 3.0 (Huttlin et al. 2021, 118,162 interactions in HEK293T), and the Complex Portal already cover a large fraction of the human complexome. The authors compare to PDB, the original interactome paper, and Complex Portal, but they explicitly skip CORUM and hu.MAP, both of which are central reference resources in this space. Without including these, the "60 complexes unique to DepCom" number is not really meaningful. This needs to be redone properly.
      3. Validation rate is one out of 518. The JASS work is solid, but a single experimentally validated complex out of 518 gives the reader essentially no estimate of how often the rest of the predictions are correct. Even a smaller systematic effort, say IP-MS on five to ten predicted novel complexes in the same cell line, would do an enormous amount to establish how trustworthy the resource is. The authors already have the V5/IP-MS pipeline running. Right now the manuscript implicitly asks the reader to trust 517 predictions on the strength of one validation.
      4. The functional and disease clustering is potentially circular. GO terms and STRING associations are themselves derived in large part from the published literature on protein function, including text mining channels in STRING, much of which is downstream of complex membership. Of course complexes cluster into "DNA repair" and "vesicle trafficking" if you cluster on GO and STRING. The same applies to Open Targets, which integrates GWAS Catalog, ClinVar, literature mining, and other sources. The clustering is fine as a navigation aid for the website, but it is not, as currently presented, an independent validation of anything. I would tone the discussion down accordingly.
      5. AF3 limitations on this class of problem. AF3 itself acknowledges limitations (Abramson et al. 2024, including the December 2024 addendum), and subsequent benchmarking has flagged disordered regions, dynamic/large assemblies, and certain transmembrane systems as known weak points. The JASS complex is largely transmembrane, the WNK1-TSC22D2 example involves disorder-to-order transitions, and several flagship examples involve large multi-domain proteins. The authors acknowledge some of this in passing but should state explicitly which complexes were trimmed, how the trimming choices were made, and whether predictions were repeated with different seeds to check stability. Figure S4 is a good start, but for a resource paper a more systematic seed-stability analysis is warranted.
      6. Statistics are thin in several places. On the Fisher exact test for Golgi/ER enrichment in V5-JTB IP-MS (Supplemental Table 1), an odds ratio of 2.77 is modest, and there is no comparison to a matched control IP. Is this more than expected by chance against an appropriate background? The IP-MS volcano plots show many significant proteins, but how was the background controlled? On the LLM section, no quantitative evaluation is presented at all and the assessment is admitted to be subjective.
      7. The 4×ACTB speculation. The authors themselves note the AlphaBridge score declines from 0.9 (1×ACTB) to 0.78 (4×ACTB), yet they speculate about functional implications. This is exactly the kind of post-hoc rationalisation around weak evidence that should either be supported with experiment or removed. Either remove or qualify as speculative.
      8. The LLM-assisted analysis. I am genuinely uncomfortable with releasing 76 LLM-generated complex annotations as part of a published resource when the authors openly state these have "not been systematically validated". Putting these summaries on a website with the imprimatur of a peer-reviewed paper will lead to them being cited and reused. At minimum, the website needs prominent warnings on every page where an LLM summary appears, the prompts must be fully reproducible (not just downloadable as JSON), and a small validation table, say 10 complexes scored by a domain expert for accuracy of each claim, should be included as a supplemental figure. As it stands this section reads like an enthusiastic add-on that has not been thought through with the same care as the rest of the work.
      9. Cutoffs and cluster numbers need stability analysis. The cutoff for the 75th-percentile DepMap correlation (mean of random + 3 SD = 0.147) is reasonable but should be accompanied by an FDR or precision/recall estimate against a labelled reference set. The choice of 20 final clusters in functional clustering (because that gave a peak in silhouette score) and 14 for disease clustering should also be supported by stability analysis, e.g. resampling.
      10. Internal numerical consistency. The bioRxiv preprint abstract refers to 354 high-confidence multi-protein complexes, while the body of the manuscript discusses 518 (224 dimers + 294 multimers). The relationship between these numbers should be stated explicitly. Likewise, the breakdown of "60 unique to DepCom" into 41 heterodimers + 19 multimeric should be reconcilable in the figures and tables. The number "9,764 unique seed proteins" should also be clarified to confirm it is the DepCom-internal seed set and not inherited from the Zhang et al. coverage or hu.MAP 2.0 (9,963 proteins). These are easy fixes but matter for a resource paper.
      11. Mander's overlap coefficient. The VSV-G(ts045)-KDELR retrograde-transport assay is well established and the experiment is clean, but MOC has been increasingly criticised in the colocalisation literature (Adler & Parmryd 2010, 2021). Best practice is to also report Manders' M1/M2 coefficients or Pearson's correlation alongside MOC. Adding these would be straightforward and would strengthen Fig 4B.

      Minor comments

      1. Page 4: "candidate sets of potential multi-protein complex members". Pick one, they are either candidates or potential, not both.
      2. Page 7: "Complex 294... mechanistic basis for CFL1 and WDR1 cooperation has only recently been described". Please update the reference list and language given how recent this is.
      3. Page 7: JTB is described as "poorly characterised". This is a bit too strong. JTB has been studied in the context of TGF-β-induced mitochondrial regulation (Kanome et al. 2007), cytokinesis and chromosomal passenger complex association (Platica et al. 2011), the structural characterisation of its extracellular domain (Rousseau et al. 2012), and breast cancer biomarker work (Jayathirtha et al. 2022). A more accurate framing would be "incompletely characterised, with previously reported but functionally unresolved roles". The novelty here is the Golgi connection, which is genuine.
      4. Page 8: the citation of Blomen et al. 2015 Science for "Golgi-related synthetic lethality" should be checked against the actual supplementary data of that paper to confirm the JTB attribution is correct.
      5. Figure 1: as in many omics papers, please think of us colourblind readers. The pink-green DepMap correlation scale will be hard for some of us.
      6. Figure 5A and 5B: 21 and 14 colour-coded clusters respectively in a single UMAP is too much. Consider splitting into separate panels by broad theme or providing an interactive version only.
      7. Page 11: "manually evaluated the quality of outputs". By whom, blinded to which model produced which output? Methods are silent on this.
      8. Some figures show "hairballs" with very limited informative content. Fig. 1B left panel and the AlphaBridge wheel plots in particular convey relatively little at the size shown.
      9. The reference list looks a bit thin on prior systematic complexome efforts. BioPlex 3.0 (Huttlin et al. 2021 Cell), hu.MAP 2.0 (Drew et al. 2021 MSB) and CORUM 5.0 (Tsitsiridis et al. 2024 NAR) should all be cited and discussed.
      10. The discussion section drifts into general comments about AI in science that don't add much. I would cut about a third of it and use the space for a more careful framing of the actual contribution.

      Significance

      General assessment:

      The strongest aspect of this study is the JASS complex story. The IP-MS, the SYS1-KO rescue experiment, the VSV-G(ts045)-KDELR transport assay, and the orthogonal CRISPR screens with diphtheria and Pseudomonas exotoxins together build a convincing case for JTB as a regulator of Golgi-to-ER retrograde trafficking. This part of the paper is genuinely nice work and would stand on its own. The pipeline itself, combining structural predictions with functional dependency data and filtering with AlphaBridge, is sensible and timely. It is a reasonable demonstration of how confidence filtering should be done at this kind of scale.

      The main limitations concern the resource framing. After reading the manuscript several times I am still trying to identify the central novel contribution beyond the JASS validation. The interactome predictions are taken from Zhang et al., DepMap is public, AF3 is public, AlphaBridge is the authors' own previously published tool, and GO/STRING/Open Targets/dbPTM are all public. The manuscript is essentially an integrative pipeline plus a website plus one experimentally followed-up complex. The framing oversells what is genuinely new. The authors' own comparison (Fig. S3) shows 60 complexes "unique to DepCom" out of 518, of which 41 are heterodimers and only 19 are multimeric. Nineteen genuinely novel multi-protein complexes is still a contribution but it is a long way from the 354/518 that the abstract and discussion implicitly emphasise. The validation rate (one of 518) and the missing comparisons to CORUM 5.0 and hu.MAP 2.0 are the two issues that most need addressing.

      Advance:

      The advance is incremental rather than conceptual. The idea of intersecting co-essentiality with structural predictions is sensible but not new in spirit, and similar hybrid approaches are now becoming more common in this space (see e.g. EndoMAP.v1, Gonzalez-Lozano et al. 2025 Nature, which the authors do cite). What is new here is the specific implementation, the AlphaBridge filtering layer, and the JASS finding. The technical advance lies in the AlphaBridge filtering step on top of AF3 at a reasonably large scale. The biological advance is the JASS complex and the demonstration that JTB plays a role in Golgi-to-ER retrograde transport, which is genuinely new and well supported.

      Audience:

      This work will be of interest mainly to specialised audiences in structural proteomics, computational biology of protein complexes, and the protein-protein interaction community. The JASS finding will be of interest to a broader readership in cell biology, particularly those working on Golgi trafficking, ARF/ARL family GTPases, and retrograde transport. The web resource will likely find users among researchers studying specific complexes who want a quick structural hypothesis. I do not think the work, in its current form, will reach broad audiences in the way the authors hope, but a more sober framing would actually help it land better in the specialist community where it belongs.

      My expertise:

      Mass spectrometry-based proteomics, protein-protein interaction mapping, systems biology, structural biology. I have working knowledge but not deep expertise in: structural prediction confidence metrics (AF3, AlphaBridge implementation details), DepMap CRISPR co-essentiality analysis, and Golgi cell biology. I would defer to a computational structural biology or cell biology specialist on the AF3 confidence interpretation details and on the cell biology specifics of the JASS validation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The study presents DepCom as a broad resource for discovering human multi-protein complexes by integrating predicted binary interactions, DepMap co-dependency, AF3 modelling, and AlphaBridge filtering. Overall, the computational strategy is well motivated, and the experimental validation of the JTB/SYS1/ARFRP1 complex provides a compelling example of how the resource can generate testable biological hypotheses.

      Major comment: scope and interpretation of DepMap-derived functional evidence

      The manuscript could benefit from more clearly defining the scope of the functional evidence used to nominate complexes. The central co-dependency signal is derived from DepMap 24Q2 CRISPR gene-effect profiles, which are primarily cancer cell-line fitness/proliferation data. This is an important limitation because the resulting correlations may preferentially capture complexes or pathways that influence viability in proliferating cancer cells, while missing complexes active in differentiated, tissue-specific, stimulus-dependent, or non-proliferative contexts. Conversely, some correlations may reflect shared cancer-lineage or fitness dependencies rather than direct participation in a stable complex. The authors are appropriately cautious in stating that DepCom is not a complete inventory of human protein complexes, but the title, framing, and resource description could still be read as implying a more general catalogue of functional protein complexes. The authors might consider adding a clearer introduction to DepMap and explicitly discussing how the cancer-cell-line origin of the data affects interpretation of the 518 predicted complexes. This could be addressed without new experiments, for example by adding text early in the Results section explaining what the CRISPR gene-effect scores measure, and by expanding the Discussion to clarify that DepCom represents structurally plausible complexes prioritized by co-dependency across cancer cell lines, rather than an unbiased or context-independent map of human protein complexes.

      The selection of highlighted examples would also benefit from clearer justification. The peroxisome, actin, WNK/TSC22D2, and Golgi/JASS examples are biologically interesting, but the rationale for choosing them is not always explicit. Were they selected because they were novel, high-confidence, disease-associated, experimentally tractable, or representative of different resource categories? Briefly stating the selection criteria would help readers understand whether these examples are illustrative case studies or representative outcomes of the pipeline.

      Minor comments

      1. Clarify post-clustering removal of large/problematic protein families and complexes.

      In the Methods, the authors state that "clusters of histones and keratin clusters, as well as the mito-ribosome, complexes of the electron transport chain and the mediator complex" were removed because of their large sizes. This filtering step would benefit from additional detail. Please specify the criteria used to define these removed clusters, how many clusters/proteins were removed at this stage, and whether removal was based only on size or also on biological/manual curation. It would also be helpful to explain why these proteins or clusters were removed after clustering rather than excluded before graph construction and clustering, since highly connected or compositionally biased protein families could potentially influence neighboring cluster assignments. If available, a brief robustness check showing that pre-removal of these proteins gives similar candidate complexes would strengthen confidence in the clustering procedure.

      2. Clarify the rationale for excluding complexes larger than 5000 residues.

      The 5000-residue cutoff is understandable for AF3 computational cost, but the manuscript should briefly state how many candidate complexes were excluded by this cutoff and whether this preferentially removes known large assemblies. This would help readers understand the scope of complexes that DepCom is expected to miss.

      3. Improve wording in the CAP1/CFL1/WDR1/ACTB example.

      The sentence "Additionally, CAP1 works in concert with CFL1 to accelerate depolymerisation, though if a four-protein complex consisting of actin, WDR1, CAP1 and CFL1 is relevant is not clear" is difficult to parse. Possible revision might be something like: "Additionally, CAP1 works in concert with CFL1 to accelerate depolymerisation, although it remains unclear whether actin, WDR1, CAP1 and CFL1 form a stable four-protein complex in cells." This more clearly separates known biology from the speculative interpretation of the DepCom prediction.

      4. Improve reproducibility details for AF3 predictions.

      The Methods state that predictions were run using a local AF3 installation, but reproducibility would be improved by reporting relevant AF3 settings, number of seeds/models per complex, whether templates were used, how disordered regions were handled, and whether predictions were repeated for all complexes or only selected examples. This is especially important because the manuscript notes that multiple predictions can yield different subunit arrangements.

      Significance

      General assessment:

      This study presents a timely and useful resource for prioritizing candidate human protein complexes by integrating predicted binary protein-protein interactions, DepMap co-dependency profiles, AlphaFold3 structure prediction, and AlphaBridge confidence filtering. A major strength is the combination of orthogonal evidence types: physical interaction predictions define a tractable search space, functional co-dependency helps identify coherent protein groups, and structure-confidence metrics provide an additional filter on the resulting candidates. The experimental validation of the JTB/SYS1/ARFRP1 complex is also a strong aspect of the study, as it demonstrates that the resource can generate biologically meaningful and experimentally testable hypotheses.

      The main limitation is that the resource should be interpreted as a prioritized, hypothesis-generating dataset rather than a comprehensive or context-independent catalogue of human protein complexes. As noted above, the DepMap-derived signal reflects cancer cell-line fitness/proliferation dependencies, and the final complex set is also shaped by the starting interactome, filtering choices, and computational constraints on complex size. These limitations do not undermine the utility of the resource, but they should be clearly framed for readers.

      One aspect that could further increase the impact and usability of the study is the DepCom web resource. The searchable table of complexes is already useful, particularly for users who want to query by gene or protein name. However, the website also presents functional and disease-based clustering, and many users may want to search or filter complexes by biological process, GO term, pathway, disease association, or disease cluster. Adding GO-term and disease-association fields to the main table, and allowing users to search/filter by these annotations, would make the resource more discoverable and useful to researchers approaching the dataset from a biological process or disease area rather than from a specific gene.

      Advance:

      The advance is primarily technical and resource-oriented, with an accompanying functional biological demonstration. The study helps fill a gap between large-scale binary interaction prediction and the more difficult problem of nominating higher-order assemblies. By using functional dependency profiles to prioritize multi-protein combinations before structure prediction, the authors reduce an otherwise intractable search space and generate a set of structurally plausible candidate complexes. The JASS complex and the proposed role of JTB in Golgi-to-ER retrograde trafficking provide a compelling example of biological discovery enabled by the pipeline.

      The broader DepCom resource, including predicted complex structures, AlphaBridge interface-confidence information, PTM-interface mapping, functional/disease clustering, and downloadable LLM prompts, should provide useful starting points for follow-up studies. These outputs are best viewed as hypothesis-generating rather than definitive biological annotation, but they represent a valuable extension of existing protein-interaction and structure-prediction resources.

      Audience:

      The study will likely interest a broad basic-research audience, especially researchers in protein complex biology, structural biology, functional genomics, systems biology, cancer dependency mapping, cell biology, and computational biology. It may also be useful to investigators studying specific pathways or poorly characterized proteins, since the resource provides candidate interaction partners and structural hypotheses that can guide experiments. The translational relevance is more indirect, mainly through disease-association clustering and potential target-discovery applications, but the immediate audience is likely to be basic and computational researchers.

      My expertise is in computational protein databases, protein domain classification, structural/evolutionary analysis of proteins, and functional annotation resources, including experience with the ECOD database for evolutionary classification of protein domains. I am less able to evaluate the fine details of the experimental cell-biology assays beyond their general interpretation and reporting.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Characterising protein complexes is a fundamental goal in modern molecular cell biology. Here, Uckelmann and colleagues have presented a solution to part of this problem. By combining functional clustering with AlphaFold modelling, they present a high-throughput bioinformatic solution. The paper and figures are exceptionally clear and well presented. The conclusions are reasonable, and the data interesting. I am a cell biologist with expertise in the molecular machinery of trafficking, so the focus of my review will be on the identification of a new complex that is proposed to have a role in retrograde trafficking. On the whole I find this an interesting and convincing finding. However, I have some comments and questions that I hope may help the authors. I will naturally focus my comments on the cell biology.

      1. The authors do not co-IP ARF1. This does not surprise me, as small GTPases often hydrolyse their GTP during lysis.
      2. There have been a number of ARF1 BioID screens done. Have the authors checked whether their complex has turned up in these?
      3. I am a bit confused by some of the interpretation of the KO and the loss of JTB staining. The authors interpret: "The SYS1 acts as a Golgi recruitment factor for both ARFRP1 and JTB". This has been published for ARFRP1, which is a cytosolic protein, so that makes sense. However, JTB is not cytosolic but a membrane protein, so it cannot be "recruited". Maybe it is retained in the Golgi by this interaction, but if that is the case you would still expect signal on another organelle or the plasma membrane (and we see from the western blot that it is not degraded in the lysosome). I am confused by the authors' model here.
      4. The authors validate their JTB antibody and confirm that SYS1 levels are not reduced in the JTB KO; this is very clear (albeit unquantified). What I do not see validated is the SYS1 KO. I think this is quite important.
      5. The colocalisation in panel 3D is weak and unclear to me. It is not quantified. It is not clear if there have been 3 repeats.
      6. The imaging in Figure 3 is not clear in places, and it stands out in a very clear manuscript. I cannot see the JTB in panel F. There are no scale bars. The dynamic range of the image is not utilised. I do not see the JTB stain in either of the SYS1 KOs, I do not see the SYS1-FLAG staining in the complement, and it is not quantified at all. It may all seem trivial, but (to me) this is an absolutely critical piece of biological data to support the informatics.
      7. I am a bit unconvinced by the interpretation of this being a retrograde trafficking complex, for two key reasons. First, the VSV-G assay is anterograde (yet, unusually, the authors interpret the result as a "severe defect in retrograde transport"). Second, even if the complex only had an effect in the retrograde direction, I would still remain a little open-minded about it, as you can easily mistake trafficking of a protein in one direction for another if an unknown protein (a SNARE, for example) has defective trafficking.

      Significance

      Characterising protein complexes is a fundamental goal in modern molecular cell biology. Here, Uckelmann and colleagues have presented a solution to part of this problem. By combining functional clustering with AlphaFold modelling, they present a high-throughput bioinformatic solution. The paper and figures are exceptionally clear and well presented. The conclusions are reasonable, and the data interesting. I am a cell biologist with expertise in the molecular machinery of trafficking, so the focus of my review will be on the identification of a new complex that is proposed to have a role in retrograde trafficking. On the whole I find this an interesting and convincing finding.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Summary: This manuscript presents a high-throughput fluorescence recovery after photobleaching (HiT-FRAP) platform to screen genes affecting the dynamics of the nucleolar scaffold nucleophosmin (NPM1). The platform included the siRNA-based screening of 65 RNA helicases, 9 phylogenetically related helicase pairs, and 290 ribosomal proteins along with selected assembly factors. These factors were classified as those accelerating or decelerating NPM1 dynamics based on the t1/2 measurements. Combined with nucleolar morphological changes, the authors identified that depletion of early-stage (A-F) and later-stage (G-H) LSU assembly factors resulted in different nucleolar phenotypes, suggesting that pre-ribosome assembly can impact nucleolar morphology. Further exploration of the potential mechanism suggested that NPM1's intrinsically disordered region (IDR) contributes to nucleolar organization and dynamics.

      Together, this well-designed study uncovered that the ribosome assembly, both the early and late ribosomal precursors can influence biophysical properties of the nucleolus. Below please find our concerns for the authors to consider to strengthen the major conclusions.

      Major comments:

      The main conclusion that NPM1's biophysical states directly impact its interaction strength with ribosome intermediates (and thereby nucleolar dynamics) should be further strengthened as listed below:

      1). Given the nucleolus's complexity, an additional GC factor, or/and one more marker of other nucleolar regions, should be examined to substantiate the proposed impact of LSU-associated factors on nucleolar morphology (Figures 3, 4).

      We thank the reviewer for this very important point. We have now included representative images for representative hits in major phenotypic clusters co-stained for SURF6, another GC marker, which shows similar localization patterns as NPM1 (Fig. S4B). For other nucleolar subcompartments, we have included images obtained from a cell line harboring endogenously tagged FBL-mNeonGreen (a marker for the DFC) for representative hits (Fig. S4A). We see a similar overall distribution of the DFC within the GC (i.e. DFCs distribute to fill the area of the disrupted GC), confirming our screen results. We look forward to further examining the changes in nucleolar subcompartment architecture in future work.

      As additional support, we note that we probed NOG2, NOP53, and NOP2 in our IF results, all of which are GC-localized factors. We see a very similar distribution for these factors in our hits as for NPM1 (see Fig. S8D). In addition, FISH data for pre-rRNA precursors show similar morphological patterns as NPM1, further confirming our results (Fig. S7). We have noted this in text and have also included representative images in supplement.

      2). Additional experiments are needed to support the proposed model that ribosomal intermediates, especially the pre-LSU complexes could determine nucleolar biophysical properties through the interaction with NPM1. Their direct interaction by biochemical assays should be provided. Also, when analyzing the interaction with other nucleolar factors, the authors should provide data that show NPM1 mutant expression levels were comparable to endogenous levels (Figures 4, 6).

      We agree that directly probing NPM1's interactions with LSU precursors is critical to supporting our model, and we have addressed this through several complementary biochemical approaches. First, we performed immunoprecipitation of tagged NPM1 (NPM1-mScarlet, IP-ed using RFP-trap agarose) and assessed interaction with pre-LSU rRNA transcripts via Northern blot (Fig. 5D). We find that NPM1 interacts strongly with the 32S pre-rRNA. Second, we performed sucrose gradient sedimentation and find that NPM1 preferentially co-migrates with pre-60S complexes (Fig. 5B). Together with previous reports of NPM1-pre-LSU interactions, these data provide direct biochemical support for the proposed interaction.

      To test whether interaction strength with pre-LSUs could regulate NPM1 dynamics, we next asked whether our NPM1 mutants that differ in their dynamics in turn interact differentially with pre-LSU complexes. Using co-IP Northern blot for ITS2 and sucrose co-sedimentation, we find that NPM1 mA3 pulls down more 32S and co-sediments more robustly with pre-60S complexes, while NPM1 mB2 shows reduced association (Fig. 5D, E; Fig. S10F, G). These data support that the strength of the NPM1-pre-LSU interaction is a determinant of NPM1 exchange dynamics, and, by extension, of nucleolar biophysical properties.

      Exogenous mutant NPM1 is expressed at approximately 10% of endogenous levels (Fig. S10A). We address this in two ways. First, all interaction comparisons are made between WT and mutant exogenous constructs, not against endogenous NPM1, controlling for expression level differences. Second, we observe similar effects on interactions both in the presence of endogenous NPM1 and in null backgrounds, indicating that the differences we detect reflect NPM1 mutation, not expression level.

      3). Northern Blotting should be done to dissect which pre-rRNA intermediates interact with NPM1 and contribute to the nucleolar dynamics (Figures 4B, D, F). These additional experiments should be feasible within a reasonable timeframe.

      We agree with the reviewer and have performed northern blots for major hits in our different nucleolar phenotypes, and results reinforce what we see by FISH and qPCR (Fig. S6B). Briefly, depletion of the “RNA Exosome” hit SKIV2L2 results in smearing of pre-rRNA precursors that harbor both ITS1 and ITS2 and an accumulation of the 12S, in keeping with its role in end-processing of these transcripts. For “Other” hit PHF5A, we see an enrichment for 47S/45S/41S species, consistent with an early precursor stall. Notably, we do not see this phenotype for depletion of “Other” hit CNOT1, which suggests multiple processing defects may lead to a similar nucleolar phenotype. Treatment with PolI inhibitor CX5461 shows a depletion in ITS1 containing transcripts, and minimal impact on ITS2-containing transcripts, similar to FISH results. Lastly, depletion of “LSU” hits NOP53 and RPF2 leads to accumulation of the 32S and 12S species, in keeping with accumulation of abortive pre-LSUs.

      In addition, the authors should provide the code and the hardware control procedures for HiT-FRAP to ensure reproducibility.

      We thank the reviewer for this thoughtful suggestion. We have made our software available on GitHub (https://github.com/jess-sheu/colony_blob_bleacher) and archived on Zenodo (https://doi.org/10.5281/zenodo.20275447).

      According to the authors' statement, all the experiments are adequately replicated, and the statistical analysis is adequate.

      Minor comments:

      To enhance clarity and focus, consider the following:

      1). Simplifying the HiT-FRAP screening section (Fig. 1-3) would emphasize the significant findings.

      We have simplified text throughout to better highlight significant findings.

      2). Expanding analysis and experimental validation could help to solidify the interdependency between rRNA / ribosome precursors and the NPM1-driven nucleolar dynamics (Fig. 4-5). Indeed, additional experiments suggested above in the major concerns should be supplemented here.

      We have performed additional experiments to demonstrate the interdependency between ribosomal precursors and their interaction with NPM1 in shaping nucleolar dynamics, as described above.

      Reviewer #1 (Significance (Required)):

      This work has established a powerful toolkit, named HiT-FRAP, to identify factors involved in the organization and regulation of the membrane-less nucleolus, which will be useful for understanding the complexity not only of the nucleolus, but likely of other condensates in cells in the future. Using this platform and with the Granular Component (GC)-localized NPM1 as an indicator of nucleolar morphology, the authors found that the biophysical properties of the nucleolus are sensitive to the ordered assembly of ribosomes, in particular the LSU maturation steps at the GC. This finding is important as it suggests the interdependency between the dynamic rRNA processing and the functional assembly and morphology of the nucleolus. Further studies are warranted to analyze the dynamics of other nucleolar constituents, particularly those localized at other sub-nucleolar regions, to fully depict how exactly the nucleolar function is coordinated with its biophysical properties.

      Reviewer #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: The nucleolus is a multiphase biomolecular condensate whose primary function is ribosome biogenesis. There is mounting evidence that the material state of condensates is important for their function. Here the authors have probed how the material properties of the nucleolus respond to inhibition of ribosome biogenesis.

      They have assessed nucleolar dynamics (molecular diffusivity) of a nucleolar protein, NPM1, by fluorescence recovery after photobleaching (FRAP). NPM1 is a protein that labels the periphery of the nucleolus (the so-called granular component, GC). (The nucleolus has 3 main subcompartments: the internal fibrillar centers, the middle dense fibrillar components, and the GC).

      One of the main findings of the work is that inhibition of late steps of ribosome biogenesis increases fluidity (faster recovery of NPM1), while inhibition of earlier steps (and inhibition of mRNA processing, but see below) instead increases rigidity (slower recovery). They then attempt to correlate what is interpreted as biophysical changes to pre-ribosomal intermediates and interaction with NPM1.

      Practically, the authors have produced reporter cell lines (HeLa) stably expressing (CRISPR engineering) mono- or bi-allelic fluorescent versions of NPM1; they have developed a powerful platform to conduct high-throughput FRAP (this is really good); they have calibrated their system, initially with basic perturbations (ATP depletion, proteasome inhibition, etc), and then they focused on a family of trans-acting factors: the helicases, investigating systematically their effect on NPM1 recovery. They then extended their initial candidate-based screen to additional factors (using STRING interactions). This is nice and useful. Later in the work, they include in their analysis additional (morphological) features of nucleoli to functionally cluster their hits, as was done earlier by others in similar works. Finally, using recently published structural data (CryoEM), they attempt to correlate groups in the cluster with particular pre-ribosomal species. This part is less advanced and weaker than the initial part of the paper (screens and FRAP measurements).

      Major comments:

      -A major comment concerns the compositional analysis of precursor intermediates, which should be better defined. The stage assignment of particles is not quite as good as the screening part of the paper. At the RNA level, the authors provided FISH, as histograms of quantifications (see e.g. Fig 4D, and Fig SS6E). It would be necessary to show images, and to perform biochemistry. At the protein level, the authors provide immunostaining, but it does not really prove the detected protein is part of a particle.

      We thank the reviewer for this important critique. We have taken several steps to address both the stage assignment and biochemical characterization concerns.

      Regarding stage assignment: We have consolidated our LSU phenotypic clusters (previously LSU1 and LSU2) into a single "late pre-LSU" group based on their shared features and proximity in PCA space. We want to be clear that this consolidation is intended to more accurately represent what our data can support: the screen reliably identifies factors whose perturbation produces a coherent late LSU assembly phenotype, and we do not wish to overstate the resolution of state assignment from imaging data alone. Sub-cluster distinctions are retained in supplementary materials for transparency. We have revised language throughout to reflect this framing.

      Regarding biochemical characterization of intermediates: We have now performed Northern blots on strong hits within our phenotypic groups (Fig. S6B). For LSU cluster hits, we observe accumulation of the 32S and 12S species, indicating a stall in ITS2 processing, which is directly consistent with our ITS2 FISH results and confirms that the RNA-level phenotypes reflect genuine pre-rRNA processing defects rather than indirect effects. For "Other" group factor PHF5A, we observe 47/45/41S accumulation consistent with an early processing stall. We have also added representative FISH images to Fig. S7 to allow direct visual assessment of RNA-level phenotypes.

      Regarding protein-level particle assignment: We agree that IF alone cannot establish that assembly factors are incorporated into discrete pre-ribosomal particles rather than existing as free factors. To more directly test whether the LSU cluster phenotypes reflect accumulation of genuine pre-ribosomal particles rather than mislocalized free factors we used NOP53 knockdown as a representative LSU cluster perturbation and, similar to RPF2 knockdown, see an accumulation of ITS2 and NOG2 in the nucleolus by FISH and IF (Fig. 4E). We then performed nuclear sucrose gradient fractionation and found that NOG2 co-migrates with the LSU peak and does not enrich in soluble fractions (Fig. 4F-H), supporting the interpretation that late pre-LSU particles accumulate in the nucleolus upon disruption of LSU cluster genes. Importantly, we also observe a strong decrease in co-sedimentation of NPM1 with the LSU peak upon depletion of NOP53 (Fig. 4G,H). This result, together with the Northern blot and FISH data, provides biochemical and cell biological evidence that the nucleolar phenotypes we identified by HiT-FRAP are associated with accumulation of late LSU assembly intermediates.

      -Another concern is whether NPM1, a GC component located at the periphery of the condensate and a late assembly factor, is an appropriate marker for assessing the effects of all inhibitions (both early and late) on the nucleolar material state.

      Would factors involved in earlier ribosomal assembly steps, and localized more internally, not be better tools to evaluate changes in material state caused by alterations in early steps?

      We appreciate this important point and agree that NPM1 reports primarily on GC dynamics. However, we would argue this is a feature rather than a limitation for two reasons.

      First, the GC is the terminal assembly compartment through which pre-ribosomal particles must transit before nuclear export. Perturbations to earlier assembly steps, including FC/DFC-localized processes, likely propagate into GC dynamics, because stalled or aberrant particles accumulate in or are excluded from the GC. NPM1 FRAP thus functions as a downstream integrator of upstream assembly status, not only a reporter of GC-proximal events. This interpretation is consistent with our observation that depletion of early factors (and, therefore, depletion of downstream intermediates) does produce detectable NPM1 phenotypes in our screen. Second, the pattern of our screen results supports rather than undermines this logic: the striking enrichment of late LSU factors and near-complete absence of SSU hits is precisely what one would predict if NPM1 reports selectively on pre-LSU flux through the GC. A sensor that reported indiscriminately on all condensate perturbations would not produce this specificity.

      We do acknowledge, however, that NPM1 cannot report on material state changes that are compartmentally confined to the FC or DFC and do not propagate outward. Extending this approach to internal markers remains an important future direction. To clarify the scope of our readout, we have revised the text to specify that we are monitoring GC dynamics, and we have added representative images of fibrillarin localization in Supplemental Figure S4A to illustrate the relationship between DFC and GC compartments in our experimental system.

      -About the engineered cell lines used for screening by FRAP (Fig 1S): NPM1-mNeonGreen (biallelic with reduced expression of NPM1) and mScarlet (heterozygous): There is a need to characterize pre-rRNA processing in both cell lines to show they are not affected for ribosome biogenesis. This is important information since the entire work is based on these cells.

      We have performed a Northern blot across the cell lines used in this paper as compared to their parent cell line and see no substantial difference in rRNA processing. We have included this data as Supplemental Figure 1D.

      The screening cells are HeLa cells implying they are not physiologically regulated for p53. Nucleolar surveillance is a key regulatory surveillance loop triggered by ribosome biogenesis inhibitions leading to p53 stabilisation. How could this affect this work? Should key findings be confirmed in diploid p53 positive cells?

      We acknowledge that our choice of HeLa cells limits our ability to distinguish cell-type-specific responses from more universal mechanisms and have added an explicit discussion of cell choice in the main text. To begin exploring the impact of p53, we performed gene depletions for representative hits across phenotypic clusters in untransformed, diploid hTERT-RPE cells that were lentivirally-transduced with NPM1-mScarlet and assessed nucleolar morphological phenotypes at smaller scale (Figure S6C, Supplementary Text). At baseline, RPE cells show more and smaller nucleoli than HeLa cells, which may reflect a difference in basal nucleolar assembly and, potentially, ribosome biogenesis, in keeping with previous observations that transformed cells rely more heavily on ribosome biogenesis than non-transformed.

      Upon gene depletion, we found that hits from the "RNA exosome" cluster show a different phenotype than seen in HeLa cells, where we observe less size difference and a marked decrease in eccentricity, which may reflect a p53- or cell-type-specific response. Depletion of the “Other” cluster gene PHF5A results in a milder though qualitatively similar phenotype as seen in HeLa cells, with nucleolar rounding and an increase in NPM1 intensity. Depletion of “LSU”-associated hits in RPE cells very robustly replicated most of the nucleolar features we observed in HeLa, which suggests that these are likely generalizable responses to LSU disruption. We have included this data in Supplementary Figure 5C. We note that we did not directly test whether p53 is stabilized upon depletion of our hits in RPE cells, and whether p53 activation feeds back on condensate dynamics remains an open area for future work. However, the concordance of LSU-associated phenotypes across HeLa and RPE cells, which differ substantially in p53 status, transformation state, and baseline nucleolar architecture, supports the generalizability of our core findings.

      -About factor depletion, e.g. helicases, it's important to consider direct versus indirect effects on ribosome biogenesis, the timeline of depletion should be well described in the paper. Apparently, most factors, including the helicases were depleted for 72 hours, this is very long considering most of them play important roles in essential processes for cell homeostasis implying severely reduced growth at the time of capture (and the possibility of indirect effects).

      We thank the reviewer for this important point. To directly address depletion timeline, we performed time courses for strong hits and monitored nucleolar morphology at 24 and 48 hour intervals (now included in Fig. S3D). Morphological changes begin to emerge by 48 hours across phenotypic classes; for the RPF2 LSU phenotype specifically, nucleolar expansion and decreased NPM1 intensity are detectable as early as 24 hours, inconsistent with a general stress response and more consistent with a direct downstream consequence of LSU assembly disruption. Moreover, despite all targeted genes being essential for homeostasis, phenotypic profiles are cluster-specific and associated with multiple genes of coherent function, which suggests that observed impacts are downstream of specific pathway inhibition rather than a general cellular stress response.

      -Another cause of concern is that some perturbations (factor depletion) affect very deeply nucleolar structure/morphology (eg uL2 depletion shown in Fig 2C); how easy/difficult was it to control/make sure that a correct area was obliterated in the FRAP experiment using the (remarkable) data-adaptive approach. For cases where the nucleolus was deeply affected how did you check that a significant nucleolar area had been selected for analysis? It would be good to describe this in the text.

      We manually ensured our segmentation protocol accurately captured nucleoli, defined by higher intensity regions of NPM1, for all depletion cases during screen development. Because segmentation determines where the bleach point is placed, most bleaches, even in disrupted cases, targeted the nucleolar interior. To address this point, we have included figures in the supplement (Fig. S4D) that show bleaching time courses for select highly disrupted hits uL2 and eL39.

      • Fig 6C, interaction of NPM1 constructs with pre-ribosomes: the authors have tested interaction with select nucleolar proteins (NOP53, NOP2, NOG2, and uL2), which is not the same as preribosomes.

      It would be important to see the interactions with precursors (Fig S9C, now histograms). Please show the actual data. This was tested by qPCR; please show classical northern blots, as RT-qPCR has shown its limits in such applications.

      Indeed, we cannot distinguish between assembly factors/ribosomal proteins that are associated with NPM1 in their latent, non-pre-LSU bound state versus those that are part of a developing ribosome. We have addressed this gap in several ways. Firstly, we have performed IP-northern blots for tagged NPM1-mutants, as suggested, and find that the mA3 mutant co-IPs more 32S than WT, while the mB2 binds less (Fig. 5D). We also performed sucrose gradient analysis of pre-ribosomal complexes and find that the mA3 mutant co-sediments more with the pre-60S peak, while mB2 co-sediments less (Fig. 5E). These findings are consistent with in vitro findings in the field that B2 mediates interactions with rRNA, while A3 occludes B2 through intramolecular interactions. Collectively with our co-IP western data, we believe the evidence strongly suggests that NPM1 mutants interact differentially with pre-LSU complexes.

      -Minor comments:

      -The effects of mRNA processing disruption on nucleolar dynamics could be (is most likely) very indirect (the so-called "slow hits"). The respective time course of inhibitions is important to describe.

      We direct the reviewer to our response above for other phenotypes. For our "slow hit" / "Other" cluster, we also used the splicing inhibitor PladB as an orthogonal approach. Strikingly, nucleolar rounding was detectable within less than one hour of treatment, well before any general cell health effects would be expected, while dynamics changes required approximately 24 hours, suggesting that morphological and biophysical responses are kinetically separable and that the early morphological response is directly downstream of splicing inhibition. We have included a representative rounding timecourse in Fig. S8E.

      Reviewer #2 (Significance (Required)):

      -General assessment: strengths and limitations

      Strengths:

      -The automated platform for high throughput FRAP

      -The authors develop a potentially interesting model where they attempt to connect rigidification/fluidity of a condensate to its function in assembly of large ribonucleoprotein complexes.

      -The manuscript reads very well; it has been prepared with great care (figures). Some complicated concepts are explained very well (Introduction/Discussion).

      Limitations:

      -particle stage assignment based on FISH and immunostaining only. The authors have not demonstrated that the LSU1 cluster = state F and LSU2 cluster = states G/H

      -Advance: -Technological advance, high throughput FRAP, a powerful platform to interrogate macromolecular diffusivity.

      -Several nucleolar screens have been conducted in the past (but at steady-state, not using FRAP), in these works textural and morphological features were used together with dimensionality reduction techniques to define functional clusters of genes that impact the homeostasis of the nucleolus. Often these references are cited but it could be useful to expand a bit on some of the earlier findings to bring the new ones in perspective. Some clusters (typically, the transcriptional cluster that disrupts the nucleolus; and the late binder ribosomal proteins) have been well identified before.

      -Audience: Cell biologists, scientists involved in ribosome biogenesis research, scientists with an interest in helicases. The growing condensate community.

      -Describe your expertise: ribosome biogenesis, structure-function relationships in the nucleolus, technological development in microscopy.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors use high throughput FRAP (HiT-FRAP) in arrayed genetic screens of HeLa cells expressing nucleophosmin (NPM1)-fluorescent protein variants to monitor the biophysical properties of the nucleolus in response to genetic perturbations. HiT-FRAP uses a data adaptive imaging strategy to automatically identify and photobleach fluorescently labeled organelles in living cells and acquire movies for FRAP. Quantitative analysis of FRAP curves includes t1/2 and mobile fraction. NPM1 was monitored since it is an important nucleolar scaffolding protein that is thought to interact with many pre-ribosome intermediates.

      The authors depleted 65 RNA helicases (+ 9 pairs) with siRNA and found that 15 of them either increased or decreased t1/2. Knockdowns were confirmed with western blotting. RNA helicase knockdowns with faster NPM1 diffusion were associated with large subunit (LSU) assembly. Most RNA helicase knockdowns with slower NPM1 diffusion were associated with early rRNA processing via the small subunit (SSU) intermediate. The authors screened an additional 290 gene depletions of many ribosomal proteins and assembly factors. With this expanded set of perturbations, they categorized nucleoli based on four morphological features in addition to t1/2 and mobile fraction. Using principal component analysis (PCA), the authors identified clusters of genes with similar effects on NPM1 dynamics and nucleolar morphology. From this secondary screen, the majority exhibited slower NPM1 dynamics. The knockdowns associated with faster NPM1 dynamics were associated with LSU assembly, similar to the helicase experiments. The authors further analyzed several mutants of NPM1 to elucidate the likely interactions between the scaffolding protein and ribosome biogenesis factors. The accumulation of early ribosomal intermediates was associated with decreases in NPM1 dynamics, and accumulation of late intermediates led to increased NPM1 dynamics. The findings established a link between the biophysical properties of the nucleolus and the stages of ribosome biogenesis.

      Major comments:

      • The claims are supported by experimentation.
      • No additional experiments requested.
      • The experiments are adequately replicated, and statistical analysis is sufficient.
      • Methods are very detailed, which should facilitate reproducibility.

      Minor comments:

      • Prior studies are referenced appropriately.

      • A bit more coverage of background on the nucleolar scaffolding protein, nucleophosmin (NPM1) would be helpful in the introduction, perhaps in favor of the details on ribosome biogenesis

      o Paragraph 2 could be shorter or placed elsewhere

      We thank the reviewer for this suggestion and have now included some background on NPM1 in the introduction and have shortened paragraph 2.

      • Figures

      o In Figures 2 - 5: explicitly state in the figure caption what dotted lines are encircling (entire cell?)

      We have now included this in the figure captions (they encircle the nucleus).

      o In Figures 2 - 5: explicitly state what the mp-inferno LUT intensity in the images is quantitating (amount of NPM1?)

      We have now included this in the figure captions (NPM1/mScarlet intensity).

      o Figure 7: more detail in the figure caption

      We have now expanded our model figure caption.

      • The paper is quite dense with a lot of nice work, discussing many different genetic perturbations. It feels a bit overwhelming, and I think the biological significance gets somewhat lost in the presentation of all the data. Perhaps some of the presentation of results can be moved to the supplement in favor of a "leaner" main text. Currently, there are only figures in the supplement, but I feel that some of the text that is not central to the key conclusions can be moved to the supplement. I found myself getting a bit bogged down and having to re-read several times to catch the takeaway messages. Some of the clarifying statements that are found in the discussion section can be moved to the results section. In short, some reorganization would help with readability. One suggestion is to move the Inhibition of rRNA transcription or the RNA exosome leads to nucleolar fragmentation and/or the Perturbation of mRNA processing pathways results in slowed NPM1 dynamics and accumulation of rRNA precursors in the nucleolus to the supplement.

      We thank the reviewer for this helpful suggestion. In response to this and other reviewers' comments, we have now simplified discussion of phenotypic groups, including combining the “LSU” phenotypes into a single group and discussing LSU1/2 in the supplementary text. In addition, while we have chosen to keep the “rRNA transcription/exosome” and “Other” descriptions in the main text, they have been condensed and included in one main section with the other ribosome biogenesis phenotypes to highlight this key takeaway. Remaining discussion of phenotypes is now in supplemental text, as suggested.

      Reviewer #3 (Significance (Required)):

      • General Assessment: The main claim of the paper is that nucleolar phenotype (measured by morphology and NPM1 diffusivity) is correlated with stages in ribosome assembly - i.e. the stage of ribosome assembly determines the biophysical properties of the nucleolus. A strength of the study is the wide range of genetic perturbations tested enabled by the high throughput FRAP. With FRAP, I do worry a bit about using t1/2 as the sole dynamic measurement, but it is not a deal breaker. The authors introduce morphology as another way to characterize the nucleoli.
      • The claims are well supported by extensive experiments and data. The experiments are well designed, and proper controls were conducted. To validate the method, the authors used perturbations of NPM1 dynamics from the literature including ATP depletion, blocking glycolysis and oxidative phosphorylation, inhibition with MG132, and treatment with sodium arsenite. They observed slower NPM1 diffusivity under all validation conditions.
      • Advance: The authors have introduced a high-throughput technique for extracting diffusivity with FRAP, yielding a lot of data, but I think the paper suffers a bit in trying to present so much data in the main text. The mechanistic biological insights are compelling but get a bit overshadowed. Improved organization can help the messages come across more clearly.
      • To my knowledge, there is not a similar study in the literature as the detailed mechanisms of ribosome biogenesis are not well studied.
      • Audience: The audience for this manuscript seems to be biophysical researchers, though there may be broader interest due to the wide screening of genetic perturbations.
      • Expertise: I have evaluated this manuscript from the perspective of a single-molecule biophysicist that studies protein-protein interactions between ribosome biogenesis factors. I am not an expert in FRAP, but I use FCS.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors use high throughput FRAP (HiT-FRAP) in arrayed genetic screens of HeLa cells expressing nucleophosmin (NPM1)-fluorescent protein variants to monitor the biophysical properties of the nucleolus in response to genetic perturbations. HiT-FRAP uses a data adaptive imaging strategy to automatically identify and photobleach fluorescently labeled organelles in living cells and acquire movies for FRAP. Quantitative analysis of FRAP curves includes t1/2 and mobile fraction. NPM1 was monitored since it is an important nucleolar scaffolding protein that is thought to interact with many pre-ribosome intermediates.

      The authors depleted 65 RNA helicases (+ 9 pairs) with siRNA and found that 15 of them either increased or decreased t1/2. Knockdowns were confirmed with western blotting. RNA helicase knockdowns with faster NPM1 diffusion were associated with large subunit (LSU) assembly. Most RNA helicase knockdowns with slower NPM1 diffusion were associated with early rRNA processing via the small subunit (SSU) intermediate. The authors screened an additional 290 gene depletions of many ribosomal proteins and assembly factors. With this expanded set of perturbations, they categorized nucleoli based on four morphological features in addition to t1/2 and mobile fraction. Using principal component analysis (PCA), the authors identified clusters of genes with similar effects on NPM1 dynamics and nucleolar morphology. From this secondary screen, the majority exhibited slower NPM1 dynamics. The knockdowns associated with faster NPM1 dynamics were associated with LSU assembly, similar to the helicase experiments. The authors further analyzed several mutants of NPM1 to elucidate the likely interactions between the scaffolding protein and ribosome biogenesis factors. The accumulation of early ribosomal intermediates was associated with decreases in NPM1 dynamics, and accumulation of late intermediates led to increased NPM1 dynamics. The findings established a link between the biophysical properties of the nucleolus and the stages of ribosome biogenesis.

      Major comments:

      • The claims are supported by experimentation.
      • No additional experiments requested.
      • The experiments are adequately replicated, and statistical analysis is sufficient.
      • Methods are very detailed, which should facilitate reproducibility.

      Minor comments:

      • Prior studies are referenced appropriately.
      • A bit more coverage of background on the nucleolar scaffolding protein, nucleophosmin (NPM1) would be helpful in the introduction, perhaps in favor of the details on ribosome biogenesis
      • Paragraph 2 could be shorter or placed elsewhere
      • Figures
        • In Figures 2 - 5: explicitly state in the figure caption what dotted lines are encircling (entire cell?)
        • In Figures 2 - 5: explicitly state what the mp-inferno LUT intensity in the images is quantitating (amount of NPM1?)
        • Figure 7: more detail in the figure caption
      • The paper is quite dense with a lot of nice work, discussing many different genetic perturbations. It feels a bit overwhelming, and I think the biological significance gets somewhat lost in the presentation of all the data. Perhaps some of the presentation of results can be moved to the supplement in favor of a "leaner" main text. Currently, there are only figures in the supplement, but I feel that some of the text that is not central to the key conclusions can be moved to the supplement. I found myself getting a bit bogged down and having to re-read several times to catch the takeaway messages. Some of the clarifying statements that are found in the discussion section can be moved to the results section. In short, some reorganization would help with readability. One suggestion is to move the Inhibition of rRNA transcription or the RNA exosome leads to nucleolar fragmentation and/or the Perturbation of mRNA processing pathways results in slowed NPM1 dynamics and accumulation of rRNA precursors in the nucleolus to the supplement.

      Significance

      • General Assessment: The main claim of the paper is that nucleolar phenotype (measured by morphology and NPM1 diffusivity) is correlated with stages in ribosome assembly - i.e. the stage of ribosome assembly determines the biophysical properties of the nucleolus. A strength of the study is the wide range of genetic perturbations tested enabled by the high throughput FRAP. With FRAP, I do worry a bit about using t1/2 as the sole dynamic measurement, but it is not a deal breaker. The authors introduce morphology as another way to characterize the nucleoli.
      • The claims are well supported by extensive experiments and data. The experiments are well designed, and proper controls were conducted. To validate the method, the authors used perturbations of NPM1 dynamics from the literature including ATP depletion, blocking glycolysis and oxidative phosphorylation, inhibition with MG132, and treatment with sodium arsenite. They observed slower NPM1 diffusivity under all validation conditions.
      • Advance: The authors have introduced a high-throughput technique for extracting diffusivity with FRAP, yielding a lot of data, but I think the paper suffers a bit in trying to present so much data in the main text. The mechanistic biological insights are compelling but get a bit overshadowed. Improved organization can help the messages come across more clearly.
      • To my knowledge, there is not a similar study in the literature as the detailed mechanisms of ribosome biogenesis are not well studied.
      • Audience: The audience for this manuscript seems to be biophysical researchers, though there may be broader interest due to the wide screening of genetic perturbations.
      • Expertise: I have evaluated this manuscript from the perspective of a single-molecule biophysicist that studies protein-protein interactions between ribosome biogenesis factors. I am not an expert in FRAP, but I use FCS.
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The nucleolus is a multiphase biomolecular condensate whose primary function is ribosome biogenesis. There is mounting evidence that the material state of condensates is important for their function. Here the authors have probed how the material property of the nucleolus responds to inhibitions of ribosome biogenesis. They have assessed nucleolar dynamics (molecular diffusivity) of a nucleolar protein, NPM1, by fluorescence recovery after photobleaching (FRAP). NPM1 is a protein that labels the periphery of the nucleolus (the so-called granular component, GC). (The nucleolus has 3 main subcompartments: the internal fibrillar centers, the middle dense fibrillar components, and the GC).

      One of the main findings of the work is that inhibition of late steps of ribosome biogenesis increases fluidity (faster recovery of NPM1), while inhibition of earlier steps (and inhibition of mRNA processing, but see below) instead increases rigidification (slower recovery). They then attempt to correlate what is interpreted as biophysical changes to pre-ribosomal intermediates and interaction with NPM1. Practically, the authors have produced reporter cell lines (HeLa) expressing stably (CRISPR engineering) mono or bi-allelic fluorescent version of NPM1; they have developed a powerful platform to conduct high throughput FRAP (this is really good); they have calibrated their system, initially with basic perturbations (ATP depletion, proteasome inhibition, etc), and then they focused on a family of trans-acting factors: the helicases, investigating systematically their effect on NPM1 recovery. They then extended their initial candidate-based screen to additional factors (using STRING interactions). This is nice and useful. Later in the work, they include in their analysis additional (morphological) features of nucleoli to cluster functionally their hits, as was done earlier by others in similar works. Finally, using recently published structural data (CryoEM), they attempt to correlate groups in the cluster with particular pre-ribosomal species. This part is less advanced and weaker than the initial part of the paper (screens and FRAP measurements).

      Major comments:

      • A major comment is with the compositional analysis of precursor intermediates that should be better defined. The stage assignment of particles is not quite as good as the screening part of the paper.

      At the RNA level, the authors provided FISH, as histograms of quantifications (see e.g. Fig 4D, and Fig SS6E). It would be necessary to show images, and to perform biochemistry. At the protein level, the authors provide immunostaining, but it does not really prove the detected protein is part of a particle.

      - Another concern is whether NPM1, a GC component located at the periphery of the condensate and a late assembly factor, is an appropriate marker for assessing the effects of all inhibitions (both early and late) on nucleolar material state. Would factors involved in earlier ribosomal assembly steps, and localized more internally, not be better tools to evaluate changes in material state caused by alterations in early steps?

      - About the engineered cell lines used for screening by FRAP (Fig 1S): NPM1-mNeonGreen (biallelic with reduced expression of NPM1) and mScarlet (heterozygous): There is a need to characterize pre-rRNA processing in both cell lines to show they are not affected for ribosome biogenesis. This is important information since the entire work is based on these cells. The screening cells are HeLa cells implying they are not physiologically regulated for p53. Nucleolar surveillance is a key regulatory surveillance loop triggered by ribosome biogenesis inhibitions leading to p53 stabilisation. How could this affect this work? Should key findings be confirmed in diploid p53 positive cells?

      - About factor depletion, e.g. helicases, it's important to consider direct versus indirect effects on ribosome biogenesis, the timeline of depletion should be well described in the paper. Apparently, most factors, including the helicases were depleted for 72 hours, this is very long considering most of them play important roles in essential processes for cell homeostasis implying severely reduced growth at the time of capture (and the possibility of indirect effects).

      - Another cause of concern is that some perturbations (factor depletion) affect very deeply nucleolar structure/morphology (eg uL2 depletion shown in Fig 2C); how easy/difficult was it to control/make sure that a correct area was obliterated in the FRAP experiment using the (remarkable) data-adaptive approach. For cases where the nucleolus was deeply affected how did you check that a significant nucleolar area had been selected for analysis? It would be good to describe this in the text.

      - Fig 6C, interaction of NPM1 constructs with pre-ribosomes: the authors have tested interaction with select nucleolar proteins (NOP53, NOP2, NOG2, and uL2), which is not the same as preribosomes. It would be important to see the interactions with precursors (Fig S9C, now histograms). Please show the actual data. This was tested by qPCR; please show classical northern blots, as RT-qPCR has shown its limits in such applications.

      Minor comments:

      • The effects of mRNA processing disruption on nucleolar dynamics could be (is most likely) very indirect (the so-called "slow hits"). The respective time course of inhibitions is important to describe.

      Significance

      General assessment: strengths and limitations

      Strengths:

      • The automated platform for high throughput FRAP
      • The authors develop a potentially interesting model where they attempt to connect rigidification/fluidity of a condensate to its function in assembly of large ribonucleoprotein complexes.
      • The manuscript reads very well; it has been prepared with great care (figures). Some complicated concepts are explained very well (Introduction/Discussion).

      Limitations:

      • particle stage assignment based on FISH and immunostaining only. The authors have not demonstrated that the LSU1 cluster = state F and LSU2 cluster = states G/H

      Advance:

      • Technological advance, high throughput FRAP, a powerful platform to interrogate macromolecular diffusivity.
      • Several nucleolar screens have been conducted in the past (but at steady-state, not using FRAP), in these works textural and morphological features were used together with dimensionality reduction techniques to define functional clusters of genes that impact the homeostasis of the nucleolus. Often these references are cited but it could be useful to expand a bit on some of the earlier findings to bring the new ones in perspective. Some clusters (typically, the transcriptional cluster that disrupts the nucleolus; and the late binder ribosomal proteins) have been well identified before.

      Audience: Cell biologists, scientists involved in ribosome biogenesis research, scientists with an interest in helicases. The growing condensate community.

      Describe your expertise: ribosome biogenesis, structure-function relationships in the nucleolus, technological development in microscopy.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This manuscript presents a high-throughput fluorescence recovery after photobleaching (HiT-FRAP) platform to screen genes affecting the dynamics of the nucleolar scaffold nucleophosmin (NPM1). The platform included the siRNA-based screening of 65 RNA helicases, 9 phylogenetically related helicase pairs, and 290 ribosomal proteins along with selected assembly factors. These factors were classified as those accelerating or decelerating NPM1 dynamics based on the t1/2 measurements. Combined with nucleolar morphological changes, the authors identified that depletion of early-stage (A-F) and later-stage (G-H) LSU assembly factors resulted in different nucleolar phenotypes, suggesting that pre-ribosome assembly can impact nucleolar morphology. Further exploration of the potential mechanism suggested that NPM1's intrinsically disordered region (IDR) contributes to nucleolar organization and dynamics.

      Together, this well-designed study uncovered that ribosome assembly, through both the early and late ribosomal precursors, can influence the biophysical properties of the nucleolus. Below please find our concerns for the authors to consider to strengthen the major conclusions.

      Major comments:

      The main conclusion that NPM1's biophysical states directly impact its interaction strength with ribosome intermediates (and thereby nucleolar dynamics) should be further strengthened as listed below:

      1. Given the nucleolus's complexity, an additional GC factor and/or one more marker of other nucleolar regions should be examined to substantiate the proposed impact of LSU-associated factors on nucleolar morphology (Figures 3, 4).
      2. Additional experiments are needed to support the proposed model that ribosomal intermediates, especially the pre-LSU complexes could determine nucleolar biophysical properties through the interaction with NPM1. Their direct interaction by biochemical assays should be provided. Also, when analyzing the interaction with other nucleolar factors, the authors should provide data that show NPM1 mutant expression levels were comparable to endogenous levels (Figures 4, 6).
      3. Northern blotting should be done to dissect which pre-rRNA intermediates interact with NPM1 and contribute to the nucleolar dynamics (Figures 4B, D, F).

      These additional experiments should be feasible within a reasonable timeframe. In addition, the authors should provide the code and the hardware control procedures for HiT-FRAP to ensure reproducibility. According to the authors' statement, all the experiments are adequately replicated, and the statistical analysis is adequate.

      Minor comments:

      To enhance clarity and focus, consider the following:

      1. Simplifying the HiT-FRAP screening section (Fig. 1-3) would emphasize the significant findings.
      2. Expanding the analysis and experimental validation could help to solidify the interdependency between rRNA / ribosome precursors and the NPM1-driven nucleolar dynamics (Fig. 4-5). Indeed, the additional experiments suggested above in the major concerns should be supplemented here.

      Significance

      This work has established a powerful toolkit, named HiT-FRAP, to identify factors involved in the organization and regulation of the membrane-less nucleolus, which will be useful for understanding the complexity not only of the nucleolus but likely also of other condensates in cells in the future. Using this platform and with the Granular Component (GC)-localized NPM1 as an indicator of nucleolar morphology, the authors found that the biophysical properties of the nucleolus are sensitive to the ordered assembly of ribosomes, in particular the LSU maturation steps at the GC. This finding is important as it suggests the interdependency between the dynamic rRNA processing and the functional assembly and morphology of the nucleolus. Further studies are warranted to analyze the dynamics of other nucleolar constituents, particularly those localized at other sub-nucleolar regions, to fully depict how exactly the nucleolar function is coordinated with its biophysical properties.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      RNAi is remarkably efficient in planaria, yet no mechanism for the amplification of the RNAi signal has thus far been observed. In this manuscript, the authors analyse the mechanisms of RNAi spread in planaria. Starting from some basic observations on the identity of the Dicer and Argonaute proteins required for RNAi, the authors performed a set of elegant experiments to conclude that cycling stem cells likely take up dsRNA and excrete Ago-siRNA complexes, which are then taken up by other cells to mediate RNAi. In addition, the authors provide compelling evidence that RNAi is indeed independent of an amplification mechanism.

      Overall, I found the experiments and results compelling and the manuscript a pleasure to read. I have only a few suggestions for consideration, none of which are essential to support the main conclusions:

      • What does the arrest of stem cell proliferation do to the expression of RNAi genes (with and without dsRNA stimulation)?
      • Page 9: top panel. Is there a control that the dsRNA generated by RNaseIII is functional? I.e. that the defect is indeed due to an uptake effect and not the quality of the siRNA preparation itself? (In our hands, silencing by siRNA prepared with bacterial RNaseIII has not been efficient at all.) As a side note: no method is provided for the RNaseIII treatment.
      • Have the authors analyzed which of the Argonautes are present in the preparations generated with Q-sepharose?

      Data presentation:

      • For all figure legends: please make sure to state animals, number of repeats, define boxplots and what the individual data points represent. Please provide statistics where quantitative statements are made.

      Minor points:

      • First paragraph results: The statement that Ago1 and 3 were "closer to the nematode-specific WAGOs" does not seem correct (the horizontal distance to the miRNA-AGOs is still lower than that to the WAGOs). I suggest removing the statement.
      • Use of checkmarks: please define when a checkmark vs. a cross was indicated. E.g., does a checkmark indicate that 100% of the animals showed efficient RNAi, or a majority of animals?
      • Many of the legends contain conclusions. While this may be a matter of taste/style, I would suggest introducing conclusions only sparingly, if at all, in the legends.
      • Some of the font sizes are rather small on print size (e.g. Fig 1A, S4i). In Fig 1A the black font on dark blue background is hard to distinguish.

      Textual suggestion:

      • Abstract "that rely on dsRNA intermediates, such as viruses" > ".. such as those from viruses..."
      • Materials and Methods: The subscript numbers for the ions show as squares in my PDF.

      Significance

      Strength/weaknesses:

      I found the experimental support robust and did not find weaknesses that jeopardize the conclusions.

      Significance:

      One of the most intriguing features of RNAi is the systemic spread of a silencing signal across an organism's body. This has received significant attention in C. elegans and plants, but for other organisms, this is much less well explored. Planaria have a very efficient RNAi response, which the authors propose is due to uptake of an initial dsRNA by stem cells and excretion of an Argonaute-siRNA complex, which is then taken up by distal cells in an endocytic mechanism. I find this an intriguing mechanism that appears to differ from mechanisms for RNAi spread observed in other organisms.

      The work will be of interest to those interested in small RNA pathways (and RNA biology in general) and has practical implications for scientists working on planaria. The fact that small RNAs spread in an Argonaute-siRNA complex in an organism should also be of interest for cell biologists.

      My field of expertise: Small RNA pathways and antiviral defense in insects. No experience working with planaria.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In RNA interference (RNAi), double-stranded RNA (dsRNA) is processed into small interfering RNAs (siRNAs), which can function locally or act as mobile RNA species that spread between cells. In organisms such as nematodes and plants, the underlying mechanisms and key factors involved in this process, including transporters such as SID-1, have been well characterized. While systemic RNAi has also been reported in other animals, the underlying mechanisms remain largely unclear. In this context, the authors focus on planarians as one such model to investigate these processes. In the planarian S. mediterranea, gene knockdown by dsRNA injection is commonly employed, and the RNAi effect is known to spread rapidly throughout the organism. However, given the absence of RNA-dependent RNA polymerase (RdRP), the mechanism by which RNAi signals are efficiently propagated remains unclear. In this study, the authors provide several important insights into this question.

      First, the authors carefully evaluated the duration of the RNAi effect. In addition, they systematically examined the involvement of known RNAi-related factors and demonstrated that this process depends on ago1 and ago3. Second, interestingly, the authors find that initiation of systemic RNAi depends on neoblasts. Third, Argonaute-siRNA complexes play a crucial role in systemic RNAi. This differs markedly from the nematode system, in which dsRNA itself is transported, highlighting an intriguing mechanistic distinction. Finally, the authors suggest that distinct Argonaute proteins may function at different stages of RNAi propagation. Ago1 and Ago3 play essential roles in the initial phase of systemic RNAi in neoblasts, whereas Ago3, but not Ago1, silences the target in the differentiated cells. While the phenomenon described here is highly interesting, the underlying mechanism remains to be fully elucidated. In particular, how different Argonaute proteins functionally coordinate with each other, especially with respect to the transfer of siRNAs between Argonaute complexes, is still unclear and represents an important direction for future studies.

      The study is supported by well-designed control experiments, and the results are consistent with and support the authors' conclusions.

      I have no major concerns about this manuscript. The study is well conducted, and I only have minor comments that could further improve the manuscript.

      (Minor) While the authors have examined the effects of irradiation on the donor, it would be interesting to test the reciprocal experiment in which the recipient is irradiated. In particular, assessing whether the addition of donor lysate to irradiated recipients can recapitulate the observed RNAi propagation would further strengthen the proposed model.

      (Minor) The purity of the AGO complexes obtained via the TraPR anion-exchange procedure is not entirely clear. The authors may consider providing additional evidence of purity (e.g., visualization of small RNAs with T4-PNK), which would strengthen the conclusions.

      (Minor) Figure 4H is not referred to in the main text. The authors may consider incorporating a description of this panel into the Results section for clarity.

      Significance

      Overall, given the substantial amount of data and the overall high quality of the present study, further mechanistic dissection would likely be beyond the scope of the current manuscript. I look forward to future work from the authors addressing these mechanistic questions in more detail. RNAi has been widely used in stem cell research in planarians. In light of the findings presented in this study, however, previous studies that combine UV irradiation and RNAi may warrant careful re-evaluation. In this regard, the present work has important implications and is likely to have a broad impact on the field.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this paper, the authors set out to understand how dsRNA elicits a system-wide RNAi effect using planarians as a model system. This is an important question, because it gets at the evolution of these processes in different animal models and because knowing more about how RNAi works can allow scientists to tweak their approach for better knockdown efficiency. Importantly, though the system-wide mechanism of RNAi is fairly well understood in C. elegans and in some plants, it isn't clear how conserved these mechanisms are. Some aspects of this paper are quite convincing, including identification of the responsible Argonaute and Dicer proteins. Further, the identification of potential Sid-1 homologs that may allow for import of dsRNA is new. However, the role for Ago-3 was recently reported in Sasidharan, et al (Science Advances, 2026), which is not cited in this manuscript. Perhaps more importantly, several key aspects of the argument set up in this paper are not adequately supported and there are key gaps in the mechanism proposed that prevent its publication in this form. Major and minor suggestions follow:

      Major issues:

      1. The argument that siRNAs must be generated in stem cells that are cycling is not well supported.

      a. The authors only use one approach to reduce stem cell numbers, lethal irradiation. In addition to causing loss of stem cells, lethal irradiation causes widespread DNA damage and organismal/cellular stress responses. By 6 days after lethal irradiation, other progenitor cells are lost as well. Epidermal progenitors are known to be very abundant and to play signaling and/or metabolic roles in planarian physiology, so their loss may also be impactful. The authors should consider other orthogonal approaches to eliminate stem cells and to rule out other potential mechanisms.

      b. The authors use camptothecin in planarians and claim that it reduces cell divisions of stem cells. To my knowledge, this drug has not been shown to work in planarians before. The concentration used is also higher than in published studies. The authors should show whether stem cells are lost after this drug treatment (through levels of stem cell markers or stem cell counting) and should clarify the timing of the treatment relative to the RNAi, which is not clear from the figure legend or methods section. The authors should also discuss possible alternative interpretations of this piece of data (e.g. potential off-target effects). Without more information, it is hard to interpret the data relative to the irradiation results. The authors also do not provide any insight into how or why dividing/cycling stem cells would be important for the systemic RNAi mechanism they propose.

      c. ago-1, ago-3, and dcr-2 were shown to be enriched in stem cells (Fig. 3C), but these genes are also expressed in differentiated cells in single-cell sequencing data. Therefore, it isn't intuitive that non-stem cells would lack the capacity to generate siRNAs.

      d. Is there a way to directly test the hypothesis that siRNAs are only generated in stem cells, potentially by blocking transport in some way and then visualizing new siRNAs with a miRNA/siRNA version of ISH or FACS and sequencing? If the Ago-1/3-siRNA complexes are indeed transported by EVs as per Sasidharan, et al, then the ESCRT(RNAi) approach might be useful in blocking movement of siRNAs. Or, could the authors show that dsRNAs are preferentially taken up by stem cells using the type of experiment shown in Fig. S5H?

      2. It isn't clear from the manuscript how the authors believe that Ago-1/3-siRNA complexes exit and enter cells. The diagram in Fig. 5F describes the complex as moving between cells either through vesicles or extracellularly. How do the authors propose that Ago-siRNA complexes pass through the plasma membrane given that they are not known to go through the secretory pathway? Or once endocytosed, how do they exit the vesicle? Uncertainty on these points makes the molecular mechanism proposed here seem poorly supported by the data provided in the paper.

      3. One key result in the paper is the transplant of "AGO complexes" that are purified from lysate. The authors' writing about this experiment implies that they are transplanting a fairly pure material representing these RNPs and no others. However, the approach described is unlikely to result in purification of highly specific protein complexes. At a minimum, gels that illustrate protein/complex purity should be provided. Preferably, though, mass spectrometry and sequencing would be provided to detail siRNAs and proteins in this sample.

      4. The Sasidharan, et al (Science Advances, 2026) paper should be cited and the findings of this paper should be put in the context of that work.

      Minor changes:

      1. In several experiments, quantitative assessment of impacts (e.g. eye size or ovo/opsin transcript levels) rather than subjective eye scoring would be preferable for rigor and for statistical analysis of changes rather than check/X (e.g. 4F-H).
      2. F1 and F2 terminology for regenerates is probably not accurate since F in those terms stands for "filial" and is used to denote offspring.
      3. The images in Figure 2 (A, C) are quite hard to see on the printed page. Using white for fluorescence might improve contrast and visibility.
      4. The element of time seems to be very important for transmission of the RNAi effect in sexual offspring. Instead of the claim that hatchlings from RNAi crosses have no effect (Fig. 2H), the detail provided in the results section seems to indicate that there is a time-limited effect. These findings should be clarified with progeny sorted by time of egg laying and with a better sense of time between RNAi injection and hatch. Further, even in animals that do regenerate eyes, it would be nice to see a quantification of transcript as a clearer readout of whether some knockdown persists.
      5. This is more of a curiosity question, but it would be interesting to know how the differences in Ago1/2/3 protein structure might relate to their function, particularly in terms of the PAZ and MID domains.

      Significance

      This paper provides some new insight into mechanisms underlying systemic RNAi in planarians. Some of the results are quite preliminary and the overall interpretation of data is not yet well founded. However, there are some highlights, including the potential identification of dsRNA transporters that will be interesting to those in the planarian field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, the authors characterized the molecular mechanism of systemic RNAi in the planarian Schmidtea mediterranea (Sme) through loss-of-function genetic perturbations. They genetically identified key protein factors involved in the siRNA pathway and assessed the systemic RNAi efficacy at the molecular level. Notably, they find that the proliferating stem cells (neoblasts) are specifically required for systemic RNAi in Sme. They further propose that the requirement of neoblasts in systemic RNAi is mediated by spreading of the RISC RNP to differentiated cells.

      Major Comments

      1. The authors show that in Sme systemic RNAi strictly relies on the presence of neoblasts, which is one of the most interesting findings. It is important to understand the mechanism, specifically whether neoblasts are generally required for dsRNA processing or for conferring mRNA slicing activity. Although the authors claimed in Figure 5 that neoblasts are required for siRNA biogenesis, the results provided do not directly support this claim. An alternative scenario is that dsRNA can still be processed into siRNA in the absence of neoblasts, but the resulting siRNA subsequently fails to function without the neoblast AGOs or other signals. To directly confirm that neoblasts are required for dsRNA processing, one additional experiment should be performed in which irradiated worms are injected with dsRNA, followed by small RNA cloning and sequencing to detect whether processed siRNAs are present.
      2. An alternative mechanism to interpret the role of neoblasts is that, instead of processing dsRNA and/or spreading RISC RNP, the neoblasts may function as regulatory cells that provide signals licensing the dsRNA processing and target slicing in differentiated cells. Under this scenario, the requirement for sid-1 and vha-16 could instead be interpreted as being necessary for dsRNA transfer from the initial uptake tissues (parenchyma for injection or intestine for feeding) to the target tissues. To rule out this possibility, isolated neoblasts from naive donors could be transplanted into irradiated recipient worms that have been injected with dsRNA, and whether such transplantation can restore the systemic RNAi in the recipient animals could then be tested phenotypically. One caveat is that any positive result may be due to the proliferation of the donor neoblasts in the recipient. This can be addressed by performing the same transplantation but using neoblasts isolated from camptothecin-treated worms, which would limit the proliferative contribution.
      3. In Figure 5E, the authors show that recipient ago-3 is required for systemic RNAi, and they suggest in the Discussion a plausible model in which recipient AGO-3 is required for nuclear RNAi for transcriptional target repression. However, this result appears inconsistent with the results in Figure 1C-D, where ago-3 KD did not abolish systemic RNAi. This contradiction should be acknowledged in the text and further investigated. One possible interpretation is that the presence of the neoblast ago-3 from the donor lysate may have an antimorphic effect and interfere with the recipient AGO(s) (presumably ago-1 in this case) during target silencing, implying that homogeneity of AGO(s), or at least homogeneity of ago-1, is required for such systemic RNAi. Although the underlying mechanism remains difficult to interpret, this hypothesis could be tested by injecting lysate from an ago-3 KD donor into an ago-3 KD recipient. If AGO homogeneity is indeed required, such transfer treatment should no longer abolish systemic RNAi in the recipient in Figure 5D. Additionally, the target genes used for the systemic RNAi in Figure 1C/D and Figure 5E are different. To exclude the possibility that this discrepancy is target-specific, either six1/2 should be tested in the whole worm RNAi assay in Figure 1 or opsin should be used in the transfer assay in Figure 5.
      4. The authors claim in Figure 4 that the systemic RNAi is mediated by secreted RISC. This claim is not unexpected, because naked siRNA generally suffers from a poor half-life in vivo and therefore must be stabilized by binding to AGO to evade the endogenous ribonucleases. Nevertheless, the alternative hypothesis that the transferred RNAi is mediated by the spread of naked RNA, though unlikely, should be experimentally excluded. Specifically, the isolated RNA and lysate with protease in Figure 4F (which failed to induce RNAi in the host worm) should be tested to confirm whether they contain siRNAs. This can be done by cloning and sequencing the sRNA in the lysate.
      5. The authors assigned ago-1 and ago-3 to the siRNA pathway and ago-2 to the miRNA pathway. This is an important conclusion for subsequent sRNA studies in planarians. However, the evidence provided in the current manuscript is insufficient to exclude ago-2 from the siRNA pathway, especially given that the DDH catalytic triad is present in AGO-2. The observed redundancy between ago-1 and ago-3 to maintain functional RNAi can only support the involvement of these two AGO genes in the siRNA pathway but does not exclude AGO-2. To more rigorously test whether ago-2 should be excluded from the siRNA pathway, double RNAi of ago-2 and ago-1, as well as of ago-2 and ago-3, should be performed, and ago-2 should only be excluded from the siRNA pathway if such double KDs do not further reduce the RNAi efficacy compared with individual KDs.
      6. The results shown in Figure 1F, where exposure to exogenous dsRNA can enhance the endogenous transcription of ago-1 and ago-3 in Sme, are particularly interesting. The authors should discuss whether this phenomenon is related to nuclear RNAi. In addition, it has been reported that exposure to exogenous dsRNA can increase the AGO/DICER protein levels without increasing the mRNA level (PMID 32194567), and this should be compared with the present findings. Importantly, the result also suggests a potential strategy to improve the Sme RNAi efficiency. Accordingly, it would be valuable to test whether the increased ago-1/3 transcript levels caused by introducing exogenous dsRNA can lead to higher RNAi efficacy, both in terms of target silencing depth and the duration of RNAi effectiveness.
      7. Figure 2I-J provides remarkable evidence that the systemic RNAi in Sme is independent of RdRP. This result should be highlighted in the final paragraphs of the Introduction and mentioned in the Abstract.

      Minor Comments:

      1. The authors show that ago-1 + ago-3 KD only slightly perturbed the miRNA levels. However, this observation can be interpreted in at least two ways: (a) these two AGO genes are not involved in the miRNA pathway; or (b) these two genes are expressed at low abundance (which was mentioned later in the paper), such that their KD only mildly perturbs their associated miRNAs, especially if these miRNAs are also associated with AGO-2. Scenario (a) seems less likely to be true because ago-2 is enriched in neoblasts (Figure 3C), whereas many conserved miRNAs have been reported to be enriched in Xins in Sme (Sasidharan et al 2013). This issue should therefore be discussed. In addition, the gene expression levels of the three ago genes from previously published bulk RNAseq datasets should be included in the figure.
      2. The illustration in Figure 1A is not fully accurate. In the miRNA pathway, target repression also includes mRNA degradation (which is conventionally referred to as mRNA decay or mRNA destabilization), which is in fact the dominant mode of miRNA-mediated repression. Therefore, "mRNA decay" should be added in addition to "translational inhibition". In the siRNA pathway, mRNA degradation is not directly mediated by RISC itself, but by the downstream exonucleases (i.e., XRN-1); therefore, the term "mRNA slicing" should be used instead for the siRNA part. Additionally, it has been shown that C. elegans RDE-1 is also associated with miRNAs (PMID 36790166), so the functional assignment in the model should be adjusted accordingly.
      3. In Figure S1C, the authors claimed that ago-1 and ago-3 exhibit more divergent PAZ and MID domains according to the AF modeling. However, this divergence may simply reflect the lower sequence conservation of AGO-1 and AGO-3 relative to AGO-2, which is shown in the phylogeny in Figure S1A. To address this caveat, Robetta modeling should be performed either for the full-length proteins in comparative modeling mode (owing to the length of AGO proteins) or as de novo modeling of the PAZ and MID domains. Structural alignment with reference to solved AGO structures such as 4W5Q or 6N4O should be shown. If the MID/PAZ domain divergence remains evident, it should be quantified using backbone RMSD relative to known AGO structures.
      4. In Figure S1C, a second structural view should also be included to better illustrate the AGO architecture. The duplex channel within the PIWI-MID lobe should be clearly visible in one of the views. The L2 domain, or at least helix-7, should be labeled. If possible, the relative position of helix-7 to the guide RNA should also be shown. All the predicted structural models should be included in the supplemental files.
      5. The authors suggest that the spread of functional RISC from neoblasts depends on EVs. The evidence involving vha-16 is convincing, but to directly validate the presence of EVs that carry RISC as cargo, CsCl ultracentrifugation would be informative. Although this experiment is beyond the scope of the current manuscript, the need for direct EV validation should be discussed.
      6. In Figure 2G, the authors show that although zfp-1 restores the homeostatic mRNA level at week 5, its downstream target prog-1 and agat-3 fail to recover. It remains unclear whether this is due to the delay of newly translated zfp-1 to activate the downstream targets, or due to translational suppression of zfp-1. Therefore, the mRNA levels of prog-1 and agat-3 should be further monitored beyond week 5.
      7. In Figure 3, the authors use co-expression by in situ hybridization to demonstrate the expression of ago in neoblasts. To provide the whole-animal context, co-expression of smedwi-1 and ago genes should also be confirmed using the current Sme scRNAseq datasets.
      8. The authors proposed in the Discussion that AGO-1 may sponge unwound RNA duplex and that this facilitates the dsRNA transfer. This interpretation seems unlikely, because the ago-1 single KD, which would abolish such dsRNA transfer, did not show phenotypes in terms of systemic RNAi defect. Also, such a scenario suggests that loss-of-function of ago-1 may be antimorphic since the sponged dsRNA would be released, and thus co-KD of ago-1 and ago-3 should result in more efficient RNAi. These concerns should be discussed.
      9. The Discussion states that AGO-1 is required in the donor, but this is inconsistent with the results in Figure 5D, where ago-1 KD in the donor did not abolish RNAi in the recipients. This inconsistency between results and text should be corrected.
      10. In Figure S5I, the authors show that short dsRNA generated by RNase III digestion failed to induce systemic RNAi under sid-1 loss-of-function conditions. However, the alternative explanation is that RNase III digestion produced short dsRNAs that may result in siRNA with suboptimal length for AGO loading or functioning. This caveat should be mentioned, and the length profile of the RNase III digestion products should be shown by high-density urea gel electrophoresis or HPLC.
      11. In all the transfer assays, one concern is that the lysate may contain viable neoblasts, so that any observed results could be attributed to the proliferation of the donor-derived neoblasts rather than the transfer of RNAi materials. Therefore, a cell viability test using Calcein AM or another equivalent assay should be conducted to confirm the absence of live cells in the lysate preparation protocol.
      12. In the second paragraph of the introduction, when comparing the siRNA and miRNA pathways, the difference in base-pairing configuration with the target site should be introduced with appropriate reference.
      13. In the last paragraph of the introduction, the claim that the results may have implications for the design of effective RNAi-based therapies is too vague. Given that the current therapeutic siRNA delivery methods are already robust in clinical applications, the authors should more specifically explain how their findings in Sme might inform therapeutic development.
      14. In the last sentence of the second last paragraph of the introduction and Figure S5I, "RNAse" should be corrected to "RNase".
      15. In the first Results subsection, the second last paragraph, first sentence, one left parenthesis is missing.
      16. Throughout the Discussion, the term "AGO-RNA" is used. If the authors intend to distinguish it from RISC, they should justify how this terminology differs from RISC. Otherwise, RISC would be more appropriate.
      17. Statistical significance should be shown in Figure 2E, 2G, 3A, 3C, S2A, S2D, S2G, S3D, S5D, S5G, S5J.
      18. Molecular weight should be labeled in Figure S5L.
      19. In Figure 2J, where the y-axis indicates % prevalence, the down-facing bars (antisense reads) should also be labeled as positive values on the y-axis. Displaying them as negative percentages (-20%) is incorrect.
      20. The small RNA cloning procedure should be described in the Methods. Basic information of sRNA sequencing, including read numbers, biotype distribution, proportion mapping to the triggering dsRNA, should be included too.
      21. The methods used to measure RNA and protein concentrations should be included in the Methods section.
      22. The irradiation protocol, including dosage, should be included in the Methods.
      23. In the Methods section, subscripts of chemical formulas are rendered as squares throughout the text. This formatting issue should be corrected.
      24. In the Results section, first subsection, second paragraph and first sentence. The cited data should be Suppl Figure S2D, not the current S2A-C.
      25. The manuscript uses inconsistent formatting for supplemental figures (for example, "Suppl Figure S1B,C" versus "Suppl Figure 2A-C"). The formatting should be standardized.

      Significance

      Planarians have long been appreciated as a robust model organism for studying gene function in animal regeneration, and one major advantage of this system is its highly efficient systemic RNAi. However, the molecular basis of the RNAi machinery has not been thoroughly investigated, and RNAi efficacy has not been evaluated in detail. This study therefore provides important value by characterizing the molecular components underlying systemic RNAi in Sme, which contributes both to fundamental understanding and to potential optimization of RNAi-based experiments in Sme.

      In addition, the manuscript reports that stem cells are required for systemic RNAi in differentiated cells in Sme, a finding that has not been described in other organisms. Although the underlying mechanism remains unresolved, this observation offers potentially important implications for both RNA biology and stem cell biology.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Major comments: 1) The study focused on regulatory activity in a lung-derived cellular setting and was well executed. However, the degree that non-coding variation in lung elements, particularly alveolar basal epithelial cells, modeled by A549 cells, contributes genetic risk for COVID19 severity is unclear. Especially as non-coding variants in other contexts such as immune cells have been shown to be enriched for disease risk. To strengthen the choice for the A549 cellular context, the authors can assess enrichment for COVID19 severity heritability using stratified LD-score regression (PMID: 26414678) using A549/lung epithelium chromatin data (ATAC-seq, CHIP-seq) to check for enrichment polygenic signal or if the lung associated-risk is focused on a restricted set of genome-wide significant signals.

      The reviewer is correct that most analysis of non-coding variants to date has been in immune cells, as is the case for many GWAS studies. However, severe COVID-19 affects many systems, especially the lung alveolar epithelium, and so there is a pressing need for functional genomic studies that go beyond immune cells. We chose A549 due to its lung origin, experimental tractability, and availability of published datasets. While enrichment such as S-LDSC suggested by the reviewer would be a good indication to screen for cell types with enrichment in e.g. open chromatin, many of our STARR-seq hits were found in closed chromatin and so would have been missed by such an analysis. To further emphasize the importance of type II alveolar epithelial cells in severe COVID-19 progression, we added the following to the manuscript: * "Cell death and the innate immune response of type II alveolar epithelial cells, which also function as progenitors for type I epithelial cells, are the main driver of alveolar damage and acute respiratory distress syndrome in coronavirus infection (Bridges et al 2022 PMID 34404754; Qian et al 2013 PMID: 23418343)."*

      2) It would strengthen the manuscript to compare the results to prior analyses where overlap exists (eg PMID:36763080). Particularly it would be informative to address if nominated variants for signals have different variants operating in different cell types. Also, one prominent variant, rs17713054, that had previously been nominated to operate in lung through in silico predictions and CRISPR perturbations (PMID:34737427) appears to be non-significant in this STARR-seq analysis. Was a different variant nominated at this locus? Could the authors expand on methodological differences that could explain this difference?

      rs17713054 (chr3:45,818,159:G>A) was nominated by Downes et al (2021) based on in silico predictions. While the authors found rs17713054 resides in open chromatin and is a chromatin accessibility (ca)QTL, the variant did not validate in CRISPR perturbations. Deletion of the putative enhancer encompassing rs17713054 across 4 cell lines led to no detectable changes in expression of the predicted target gene LZTFL1. The lack of H3K27ac at the putative enhancer led the authors to conclude that this enhancer is not active in any of the lung epithelial cell lines tested, consistent with our STARR-seq results which suggest that rs17713054 is inactive in A549 cells.

      We nominated 6 amVars at the LZTFL1 locus (Table 1) and propose there are multiple functional variants with small effect sizes operating at this locus which together significantly contribute to risk. We have included an additional supplemental figure panel (Fig. S2H) showing a genome browser view of these variants. As suggested by the reviewer, we also compared our results to Jagoda et al. That study only reported allele-specific change and not baseline activity; it is thus possible that very weak signal (below our thresholds) can show up as allele-specific. This appears to be the case for at least one variant (rs35454877) which we call as inactive but nevertheless has a significant allele-specific activity (mpralm padj

      3) Given a subset of the prioritized variants originate from the credible set, were the amVars enriched in terms of posterior inclusion probability than the tested set? This technical information could be valuable for interpreting fine-mapping efforts.

      We did not observe an enrichment of posterior inclusion probabilities (PIPs) for the amVars or active variants compared to inactive variants. One reason could be that we primarily find variants in weak enhancers with moderate effect sizes which may be too subtle to be attributed a high PIP by GWAS due to insufficient statistical power. It is also possible that variants with a high PIP are active in other cell types. Fine-mapped variant sets already contain variants likely to be functional, so observing no difference between already statistically likely functional variants is perhaps not surprising. Another study testing melanoma risk variants similarly observed no statistically significant differences in PIPs between MPRA functional and non-functional variants (Long et al 2022 PMID: 36423637). We have included a supplemental plot of the PIP scores (Fig. S2G) for inactive, active and amVars and added this analysis in the first results section (see revised manuscript lines 181-186).

      4). Similarly, for the eQTL comparisons, what proportion of the amVar/eQTL pairs are directionally consistent (e.g. increased activity/increased expression)?

      For the 29 amVars, there are a total of 4689 combined eQTLs across all GTEx tissues. When filtering for lung, there are 180 eQTLs across the 29 amVars, whereby only 17/29 amVars have eQTLs in lung. For 16 of these 17 amVars, there is at least 1 eQTL in lung that is directionally concordant - listed in Table 1. Notably, however, almost all variants which have lung eQTLs with concordant direction also have lung eQTLs with discordant direction, suggesting the effects may be more complicated. When considering all lung eQTLs in GTEx v11, amVars were surprisingly enriched for discordant direction of effect (see figure below, left). However, we noticed this signal was driven entirely by the variants in the H2 haplotype block (as proposed by the reviewer in question 5), which includes many genes with varying effects which may be unrelated to our amVars. When excluding chr17, no enrichment was seen (see figure below, right). There was also no significant correlation between the effect size magnitude of eQTL and STARR-seq. Therefore, globally comparing amVars and eQTLs was not informative per se. We emphasize that we have few amVars (29), which makes subtle correlations/enrichments less likely to be detectable. Siraj et al. (2026) (PMID: 41741648), testing a much larger variant set than ours and in multiple cell lines, observed weak correlation between MPRA allelic effects and eQTL normalized effect size (Spearman's ρ = 0.35), although these libraries were selected to include only fine-mapped eQTLs in high PIP, in comparison to our libraries which also include a large number of additional variants in LD. Overall, this suggests eQTL effect size is not a strong predictor for variant effects observed by MPRAs. We have included a discussion about this (see revised manuscript lines 186-190).
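
      As an illustration only, the directional-concordance tally described in this reply can be sketched as follows. The function name, the tuple layout, and the toy effect sizes are hypothetical and are not taken from the study. The sketch assumes that the STARR-seq allelic effect and the eQTL effect size are both expressed relative to the same allele, so that matching signs mean concordant direction.

```python
from collections import defaultdict


def concordance_by_variant(pairs):
    """Count concordant and discordant eQTLs per variant.

    pairs: iterable of (variant_id, starr_effect, eqtl_effect) tuples,
    with both effects signed relative to the same allele (assumption).
    """
    summary = defaultdict(lambda: {"concordant": 0, "discordant": 0})
    for variant_id, starr_effect, eqtl_effect in pairs:
        # A zero effect has no direction, so skip it.
        if starr_effect == 0 or eqtl_effect == 0:
            continue
        same_sign = (starr_effect > 0) == (eqtl_effect > 0)
        summary[variant_id]["concordant" if same_sign else "discordant"] += 1
    return dict(summary)


# Hypothetical toy data: one variant can have eQTLs in both directions.
toy_pairs = [
    ("varA", 0.4, 0.2),
    ("varA", 0.4, -0.1),
    ("varB", -0.3, -0.5),
]
result = concordance_by_variant(toy_pairs)
```

      Counting per variant rather than per pair makes it visible when the same variant has both concordant and discordant eQTLs, which is the situation the reply describes for most lung eQTLs.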

      5) Several of the variants implicated by STARR-seq, including several of the pairs with non-additive activity were associated with the MAPT locus. This locus has a common 900kb inversion in Europeans (PMID: 15654335), were these variants linked to the same H1/H2 haplotype?

      Indeed, all five prioritised variant pairs, as well as 5 amVars and 2 further variant pairs showing STARR-seq activity at this locus (Table 1, Table 2), are linked to the same (H2) haplotype. More specifically, all variants show high LD with the H2 haplotype-tagging SNP rs8070723‐G in European ancestry (r2 > 0.73) and are not linked to one of the H1 haplotype-tagging SNPs (rs242557-A, r2

      6) Were the variants with non-additive effects analyzed for transcription factor motifs?

      We looked for both motifs (FIMO and motifbreakR) and predictions of contribution scores using Malinois and AlphaGenome in the non-additive combinations, without finding any evidence for synergistic binding/activity. For example, see below the non-additive example in Fig. 3C (rs77819001_rs76667867), where the total activity prediction by Malinois is low (0.10-0.14), and there is no evidence of non-additive contribution scores as expected from the STARR-seq results. Because of the few examples, we cannot determine whether this is due to a systematic inability of the models to predict non-additivity, and therefore we chose not to present them. For transparency, we added the following sentence to the results: "For prioritized, non-additive variants pairs neither model identified an impact on transcription factor motifs that could explain the observed non-additivity. However, the few examples preclude drawing any general conclusions regarding the ability of these models to detect non-additivity."

      Minor comments: 1) The sentence in the methods, Variant selection and design section, "the 95% credible set from the second GenOMICC releases containing causal variants to 95% statistical probability," is somewhat unclear. Given that the next sentence describes the 99% credible set, the authors should use more consistent terminology.

      For the 3rd release we used the 99% credible set of variants to increase comprehensiveness of our library, meaning the list of variants contains causal variants to 99% statistical probability. In contrast, for the first and second release we used the 95% credible set as is the standard for fine-mapped variants. We clarified the phrasing in the methods as follows: "Fine-mapped severe COVID-19 risk variants encompassing causal variants to 95% statistical probability (95% credible set) from the first and second GenOMICC release2 and a more comprehensive 99% credible set of variants from the third GenOMICC release2 were included in the STARR-seq library."

      2) Some text in supplemental figures (Fig S6) is too small to be legible. Please either remove or adjust the figure size.

      We removed the variant IDs from figure panels S6A and S6B to aid readability.

      Reviewer #2

      Major comments: 1). The major conclusion is well supported by the main data presented; but additional clarification and extension in the discussion part may be helpful to determine the potential impacts and application of such conclusions especially related splicing isoform changes regulated by potential functional variants. "OPTIONAL" CRISPR editing for a couple of selected genes/variants will be helpful to confirm effects of these novel pathways.

      We thank the reviewer for the positive appraisal.

      Regarding splicing isoform changes, rs2297480 lies within the promoter-proximal region of two alternative FDPS isoforms which lack the penultimate exon encoding part of the catalytic domain. Therefore, we propose the variant could increase expression of a non-enzymatically functional FDPS isoform, which may compete with the functional isoform for substrate binding, thereby decreasing overall FDPS enzymatic activity. There are other examples of such "promoter usage" QTLs. We have rephrased this section and included references to studies supporting such a situation at other loci *"While speculative, global analyses have found examples where enhancer/promoter variants are proposed to lead to isoform expression changes (so called promoter usage QTLs), which may have disease implications (PMID 36037215, 30618377)" *

      While CRISPR editing could be interesting, it would require extensive additional resources and is outside of the scope of the current manuscript. As a significant proportion of our amVars are not within accessible chromatin nor overlapping active chromatin modifications, we expect these to be functional in a different cell type rather than A549. Identifying a suitable cell line for CRISPR editing would therefore be non-trivial. Furthermore, the small effect size of our hits suggests seeing clear effects of single variants on transcription may be hard, as genes can be controlled by multiple enhancers simultaneously.

      Minor comments: 1, please be specific about the proportion here in the text "Similarly, the proportion of active candidate sequences overlapping predicted ENCODE CREs in A549 cells was increased compared to inactive sequences (Fig. S2F)."

      We added the specific proportions to the text: "Similarly, the proportion of active candidate sequences (39.3%) overlapping predicted ENCODE CREs in A549 cells was increased compared to inactive sequences (23.2%) (Fig. S2E)."

      2, Out of 29 variants showing allele-specific effects, how many of them are close to the known TSS of candidate genes? Is IFNA the only gene nearby these 29 variants?

      rs7041102/rs7040981 reside ~4.5-5.5kb from the TSSs of IFNA10 and IFNA16. More gene-proximal variants include rs2297480, residing within the first intron of the reference FDPS isoform and in the promoter-proximal region of three further FDPS isoforms (discussed above). In addition, rs145951274 resides in the first intron of HCG27 and rs3130925 in the first intron of the reference MICB isoform and within the promoter region of an alternative MICB isoform. We have added information on the distance to the closest TSS for all 29 amVars in Table 1 and indicated whether the variant is intronic.

      3, out of 166 variants, what are genes with TSS closer to the 166 variants. It will be helpful to have a table or list of these genes since their promoter close to the significant variants

      Of 166 active variants, 20 are within 1kb of the nearest TSS. Of those, 16 occur in the 1kb upstream or 100bp downstream of the TSS, the other 4 variants are within 1kb downstream. We have added this information as a new column to Supplementary Table 3 now showing the distance to the nearest TSS for all single variants tested, and we have modified figure S2F (previously S2E) which now shows the comparison of TSS proximity for inactive, active, and amVars.
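
      For readers who want to reproduce this kind of binning, a minimal sketch follows. The function names and the default window sizes are assumptions chosen to mirror the windows named in this reply (1kb upstream to 100bp downstream, then up to 1kb downstream); the authors' actual annotation pipeline is not described here.

```python
def signed_tss_distance(variant_pos, tss_pos, strand):
    """Return distance to the TSS; negative = upstream, positive = downstream.

    The sign is flipped for minus-strand genes so that "upstream" always
    means 5' of the TSS in the direction of transcription.
    """
    offset = variant_pos - tss_pos
    return offset if strand == "+" else -offset


def classify_tss_proximity(distance, upstream=1000, core_down=100, downstream=1000):
    """Bin a signed TSS distance into the windows used in the text above."""
    if -upstream <= distance <= core_down:
        return "promoter_window"        # 1kb upstream to 100bp downstream
    if core_down < distance <= downstream:
        return "downstream_within_1kb"  # 100bp to 1kb downstream
    return "distal"


# Hypothetical coordinates, for illustration only.
example = classify_tss_proximity(signed_tss_distance(900, 1000, "+"))
```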

      4, Do these 16 combinations of variants pairs have genetic interaction in the population levels, i.e. epistasis?

      This is an intriguing point, but challenging to test and beyond the scope of this work. 12/16 variant pairs are in very high or perfect LD (Table 2) and therefore either both risk variants or neither will co-occur in the population. We therefore cannot test if two variants have epistatic (beyond additive) effects in the population, nor can we directly link individual variants to a biological phenotype and therefore test for epistatic effects on phenotypes. We are limited to testing for beyond additive, i.e. epistatic, regulatory effects in the context of the STARR-seq assay, which we show in Fig 3B.

      5, It needs more clarification why the risk allele of rs2297480 at the FDPS locus is associated with increased enhancer activity and decreased levels or activity of FDPS?

      We have addressed this point under major comment 1 in the context of enhancer/promoter-driven isoform switches as plausible disease mechanism.

      To clarify, the rs2297480 risk allele showed increased enhancer activity by STARR-seq. The variant lies within the promoter-proximal region of two alternative FDPS isoforms which lack the penultimate exon encoding part of the catalytic domain. Therefore, we propose the variant could increase expression of a non-enzymatically functional FDPS isoform, thereby decreasing overall FDPS enzymatic activity as the enzymatically inert isoform may compete with the functional isoform for substrate binding. We emphasize that this possible mechanism at FDPS is speculative.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This manuscript by Weykopf, Friman et al. applied the STARR-seq method to screen for and identify functional variants that determine the severity of COVID-19 in lung epithelial cell lines, followed by functional validation. Both additive and non-additive effects of various regulatory variants were evaluated. Furthermore, machine learning models were applied to interpret allele-specific variant effects. This is pioneering work to identify functional variants at GWAS loci associated with severe COVID-19, with solid methods and modeling. Sufficient literature was cited and discussed. The major limitations were well discussed.

      Significance

      Major comments

      The major conclusion is well supported by the main data presented; but additional clarification and extension in the discussion part may be helpful to determine the potential impacts and application of such conclusions especially related splicing isoform changes regulated by potential functional variants. "OPTIONAL" CRISPR editing for a couple of selected genes/variants will be helpful to confirm effects of these novel pathways.

      Minor comments

      1. please be specific about the proportion here in the text "Similarly, the proportion of active candidate sequences overlapping predicted ENCODE CREs in A549 cells was increased compared to inactive sequences (Fig. S2F)."
      2. Out of 29 variants showing allele-specific effects, how many of them are close to the known TSS of candidate genes? Is IFNA the only gene nearby these 29 variants?
      3. out of 166 variants, what are genes with TSS closer to the 166 variants. It will be helpful to have a table or list of these genes since their promoter close to the significant variants
      4. Do these 16 combinations of variants pairs have genetic interaction in the population levels, i.e. epistasis?
      5. It needs more clarification why the risk allele of rs2297480 at the FDPS locus is associated with increased enhancer activity and decreased levels or activity of FDPS?
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      To investigate the contribution of non-coding GWAS variants linked to COVID19 severity across individuals, Weykopf and colleagues assayed allelic differences in reporter activity using STARR-seq in the A549 lung epithelial cancer cell line. The authors prioritize 4,894 COVID19 severity associated variants from several GWAS for functional screening. Of these, 166 of the designed variant elements displayed significant enhancer activity, and 29 of these also displayed allele-modulated activity. These results were further contextualized through comparisons with lung-cell relevant chromatin marks. Additionally, a subset of variants in close proximity were analyzed for non-additive effects in the reporter assay. In addition, results were compared to predictions using deep modeling approaches such as AlphaGenome and Malinois. This approach allows for a systematic characterization of these variants and expands on previous work focused on narrower sets of variants. While the manuscript is well written and the results are presented clearly, several additions could help with placing results in context.

      Major comments:

      • The study focused on regulatory activity in a lung-derived cellular setting and was well executed. However, the degree that non-coding variation in lung elements, particularly alveolar basal epithelial cells, modeled by A549 cells, contributes genetic risk for COVID19 severity is unclear. Especially as non-coding variants in other contexts such as immune cells have been shown to be enriched for disease risk. To strengthen the choice for the A549 cellular context, the authors can assess enrichment for COVID19 severity heritability using stratified LD-score regression (PMID: 26414678) using A549/lung epithelium chromatin data (ATAC-seq, CHIP-seq) to check for enrichment polygenic signal or if the lung associated-risk is focused on a restricted set of genome-wide significant signals.
      • It would strengthen the manuscript to compare the results to prior analyses where overlap exists (eg PMID:36763080). Particularly it would be informative to address if nominated variants for signals have different variants operating in different cell types. Also, one prominent variant, rs17713054, that had previously been nominated to operate in lung through in silico predictions and CRISPR perturbations (PMID:34737427) appears to be non-significant in this STARR-seq analysis. Was a different variant nominated at this locus? Could the authors expand on methodological differences that could explain this difference?
      • Given a subset of the prioritized variants originate from the credible set, were the amVars enriched in terms of posterior inclusion probability than the tested set? This technical information could be valuable for interpreting fine-mapping efforts. Similarly, for the eQTL comparisons, what proportion of the amVar/eQTL pairs are directionally consistent (e.g. increased activity/increased expression)?
      • Several of the variants implicated by STARR-seq, including several of the pairs with non-additive activity were associated with the MAPT locus. This locus has a common 900kb inversion in Europeans (PMID: 15654335), were these variants linked to the same H1/H2 haplotype?
      • Were the variants with non-additive effects analyzed for transcription factor motifs?

      Minor comments:

      • The sentence in the methods, Variant selection and design section, "the 95% credible set from the second GenOMICC releases containing causal variants to 95% statistical probability," is somewhat unclear. Given that the next sentence describes the 99% credible set, the authors should use more consistent terminology.
      • Some text in supplemental figures (Fig S6) is too small to be legible. Please either remove or adjust the figure size.

      Significance

      The authors performed a systematic evaluation of COVID19 risk variants in a lung relevant cell line. This study expands the number of variants tested as well as explores them in the lung cellular context. Because this study did not filter tested variants, it allows for comprehensive integration with chromatin annotations and computational predictions. It nominates a short list of lung relevant variants for further investigation. This paper will be of interest to the genetics community interested in basic research in COVID19 severity.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all reviewers for the valuable feedback and critical insight on our study. We acknowledge the concern that the manuscript, in its initial form, appeared descriptive and did not provide the mechanistic insight inferred from the current data. In the revised manuscript, we will (i) more clearly delineate what mechanistic inferences can be drawn from the existing data, (ii) expand our discussion of the caspase-independent mechanisms, and (iii) incorporate additional experiments/analyses aimed at identifying downstream effectors that mediate the observed phenotypes. In this revision plan, we have included six new figures addressing some of the major issues raised by reviewers.

      1. Specifically, to address questions about mechanistic insight, we generated stable ACSL1:HaloTag expressing hESCs. Currently presented as Figure 1A for reviewers. ACSL1 is a critical enzyme that catalyzes the first step of fatty acid oxidation at the outer mitochondrial membrane. Our previous analysis and work from the Opferman lab demonstrated that ACSL1 contains a BH3-like domain. Thus, we examined the effects of MCL-1 inhibition on the mitochondrial localization of this enzyme. Our findings indicate that MCL-1 inhibition causes the displacement of ACSL1 from the mitochondria (Figures 1B-C for reviewers). Our interpretations of the effects of MCL-1 inhibition are 2-fold: 1) as we show in our data, MCL-1 inhibition causes disruption of the mitochondrial cristae, altering the microenvironment for fatty acid oxidation, and 2) as seen in cancer cells, the MCL-1 inhibitor may also displace ACSL1 from the mitochondria. In the new version of the manuscript, we will focus on these 2 mechanisms as mechanistic outcomes of MCL-1 inhibition.
      2. We have included data of cells treated with Perhexiline (CPT1/2 inhibitor), and Etomoxir (CPT1a inhibitor) (Figure 2 for reviewers). This experiment determines whether direct perturbation of the FAO pathway mimics the effects of the MCL-1i.
      3. We have assayed the effects of MCL-1 inhibition on oxygen consumption rates in NPCs. Currently presented as Figure 3 for reviewers.
      4. We will perform MCL-1:MICOS proximity ligation assays and/or immunoprecipitation assays to determine whether MCL-1 inhibitors disrupt the association of MCL-1 with MICOS. Preliminary data suggesting an association (albeit, very weak) are shown in Figure 4 for reviewers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This study claims that beyond its canonical anti-apoptotic function, MCL-1 has essential non-apoptotic roles in human neurodevelopment. Pharmacologic inhibition of MCL-1 in human neural stem cells disrupts mitochondrial inner membrane architecture by destabilizing cristae and the OPA1-MICOS complex, leading to swollen mitochondria with disorganized cristae. These structural defects impair fatty acid oxidation and lipid droplet homeostasis, linking cristae integrity to metabolic competence. Independently of apoptosis or proliferation, MCL-1 inhibition selectively depletes intermediate neural progenitors, indicating a direct role in lineage progression. Overall, the work positions MCL-1 as a key regulator of mitochondrial structure-metabolism coupling that instructs neural progenitor identity and human neurogenesis.

      Overall: The study does a good job of using (in most assays) caspase inhibition (e.g., QVD treatment) to block apoptotic responses induced by MCL-1 inhibition. As a result, many of the phenotypes caused by inhibition are likely to be independent of caspase activation. As a result, this manuscript would be of interest to researchers that study the topics of the BCL-2 family and cell death signaling, mitochondrial bioenergetics and dynamics, neurodevelopment, and cellular metabolism. However, as currently presented the manuscript is only descriptive and lacks mechanistic insight.

      We thank Reviewer 1 for the insightful evaluation of our work. We are encouraged that the reviewer finds the study relevant to investigators in the fields of BCL-2 family biology, mitochondrial dynamics and bioenergetics, neurodevelopment, and cellular metabolism. We also thank the reviewer for pointing out the need to increase the mechanistic insight of our findings. As mentioned above, in the revised manuscript, we are proposing to address this.

      Major Concerns:

      1) The authors only use a single MCL-1 inhibitor and never use other non-targeting BH3-mimetics (such as venetoclax) as negative controls. This seems like a missed opportunity to demonstrate that the phenotypes observed are MCL-1 dependent.

      This is an excellent point. We will include venetoclax (ABT-199) to examine its effect on intermediate progenitors (TBR2+) and early-born neurons (βIII-tubulin+).

      2) There is no mechanism proposed in this study other than reliance upon QVD as not affecting the phenotypes. As submitted, the manuscript only can speculate that these phenotypes are due to non-apoptotic roles of MCL-1 inhibition. The authors have missed an opportunity to explore MCL-1's non-apoptotic functions directly.

      Mechanistically, we propose MCL-1 is acting in 2 ways: 1) as we show in our data, MCL-1 inhibition causes disruption of the mitochondrial cristae, altering the microenvironment for fatty acid oxidation, and 2) as seen in cancer cells, MCL-1 inhibitors may also displace ACSL1 from the mitochondria.

      In the past few weeks, since receiving the initial reviews, we have focused on testing the 2nd possibility, since the accumulation of lipids was also seen in cancer cells (see PMID: 38503284). We have successfully generated stable ACSL1:HaloTag expressing hESCs (Figure 1A for reviewers). Our findings, included here, show that ACSL1 is displaced from the mitochondria by MCL-1 inhibition in NPCs (Figures 1B-C for reviewers).

      Other concerns exist that weaken the impact of the study.

      1. Figure 1 should include the fact that QVD inhibition (shown in Sup Fig 2) does not obviate the phenotype induced by pharmacological inhibition of MCL-1 on mitochondrial morphology.

      We would like to clarify that QVD does prevent the phenotypes induced by MCL-1 inhibition on mitochondrial morphology. In Fig 1B, we report an increase in volume and surface area at 24h and 48h along with a decrease in mitochondrial content at 48h when NPCs were treated with MCL-1i only. However, NPCs co-treated with QVD in Supp Fig 2B did not exhibit any significant morphological phenotypes on average or at min/max values. Reviewer 1 may be referring to Fig 1B's corresponding min/max values presented in Supp Fig 2A, where we reported an increase in max volume.

      | Figure # | Volume | Surface Area |
      | --- | --- | --- |
      | Fig 1B (MCL-1i only, avg values) | increase (avg vol) | increase (avg) |
      | Supp Fig 2B (MCL-1i+QVD) | no change | no change |
      | Supp Fig 2A (MCL-1i only, max/min values) | increase (max vol) | no change (max) |

      For clarity, we will move Supplementary Fig 2A into Supplementary Fig 1.

      Figure 2 would benefit from evidence that caspase inhibition does not repress the phenotype on mitochondrial cristae morphology (volume and area). Furthermore, the FIB-SEM data are very hard to appreciate as the size precludes visualization of individual mitochondria.

      While we included the visualization of the segmented mitochondria and cristae (Figure 2C), as well as snapshots through the z-stack for segmented cristae only (Figure 2E) and segmented mitochondria separately (Supp Figure 3A) in the original manuscript, we are also now attaching the FIB-SEM 3D reconstruction videos (New Supplementary Videos 1-2 for reviewers) (1. Mito and cristae, 2. Cristae only, 3. Mito only) for ease of visualization purposes.

      Figure 3 reports that MIC60 and OPA1 appear to be downregulated in response to MCL-1 inhibition, but these appear to be more significant only when QVD is added. Why would the phenotype be obscured in the non-QVD setting (Fig. 2B&C). How does MCL-1 inhibition lead to changes in MIC60/MICOS/OPA1? This seems quite preliminary at this point.

      In Figures 3B and 3C, we report decreased protein levels of short-form OPA1 and MIC10 only, not MIC60. We argue that our data with QVD show that the cell death function of MCL-1 (i.e., inhibiting cell death effectors from initiating the caspase cascade) is not the main trigger of the phenotypes we report (cristae dysregulation and fatty acid oxidation disruption). However, cells without functional cristae and/or with defects in FAO may not be able to survive long-term. Thus, QVD treatment preserves these cells that may not survive the dismantling of such an essential structure. To confirm this, we have performed immunofluorescence of cleaved caspase 3 (Figure 5 for reviewers). These results show that MCL-1 inhibition at the time points of our study indeed doesn’t result in increased activation of Caspase-3. We reported similar results of MCL-1 inhibition in oligodendrocyte precursor cells (Gil and Hanna et al., Glia, 2025, PMID: 41420072).

      The loss of MIC60 and OPA1 should repress electron transport chain function. Are such impacts observed in the cultured cells? This could be shown by assessing oxygen consumption, etc. Such data would enhance the authors' conclusion that MCL-1 inhibition leads to defects in mitochondrial physiology.

      We completely agree with this comment by Reviewer 1. In our revision, we will include an assessment of mitochondrial oxygen consumption rate, using the Seahorse analyzer (mitochondrial stress test), of NPCs treated with MCL-1i. Preliminary data (n=3) are currently presented as Figure 3 for reviewers. Interestingly, these data show a more nuanced cellular response. Consistent with our conclusion that MCL-1 inhibition does not cause apoptotic cell death, MCL-1i did not affect mitochondrial respiration at baseline. The specific deficits appear in spare respiratory capacity and maximal respiration, meaning cells can sustain routine mitochondrial function but lose the ability to respond to increased energetic demand. This suggests MCL-1 loss creates a mitochondrial reserve deficiency rather than a generalized bioenergetic failure. The results with caspase inhibitors show a near-zero OCR across both 24h and 48h timepoints, and significant reductions in maximal respiration, spare respiratory capacity, and non-mitochondrial OCR. Remarkably, these conditions are not detrimental to newborn neurons, as shown in Figure 7. This is very interesting because it suggests that, under severe bioenergetic failure, neural stem cells (PAX6+) can differentiate into newborn neurons in a TBR2-independent manner. More relevant to this study, our results unequivocally demonstrate that TBR2-positive cells depend on the non-apoptotic function of MCL-1.
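
      The distinction drawn above between baseline respiration and reserve capacity follows the usual mitochondrial stress test arithmetic. A minimal sketch in Python, assuming the standard definitions (basal = baseline OCR minus non-mitochondrial OCR; maximal = post-FCCP OCR minus non-mitochondrial OCR; spare capacity = maximal minus basal). The function name and all OCR values are hypothetical and are not data from this study:

```python
def mito_stress_parameters(baseline_ocr, post_fccp_ocr, post_rot_aa_ocr):
    """Derive stress-test parameters from three OCR readings (pmol O2/min).

    Assumption: OCR remaining after rotenone/antimycin A is treated as
    non-mitochondrial and subtracted from the other readings.
    """
    non_mito = post_rot_aa_ocr
    basal = baseline_ocr - non_mito
    maximal = post_fccp_ocr - non_mito
    spare = maximal - basal
    return {
        "non_mitochondrial": non_mito,
        "basal": basal,
        "maximal": maximal,
        "spare_capacity": spare,
    }


# Hypothetical readings, chosen only to illustrate the pattern described:
# unchanged basal respiration with reduced maximal and spare capacity.
control = mito_stress_parameters(100.0, 180.0, 20.0)
treated = mito_stress_parameters(100.0, 120.0, 20.0)

print(control["basal"], treated["basal"])
print(control["spare_capacity"], treated["spare_capacity"])
```

      With these illustrative numbers, basal respiration is identical in both conditions while spare capacity drops, which is the kind of reserve deficiency described in the reply.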

      In Figure 4, the differences between transcripts (qPCR data) and protein (immunoblot) data are often confusing and not well explained. Why do the authors propose that mRNA expression is decreasing whereas the protein expression is increasing? Example CPT1. Furthermore, it is unclear what these data mean functionally? Is this reflective of enhanced lipid oxidation or simply a response to inhibition of fatty acid oxidation? Clarification of the impact of these findings is necessary.

      We agree with Reviewer 1 that the results could be hard to interpret. However, the effects of MCL-1 inhibitors on the transcription of fatty acid oxidation genes have been widely cited by the work of Opferman and Walensky (PMID: 36198266). We speculate that the effects on transcription are triggered by mitochondrial signaling. The mechanistic insight into this phenomenon would be an interesting next step.

      In the case of CPT1, we addressed this comment and found that the difference is due to differential expression of isoforms. The RT-qPCR shown in Figure 4 is on CPT1c, whereas the western blot is on CPT1a. Unfortunately, after trying several products, we determined that there are no good antibodies for CPT1c. Thus, since we can’t compare gene and protein expression, we will include CPT1a RT-qPCR data to complement the western blot.

      The increase in lipid droplet number induced by MCL-1 inhibition has been previously documented, but it is unclear whether this increase is related to an inability to oxidize lipid (defective fatty acid oxidation) that leads to increases in the cellular abundance or whether this indicates that MCL-1 inhibition leads to enhanced storage. Do other inhibitors of fatty acid oxidation lead to similar increases in lipid droplet size and abundance? Does QVD inhibition affect this phenotype?

      This is a great point raised by Reviewer 1, and one we have also wondered about. We conducted an experiment using C16 BODIPY to address this point (Figure 6 for Reviewers). We observed no changes in C16 lipid droplet accumulation in count, volume, or surface area when cells were treated with MCL-1 inhibitor for 24 hours total, with or without a starvation period in the last 6 hours of treatment. However, we observed significant pan-lipid droplet accumulation in the same conditions. This contrast suggests that FAO of exogenous LC-fatty acids is not reliant on MCL-1. This finding does not rule out a requirement for MCL-1 in other FAO processes, especially given the major limitation on how much C16 BODIPY (fluorescent palmitate) can be administered to the cells (10µM), which was 10-fold less than what we exogenously supplied to the cells for the pan-BODIPY experiment (100µM, see Figure 5). It is entirely possible that this small dose was not enough to detect any lipid droplet accumulation.

      We have now also included experiments using etomoxir and perhexiline to assess their effects on TBR2/PAX6 (Figure 2 for reviewers). The results indicate that inhibiting the FAO pathway does not fully mimic the effects of MCL-1i on TBR2. However, we show that MCL-1i displaces ACSL1 from the mitochondria, a step that is upstream of CPT1/2. We suggest a model in which the coordinated non-apoptotic function of MCL-1 at the outer mitochondrial membrane promotes ACSL1 activity and, in the inner mitochondrial membrane, regulates mitochondrial cristae morphology. While our data point to this model, we are limited by the tools to investigate it further, but it will be a great direction for future experiments.

      For Figure 6, while these data may be very meaningful, as presented they are very hard to appreciate. Insets that show the neuronal populations would help to convey the point that the differentiation is impacted. Also, are there other methods that could confirm these observations (qPCR to show changes in differentiation).

      We agree with Reviewer 1. In the new version of the manuscript, we will include panels that zoom into the cell populations we quantified. The current panels will go to a new Supplemental figure. We will also add the TUBB3 to the qPCR panel in the new version.

      Figure 7 is also very hard to appreciate. What is the reader to see? Can these be quantified? It seems that QVD may be rescuing in this figure, does this suggest that MCL-1 inhibition might be inducing death. All of this needs to be quantified.

      We will provide quantification of BIII tubulin branching, and it will be included next to the images provided.

      BCL-XL has also been implicated in affecting mitochondrial electron transport chain function (See PMID: 19255249, 21926988, 21987637). Can BCL-XL inhibitors affect any of the phenotypes associated here?

      We will include experiments to test the effect of BCL-2 and BCL-XL inhibitors on TBR2 cells to address this comment.

      Please carefully avoid using the term "MCL-1 loss" when talking about pharmacological inhibition. Only genetic ablation (e.g. knockout, silencing, etc.) should be termed loss.

      We have now removed the reference to MCL-1 loss in line 199.

      Reviewer #1 (Significance (Required)):

      The study advances in human cells the impacts of MCL-1 inhibition. They replicate many impacts previously observed in mouse systems and refine analyses to impacts on MICOS complex, lipid droplet storage, and neuronal differentiation. While these findings are important and would be well received by a wide audience, the study fails to provide almost any mechanistic insight into how these phenotypes are being induced. The only common theme is that blocking caspase activation in many assays fails to block the phenotype.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript by Hanna et al. investigates non-apoptotic roles of MCL-1 in human neural stem cells and connects MCL-1 inhibition to mitochondrial cristae formation and beta-oxidation. Connecting these roles to brain development, the authors also show a reduction in the number of progenitor cells upon MCL-1 inhibition, independently of caspase activity. Throughout their work, the authors make use of an impressive array of imaging techniques. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored.

      We thank Reviewer 2 for the thoughtful and positive assessment of our manuscript. We appreciate the reviewer’s recognition that our study reveals non-apoptotic roles of MCL-1 in human neural stem cells. We are also grateful for the acknowledgment of the imaging approaches employed, which allowed us to connect MCL-1 function to cristae architecture with multiple complementary techniques. We acknowledge the reviewer’s point that the mechanistic basis by which MCL-1 influences cristae structure remains insufficiently defined. In the revised manuscript, we will clarify the limitations of the current data, expand our discussion of potential mechanisms, and incorporate additional analyses to identify downstream effectors that mediate these structural and metabolic changes.

      Major comments:

      - In Fig. 1B, the very same representative images are shown for both conditions (DMSO and S63845) at 48 hours.

      We deeply appreciate Reviewer 2 for catching this unintentional duplication that occurred during figure preparation. We have now corrected this issue.

      - For Western Blot analysis, it looks like the authors only quantified the band density of their proteins of interest without considering varying levels of control protein (Actin) levels. Normalizing the protein levels to actin would account for any differences in loaded protein amounts (although a Ponceau staining might be preferable still to exclude this). This is especially relevant for Fig. 4E, where actin levels visibly differ between the conditions.

      All WB quantifications were normalized to Actin (this detail is now added to the y-axis of all band density graphs and figure legends). In addition, we will transform the data to a logarithmic scale to “normalize” for gel-to-gel variability.
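
      The normalization described above can be sketched as follows. This is a minimal illustration in Python, assuming that each band density is divided by the Actin density from the same lane and that the logarithmic transform is a log2 ratio of treated to control (the base is our assumption, since it is not stated above). The function names and densitometry values are hypothetical:

```python
import math


def normalize_band(target_density, actin_density):
    """Return the target band density relative to the Actin loading control."""
    if actin_density <= 0:
        raise ValueError("Actin density must be positive")
    return target_density / actin_density


def log2_fold_change(treated_ratio, control_ratio):
    """Log2 of treated over control, both already normalized to Actin."""
    return math.log2(treated_ratio / control_ratio)


# Hypothetical densitometry values (arbitrary units), for illustration only.
control = normalize_band(target_density=1200.0, actin_density=2400.0)
treated = normalize_band(target_density=600.0, actin_density=2400.0)
lfc = log2_fold_change(treated, control)

print(control, treated, lfc)
```

      Working with ratios on a log scale makes a halving and a doubling symmetric around zero, which is one reason log-transformed values are often used to compare replicates run on different gels.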

      - The authors offer evidence that MCL-1 inhibition impedes proteolytic cleavage of OPA1-L into the OPA-1-S isoforms, yet do not explore the mechanism behind this. Since OPA1 is cleaved by both OMA1 and YME1L, determination of the levels of these proteases could help shed some light on the mechanism leading to cristae reorganization.

      We will follow up on Reviewer 2's comment with a WB analysis of OMA1 and YME1L in cells treated with an MCL-1 inhibitor.

      - Generally speaking, while the authors show all those effects (cristae defects, FAO dysfunction) upon MCL-1 inhibition, it would be interesting to see whether any of those effects can be rescued by blocking FA import, e.g. through carnitine palmitoyltransferase 1a (CPT1a) inhibition with etomoxir, to understand if they are downstream of altered FA supply. This could affect cristae morphology through altered cardiolipin biogenesis.

      This is an excellent point, which was also raised by Reviewer 1. We have now included experiments using etomoxir and perhexiline to assess their effects on TBR2/PAX6 (Figure 2 for Reviewers). As mentioned above, the results indicate that inhibiting the FAO pathway does not fully mimic the effects of MCL-1i on TBR2. However, we show that MCL-1i displaces ACSL1 from the mitochondria, a step that is upstream of CPT1 and 2. We suggest a model in which the coordinated non-apoptotic function of MCL-1 at the outer mitochondrial membrane promotes ACSL1 activity and, in the inner mitochondrial membrane, regulates mitochondrial cristae morphology. While our data point to this model, we are limited by the tools to investigate it further, but it will be a great direction for future experiments. The suggestion of Reviewer 2 that the effects on FAO could impact cardiolipin biogenesis is a very exciting possibility. However, it is difficult to test with the tools available.

      - In line 262 the authors discuss that mitochondria lose metabolic function upon MCL-1 inhibition. This claim would require additional experiments. While the authors look at lipid droplet accumulation and FAO enzymes, there are many more aspects to mitochondrial metabolic function that should be investigated. While measuring the oxygen consumption rate via Seahorse might require additional resources (optional), measurements of ATP production, ROS generation or determination of the mitochondrial membrane potential should be feasible.

      We fully agree with Reviewer 2's comment, which was also raised by Reviewer 1. In our revision, we will include an assessment of the mitochondrial oxygen consumption rate of NPCs treated with MCL-1i, measured using the Seahorse analyzer (mitochondrial stress test). These data are presented as Figure 3 for reviewers. Interestingly, these data show a more nuanced cellular response. While MCL-1i does not globally collapse mitochondrial respiration at baseline, the specific deficits appear in spare respiratory capacity and maximal respiration, meaning cells can sustain routine mitochondrial function but lose the ability to respond to increased energetic demand. This suggests MCL-1 loss creates a mitochondrial reserve deficiency rather than a generalized bioenergetic failure. The results with caspase inhibitors show a near-zero OCR across both 24h and 48h timepoints, and significant reductions in maximal respiration, spare respiratory capacity, and non-mitochondrial OCR. These conditions are detrimental to TBR2-positive NPCs (Figure 6), but not to newborn neurons (Figure 7).

      - While the authors "propose a model in which MCL-1 associates with MICOS", they do not offer direct scientific evidence to support this hypothesis. Co-immunoprecipitation experiments or e.g. proximity ligation assays would better support the proposed model.

      We agree with this statement. As a preliminary step, we have performed proximity ligation assays and immunoprecipitation analyses to test for this interaction (see below and Figure 4 for reviewers), and the results indicate an interaction, albeit a very weak one. In the revised version of the manuscript, we will attempt to repeat these experiments with MCL-1i.

      - While Fig. 7 shows representative images, quantification e.g. for the truncation of neuronal processes is missing.

      We will provide quantification of BIII tubulin branching, which will be included alongside the images provided.

      - In lines 219f. the authors state that they "observed a significant downregulation of PAX6 and EOMES at 24 hours that was not rescued by QVD co-treatment". While there is still a trend towards a downregulation, there is no statistical significance anymore. In fact, PAX6 levels almost mirror those of SOX2 which is not described as "downregulated" by the authors. In order to be more consistent, I would suggest rephrasing this part, or at least reword it to be less absolute.

      In the new version, we will clarify that while QVD rescued TBR2 and PAX6 transcript levels at 24h, it did not rescue them at 48h. We will also mention the downregulation of SOX2 at 48h that persists with co-treatment.

      - Brinkmann et al. (2025) also investigated cristae structure upon MCL-1 deletion in vivo and found no effect when MCL-1 was replaced with other Bcl-2 family members. It would be interesting to combine MCL-1 inhibition with overexpression of MCL-1 versus BCL-XL to reconsolidate some of the discrepant findings.

      While this is a great suggestion for future studies, there are some complications. Specifically, it is likely that the inhibitor may also target the overexpressed MCL-1 and thus, a mutant form is needed.

      To address this, we generated a Flag-tagged MCL-1 construct with a mutated BH3 domain, previously described by Kotschy et al. Nature 2016. We validated the construct in HeLa cells, but unfortunately the mutant protein appears to be significantly less stable than the WT construct, complicating analysis of this experiment.

      Minor comments:

      - In Supp. Fig. 1C the MCL-1 protein is shown both to run above 37kDa (upper panel) and below 37 kDa (lower panel). Could the authors please comment on why this is the case?

      The observed variation is caused by drift in the gel during electrophoresis. In Fig 1C, the protein ladder is on the edge of the gel, whereas in Fig 1E, the protein ladder is in the middle of the gel, and the last sample is on the edge and also exhibits edge drift.

      - In line 64 of the introduction the authors mention clinical trials yet do not give a citation for these trials making it hard to judge whether the content of these trials is actually related to the brain.

      This information is anecdotal, based on an Amgen press release.

      - MCL-1 as well as ACSL-1 are sometimes written without the hyphen both in the text and figures.

      We will carefully check the manuscript before submission.

      - Lines 92-94 and 106-108 essentially highlight the same existing knowledge gap. Maybe the content of these two paragraphs could be combined in order to avoid repetition.

      We thank Reviewer 2 for this suggestion. We will do this in the new version of the manuscript.

      - In Fig. 1A, the authors provide a schematic for their experimental design. While the figure legend is very thorough, some of this information (like the days of collection) could also be included in the figure itself. The same is true for schematics in the following figures.

      We agree with this and will incorporate the suggestion in the new version.

      - Fig. 2A includes a typo (analyze) but would maybe also be more suitable for the supplement figures or could even be combined with Fig. 1A as not much new content is added.

      We already incorporated these changes in the new version of the manuscript.

      - Regarding statistical analysis, could the authors please comment on why they did not consider one-sample t-tests suitable for the cases where control values were set at 1 (e.g. Fig. 4B, C for the relative expression).

      This is a valid suggestion. We will rerun RT-qPCR data using a one-sample t-test.
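
      For illustration, a one-sample t-test against a fixed control value of 1 can be sketched as below. This is a minimal Python example using only the standard library, so it returns the t statistic and degrees of freedom and leaves the p-value to a t-distribution lookup (a statistics package such as SciPy provides this directly via `scipy.stats.ttest_1samp`). The function name and fold-change values are hypothetical, not data from the manuscript:

```python
import math
import statistics


def one_sample_t(values, popmean):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == popmean."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("t statistic is undefined when all values are identical")
    standard_error = sd / math.sqrt(n)
    t_stat = (statistics.mean(values) - popmean) / standard_error
    return t_stat, n - 1


# Hypothetical relative expression values from three biological replicates,
# each expressed relative to a control that was set to 1.
fold_changes = [0.5, 0.6, 0.7]
t_stat, dof = one_sample_t(fold_changes, popmean=1.0)

print(round(t_stat, 3), dof)
```

      The test compares the treated replicates with the constant 1 rather than with a control group of zero variance, which is why it suits data in which the control has been fixed by normalization.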

      - In lines 247f. the authors state that "inhibition of MCL-1 leads to [...] and disassembly of the MICOS complex as well as OPA1". This sounds like OPA1 is still cleaved upon MCL-1, which is not at all what the authors showed and further discuss. Rewording of the sentence would help in avoiding any misunderstandings.

      We agree with this comment and have now reworded the paragraph: “Inhibition of MCL-1 leads to structural collapse of the cristae likely due to the possible disassembly of the MICOS complex, as suggested by decreased MIC10 levels, and interruption of OPA1 cleavage, as suggested by decreased short-form OPA1, two scaffolds required for cristae maintenance.”

      - In lines 210f. the authors state that "quantitative imaging increased the average and maximum volume of lipid droplets". While there is definitely a trend towards an increase for the maximum volume, the increase is in fact not statistically significant. This should be reflected in the wording.

      We have reworded this to “Quantitative imaging revealed a significant increase in average lipid droplet volume and a trending increase in maximum volume of lipid droplets.”

      - In Fig. 6 the overlap between TBR2 and PAX6 is hard to judge when printed out. Including a zoom-in may make it easier to judge.

      We agree with Reviewer 2. In the new version of the manuscript, we will include panels that zoom into the cell populations we quantified. The current panels will go to a new Supplemental figure. We will also add the TUBB3 to the qPCR panel in the new version.

      - In Fig. 7 the color-coding is listed in the figure legend but is missing from the figure itself. If the authors could include this, as they did for the other figures, it would further improve this figure.

      We agree. We have specified the channel color in the new figure.

      - Line 238 should reference Fig. 7A, as Fig 7B does not exist.

      Thanks for catching this. It is already corrected.

      - In the figure legends the authors state that biological replicates were used. Were technical replicates also performed?

      Yes, technical replicates were performed for RT-qPCR.

      Reviewer #2 (Significance (Required)):

      The authors make use of a wide array of imaging techniques to further elucidate non-apoptotic roles of MCL-1. The study has the potential to offer new insights into mitochondrial biology on the level of basic research rather than translational. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored. Nevertheless, the study offers additional knowledge on the role of MCL-1 in human neural stem cells, whereas previous research mostly focused on cardiomyocytes or cancer cells.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study, Gama et al. describe a non-canonical role for the anti-apoptotic protein Myeloid Cell Leukemia-1 (MCL-1) in mitochondrial cristae organization and suggest a role of MCL-1 in regulating metabolism and neuronal differentiation. Using fluorescence microscopy imaging and electron microscopy, the authors show changes to mitochondrial morphology upon treatment with MCL-1 inhibitor S63845. MCL-1 inhibition results in altered protein and transcript levels of some key proteins involved in mitochondrial cristae organization and fatty acid metabolism. While some of the findings are interesting and indeed point towards a non-canonical role of MCL-1, several key conclusions of the authors are not sufficiently supported by the data shown in the manuscript.

      We thank Reviewer 3 for the careful evaluation of our manuscript. We appreciate the reviewer’s recognition that our study identifies a potential non-canonical role for MCL-1 in mitochondrial cristae organization, metabolism, and neuronal differentiation. As with Reviewers 1 and 2, we are encouraged that the reviewer finds these observations interesting and suggestive of previously unappreciated functions for MCL-1. We agree that stronger evidence is required to firmly link MCL-1 inhibition to specific changes in MICOS organization and metabolic regulation. In the revised manuscript, we will (i) more clearly distinguish between observations and mechanistic inferences, (ii) temper conclusions where appropriate, and (iii) incorporate additional analyses and controls to better substantiate the proposed model.

      Major comments:

      1. The authors try to disentangle the apoptotic and non-apoptotic role of MCL-1 through addition of a caspase inhibitor. However, I am not convinced that phenotypes found under the addition of caspase inhibitor are necessarily caused by non-canonical functions independent of apoptosis. It could also be that the observed changes happen upstream of caspase activation. In addition, many of the described findings, such as CPT1 expression changes, only happen in the presence of the caspase inhibitor. If one follows the logic of the authors, changes associated by non-canonical MCL-1 functions should happen under MCL-1 inhibition and caspase inhibition, but not with MCL-1 inhibition only.

      The reviewer is right that we expected non-canonical functions to happen under MCL-1 inhibition and caspase inhibition. Our data with QVD show that the cell death function of MCL-1 (i.e., inhibiting cell death effectors from initiating the caspase cascade) is not the main trigger of the phenotypes we report (cristae dysregulation and fatty acid oxidation disruption). However, cells without functional cristae and/or with defects in FAO may not be able to survive long-term. Thus, QVD treatment preserves these cells that may not survive the dismantling of such an essential structure. To confirm this, we performed immunofluorescence of cleaved caspase 3 (Figure 5 for reviewers). These results show that, indeed, MCL-1 inhibition at the time points of our study doesn’t result in increased Caspase-3 activation. We reported similar results of MCL-1 inhibition in oligodendrocyte precursor cells (Gil and Hanna et al., Glia, 2025, PMID: 41420072).

      The authors show no data on the viability of the cells in response to the MCL-1 inhibitor. To exclude secondary effects of the inhibitor, at least some of the results should be validated with an MCL-1 knock down.

      We will include this experiment in our revised manuscript. To check the effects of MCL-1 knockdown on TBR2 positive cells, we tested 5 different ASOs for MCL-1. Knockdown efficiency with ASOs was very low (on average

      In Figure 1, the authors show immunofluorescence data of mitochondria and nucleus staining and conclude that MCL-1 inhibition alters mitochondrial morphology. Based on the images shown in Fig. 1, I do not think that individual mitochondria can be segmented to measure their volume and length. In addition, some metrics such as mitochondrial content are not explained in the text or methods.

      We can achieve mitochondrial segmentation with a SoRa Spinning Disk Confocal Microscope, which has a lateral (XY) resolution of approximately 120 nm to 150 nm and an axial (Z) resolution of approximately 300 nm–320 nm. All images are first denoised prior to sharpening using the Richardson-Lucy deconvolution algorithm. Additionally, the FIB-SEM data are consistent with the IF data (both show increase in mitochondrial volume and surface area).

      We agree with Reviewer 3 that we need to explain some metrics in the revised version. We will specify the meaning of mitochondrial content (count of all mitochondria in FOV, not normalized to Hoechst).

      In Fig. 2 B-D, the authors show TEM and FIB-SEM imaging to demonstrate alterations in the cristae architecture upon treatment with MCL-1 inhibitor. However, based on the images shown, it looks that cristae area and density is reduced under S63845 treatment in TEM images, while the FIB-SEM data come to the opposite conclusion. In addition, the quantification of cristae volume quantified as cristae volume in percentage is unclear to me.

      We apologize for the confusion. No conclusions about the cristae area and density were made using the TEM data, because TEM data represent a single snapshot section of a mitochondrion without a discernible orientation. Cristae in the TEM images were described as “aberrant”. These preliminary observations of cristae changes were followed up with FIB-SEM, 3D reconstruction of intact mitochondria, and quantification of volume.

      In the new version of the manuscript, we will specify that the cristae volume is normalized to the volume of its respective mitochondria (i.e., how much of the mitochondrial volume is attributed to cristae).
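
      As a worked illustration of this normalization, the sketch below expresses cristae volume as a percentage of the volume of the mitochondrion that contains it. The function name, units, and values are hypothetical and only show the arithmetic; they are not measurements from the FIB-SEM data:

```python
def cristae_volume_percent(cristae_volume, mitochondrion_volume):
    """Percentage of a mitochondrion's volume that is attributed to cristae."""
    if mitochondrion_volume <= 0:
        raise ValueError("mitochondrion volume must be positive")
    if not 0 <= cristae_volume <= mitochondrion_volume:
        raise ValueError("cristae volume must lie between 0 and the mitochondrion volume")
    return 100.0 * cristae_volume / mitochondrion_volume


# Hypothetical segmented volumes (same arbitrary unit for both arguments).
percent = cristae_volume_percent(cristae_volume=0.5, mitochondrion_volume=2.0)

print(percent)
```

      Normalizing per mitochondrion keeps the metric comparable between conditions even when overall mitochondrial volume changes, as it does under MCL-1 inhibition.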

      The change in CPT1/2 protein levels (Fig. 4) is interesting but does not directly prove that fatty acid oxidation is altered, as concluded by the authors. For this, the authors would need to directly measure fatty acid oxidation, for example using Seahorse or metabolic tracing experiments. Also, to prove that the MCL-1 inhibition affects neural differentiation through fatty acid oxidation, a rescue experiment should be performed through CPT1 overexpression.

      We agree that this is an important point. We have optimized the fatty acid oxidation test using Seahorse and will make sure to include it in the revised version of the manuscript.

      In Figure 6, the authors show decreased intermediate progenitor cells after MCL-1 inhibition by immunofluorescence staining. I am not convinced that this can be concluded from the data shown, since the concentration of intermediate progenitor cells is very close to the noise levels. Since the MCL-1 treated cells look much less sparse, I don't think the percentages can be compared (total counts are between 2-20). Although this data might give some indication that differentiation could be impaired, the measured effect could be very well due to lower viability of the cells. The authors need to control for this or come up with a different method for measuring differentiation.

      The number of TBR2-positive cells is low, but we disagree with the reviewer’s assessment of noise levels. We focused on cells expressing only TBR2 and rigorously examined this population of cells. The percentages are compared to account for the lower density of the MCL-1i-treated cultures, as the IPC counts are normalized to the Hoechst total cell count within the FOV. Moreover, the immunofluorescence images are complemented with RT-qPCR, which shows significant downregulation of EOMES (gene encoding TBR2).

      Figure 7 is missing quantification

      We will include this quantification in the revised version of the manuscript.

      Reviewer #3 (Significance (Required)):

      General assessment: The manuscript reports an interesting finding, which suggests a non-canonical role of MCL-1 in mitochondrial remodeling, regulation of fatty acid oxidation and neuronal fate. While this finding would be highly interesting and relevant, the presented data do not sufficiently support this conclusion. Further experiments would have to be performed to prove causality.

      Advance: Should the authors manage to prove their hypothesis by additional experiments, this would indeed advance the field on mitochondrial remodeling and its effect on neuronal differentiation by identifying a novel molecular player.

      Audience: mitochondrial biology, cell biology, developmental neuroscience

      Own expertise: mitochondrial biology, cell biology, advanced imaging techniques

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this study, Gama et al. describe a non-canonical role for the anti-apoptotic protein Myeloid Cell Leukemia-1 (MCL1) in mitochondrial cristae organization and suggest a role of MCL1 in regulating metabolism and neuronal differentiation. Using fluorescence microscopy imaging and electron microscopy, the authors show changes to mitochondrial morphology upon treatment with MCL1 inhibitor S63845. MCL1 inhibition results in altered protein and transcript levels of some key proteins involved in mitochondrial cristae organization and fatty acid metabolism. While some of the findings are interesting and indeed point towards a non-canonical role of MCL1, several key conclusions of the authors are not sufficiently supported by the data shown in the manuscript.

      Major comments:

      1. The authors try to disentangle the apoptotic and non-apoptotic roles of MCL1 through addition of a caspase inhibitor. However, I am not convinced that phenotypes found under the addition of caspase inhibitor are necessarily caused by non-canonical functions independent of apoptosis. It could also be that the observed changes happen upstream of caspase activation. In addition, many of the described findings, such as CPT1 expression changes, only happen in the presence of the caspase inhibitor. If one follows the logic of the authors, changes associated with non-canonical MCL1 functions should happen under MCL1 inhibition and caspase inhibition, but not with MCL1 inhibition only.
      2. The authors show no data on the viability of the cells in response to the MCL1 inhibitor. To exclude secondary effects of the inhibitor, at least some of the results should be validated with an MCL1 knock down.
      3. In Figure 1, the authors show immunofluorescence data of mitochondria and nucleus staining and conclude that MCL1 inhibition alters mitochondrial morphology. Based on the images shown in Fig. 1, I do not think that individual mitochondria can be segmented to measure their volume and length. In addition, some metrics such as mitochondrial content are not explained in the text or methods.
      4. In Fig. 2 B-D, the authors show TEM and FIB-SEM imaging to demonstrate alterations in the cristae architecture upon treatment with MCL1 inhibitor. However, based on the images shown, it looks as though cristae area and density are reduced under S63845 treatment in the TEM images, while the FIB-SEM data come to the opposite conclusion. In addition, the quantification of cristae volume as a percentage is unclear to me.
      5. The change in CPT1/2 protein levels (Fig. 4) is interesting but does not directly prove that fatty acid oxidation is altered, as concluded by the authors. For this, the authors would need to directly measure fatty acid oxidation, for example using Seahorse or metabolic tracing experiments. Also, to prove that the MCL1 inhibition affects neural differentiation through fatty acid oxidation, a rescue experiment should be performed through CPT1 overexpression.
      6. In Figure 6, the authors show decreased intermediate progenitor cells after MCL1 inhibition by immunofluorescence staining. I am not convinced that this can be concluded from the data shown, since the concentration of intermediate progenitor cells is very close to the noise levels. Since the MCL1 treated cells look much less sparse, I don't think the percentages can be compared (total counts are between 2-20). Although this data might give some indication that differentiation could be impaired, the measured effect could be very well due to lower viability of the cells. The authors need to control for this or come up with a different method for measuring differentiation.
      7. Figure 7 is missing quantification

      Significance

      General assessment: The manuscript reports an interesting finding, which suggests a non-canonical role of MCL1 in mitochondrial remodeling, regulation of fatty acid oxidation and neuronal fate. While this finding would be highly interesting and relevant, the presented data do not sufficiently support this conclusion. Further experiments would have to be performed to prove causality.

      Advance: Should the authors manage to prove their hypothesis by additional experiments, this would indeed advance the field on mitochondrial remodeling and its effect on neuronal differentiation by identifying a novel molecular player.

      Audience: mitochondrial biology, cell biology, developmental neuroscience

      Own expertise: mitochondrial biology, cell biology, advanced imaging techniques

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript by Hanna et al. investigates non-apoptotic roles of MCL-1 in human neural stem cells and connects MCL-1 inhibition to mitochondrial cristae formation and beta-oxidation. Connecting these roles to brain development, the authors also show a reduction in the number of progenitor cells upon MCL-1 inhibition, independently of caspase activity. Throughout their work, the authors make use of an impressive array of imaging techniques. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored.

      Major comments:

      • In Fig. 1B, the very same representative images are shown for both conditions (DMSO and S63845) at 48 hours.
      • For Western blot analysis, it looks like the authors only quantified the band density of their proteins of interest without considering varying levels of the control protein (actin). Normalizing the protein levels to actin would account for any differences in loaded protein amounts (although a Ponceau staining might still be preferable to exclude this). This is especially relevant for Fig. 4E, where actin levels visibly differ between the conditions.
      • The authors offer evidence that MCL-1 inhibition impedes proteolytic cleavage of OPA1-L into the OPA-1-S isoforms, yet do not explore the mechanism behind this. Since OPA1 is cleaved by both OMA1 and YME1L, determination of the levels of these proteases could help shed some light on the mechanism leading to cristae reorganization.
      • Generally speaking, while the authors show all those effects (cristae defects, FAO dysfunction) upon MCL-1 inhibition, it would be interesting to see whether any of those effects can be rescued by blocking FA import, e.g. through carnitine palmitoyltransferase 1a (CPT1a) inhibition with etomoxir, to understand if they are downstream of altered FA supply. This could affect cristae morphology through altered cardiolipin biogenesis.
      • In line 262 the authors discuss that mitochondria lose metabolic function upon MCL-1 inhibition. This claim would require additional experiments. While the authors look at lipid droplet accumulation and FAO enzymes, there are many more aspects to mitochondrial metabolic function that should be investigated. While measuring the oxygen consumption rate via Seahorse might require additional resources (optional), measurements of ATP production, ROS generation or determination of the mitochondrial membrane potential should be feasible.
      • While the authors "propose a model in which MCL-1 associates with MICOS", they do not offer direct scientific evidence to support this hypothesis. Co-immunoprecipitation experiments or e.g. proximity ligation assays would better support the proposed model.
      • While Fig. 7 shows representative images, quantification e.g. for the truncation of neuronal processes is missing.
      • In lines 219f. the authors state that they "observed a significant downregulation of PAX6 and EOMES at 24 hours that was not rescued by QVD co-treatment". While there is still a trend towards a downregulation, there is no statistical significance anymore. In fact, PAX6 levels almost mirror those of SOX2 which is not described as "downregulated" by the authors. In order to be more consistent, I would suggest rephrasing this part, or at least reword it to be less absolute.
      • Brinkmann et al. (2025) also investigated cristae structure upon MCL-1 deletion in vivo and found no effect when MCL-1 was replaced with other Bcl-2 family members. It would be interesting to combine MCL-1 inhibition with overexpression of MCL-1 versus BCL-XL to reconcile some of the discrepant findings.
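
The loading-control normalization raised in the Western blot comment above can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names and the densitometry values are hypothetical, and it simply divides each target band by the actin band from the same lane before expressing the result relative to the control lane.

```python
def normalize_to_loading_control(target, control):
    """Divide each target band density by the loading-control density of the same lane."""
    if len(target) != len(control):
        raise ValueError("target and control must have one value per lane")
    return [t / c for t, c in zip(target, control)]


def relative_to_first(values):
    """Express every value as a fold change relative to the first (control) lane."""
    reference = values[0]
    return [v / reference for v in values]


# Hypothetical band densities in arbitrary units (lane order: DMSO, S63845).
target_density = [1000.0, 1500.0]
actin_density = [2000.0, 1000.0]

raw_fold_change = relative_to_first(target_density)
normalized = normalize_to_loading_control(target_density, actin_density)
normalized_fold_change = relative_to_first(normalized)
```

With these made-up numbers the raw fold change and the actin-normalized fold change differ, which is why unequal loading-control bands matter for the interpretation.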

      Minor comments:

      • In Supp. Fig. 1C the MCL-1 protein is shown both to run above 37kDa (upper panel) and below 37 kDa (lower panel). Could the authors please comment on why this is the case?
      • In line 64 of the introduction the authors mention clinical trials yet do not give a citation for these trials making it hard to judge whether the content of these trials is actually related to the brain.
      • MCL-1 as well as ACSL-1 are sometimes written without the hyphen both in the text and figures.
      • Lines 92-94 and 106-108 essentially highlight the same existing knowledge gap. Maybe the content of these two paragraphs could be combined in order to avoid repetition.
      • In Fig. 1A, the authors provide a schematic for their experimental design. While the figure legend is very thorough, some of this information (like the days of collection) could also be included in the figure itself. The same is true for schematics in the following figures.
      • Fig. 2A includes a typo (analyze) but would maybe also be more suitable for the supplement figures or could even be combined with Fig. 1A as not much new content is added.
      • Regarding statistical analysis, could the authors please comment on why they did not consider one-sample t-tests suitable for the cases where control values were set at 1 (e.g. Fig. 4B, C for the relative expression).
      • In lines 247f. the authors state that "inhibition of MCL-1 leads to [...] and disassembly of the MICOS complex as well as OPA1". This sounds like OPA1 is still cleaved upon MCL-1 inhibition, which is not at all what the authors showed and further discuss. Rewording the sentence would help avoid any misunderstandings.
      • In lines 210f. the authors state that "quantitative imaging increased the average and maximum volume of lipid droplets". While there is definitely a trend towards an increase for the maximum volume, the increase is in fact not statistically significant. This should be reflected in the wording.
      • In Fig. 6 the overlap between TBR2 and PAX6 is hard to judge when printed out. Including a zoom-in may make it easier to judge.
      • In Fig. 7 the color-coding is listed in the figure legend but is missing from the figure itself. If the authors could include this, as they did for the other figures, it would further improve this figure.
      • Line 238 should reference Fig. 7A, as Fig 7B does not exist.
      • In the figure legends the authors state that biological replicates were used. Were technical replicates also performed?
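
For reference, the one-sample t-test mentioned in the comment on statistical analysis can be sketched as below. This is a hedged illustration with hypothetical relative-expression values, assuming the control is fixed at 1 and the treated replicates are tested against that constant. It computes only the t statistic and degrees of freedom with the standard library; the p-value would then be read from a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, popmean):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == popmean."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are required")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t_stat = (mean - popmean) / (sd / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical relative expression of treated samples; control is defined as 1.
treated_relative_expression = [0.6, 0.7, 0.8]
t_stat, dof = one_sample_t(treated_relative_expression, popmean=1.0)
```

Because the control group has no variance once it is set to 1, a two-sample test would treat that constant as data, whereas this form tests the treated replicates against the fixed reference value.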

      Significance

      The authors make use of a wide array of imaging techniques to further elucidate non-apoptotic roles of MCL-1. The study has the potential to offer new insights into mitochondrial biology on the level of basic research rather than translational. While the methods used offer sufficient evidence to connect MCL-1 inhibition to cristae architecture, the mechanistic underpinnings of this effect remain unexplored. Nevertheless, the study offers additional knowledge on the role of MCL-1 in human neural stem cells, whereas previous research mostly focused on cardiomyocytes or cancer cells.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This study claims that beyond its canonical anti-apoptotic function, MCL-1 has essential non-apoptotic roles in human neurodevelopment. Pharmacologic inhibition of MCL-1 in human neural stem cells disrupts mitochondrial inner membrane architecture by destabilizing cristae and the OPA1-MICOS complex, leading to swollen mitochondria with disorganized cristae. These structural defects impair fatty acid oxidation and lipid droplet homeostasis, linking cristae integrity to metabolic competence. Independently of apoptosis or proliferation, MCL-1 inhibition selectively depletes intermediate neural progenitors, indicating a direct role in lineage progression. Overall, the work positions MCL-1 as a key regulator of mitochondrial structure-metabolism coupling that instructs neural progenitor identity and human neurogenesis.

      Overall: The study does a good job of using (in most assays) caspase inhibition (e.g., QVD treatment) to block apoptotic responses induced by MCL-1 inhibition. As a result, many of the phenotypes caused by inhibition are likely to be independent of caspase activation. This manuscript would therefore be of interest to researchers who study the BCL-2 family and cell death signaling, mitochondrial bioenergetics and dynamics, neurodevelopment, and cellular metabolism. However, as currently presented, the manuscript is only descriptive and lacks mechanistic insight.

      Major Concerns:

      1) The authors only use a single MCL-1 inhibitor and never use other non-targeting BH3-mimetics (such as venetoclax) as negative controls. This seems like a missed opportunity to demonstrate that the phenotypes observed are MCL-1 dependent.

      2) There is no mechanism proposed in this study other than reliance upon QVD as not affecting the phenotypes. As submitted, the manuscript only can speculate that these phenotypes are due to non-apoptotic roles of MCL-1 inhibition. The authors have missed an opportunity to explore MCL-1's non-apoptotic functions directly.

      Other concerns exist that weaken the impact of the study.

      1. Figure 1 should include the fact that QVD inhibition (shown in Sup Fig 2) does not obviate the phenotype induced by pharmacological inhibition of MCL-1 on mitochondrial morphology.
      2. Figure 2 would benefit from evidence that caspase inhibition does not repress the phenotype on mitochondrial cristae morphology (volume and area). Furthermore, the FIB-SEM data are very hard to appreciate as the size precludes visualization of individual mitochondria.
      3. Figure 3 reports that MIC60 and OPA1 appear to be downregulated in response to MCL-1 inhibition, but these appear to be more significant only when QVD is added. Why would the phenotype be obscured in the non-QVD setting (Fig. 2B&C)? How does MCL-1 inhibition lead to changes in MIC60/MICOS/OPA1? This seems quite preliminary at this point.
      4. The loss of MIC60 and OPA1 should repress electron transport chain function. Are such impacts observed in the cultured cells? This could be shown by assessing oxygen consumption, etc. Such data would enhance the authors' conclusion that MCL-1 inhibition leads to defects in mitochondrial physiology.
      5. In Figure 4, the differences between transcript (qPCR) and protein (immunoblot) data are often confusing and not well explained. Why do the authors propose that mRNA expression is decreasing whereas the protein expression is increasing (for example, CPT1)? Furthermore, it is unclear what these data mean functionally. Is this reflective of enhanced lipid oxidation or simply a response to inhibition of fatty acid oxidation? Clarification of the impact of these findings is necessary.
      6. The increase in lipid droplet number induced by MCL-1 inhibition has been previously documented, but it is unclear whether this increase is related to an inability to oxidize lipid (defective fatty acid oxidation) that leads to increases in the cellular abundance or whether this indicates that MCL-1 inhibition leads to enhanced storage. Do other inhibitors of fatty acid oxidation lead to similar increases in lipid droplet size and abundance? Does QVD inhibition affect this phenotype?
      7. For Figure 6, while these data may be very meaningful, as presented they are very hard to appreciate. Insets that show the neuronal populations would help to convey the point that the differentiation is impacted. Also, are there other methods that could confirm these observations (e.g., qPCR to show changes in differentiation)?
      8. Figure 7 is also very hard to appreciate. What is the reader to see? Can these be quantified? It seems that QVD may be rescuing in this figure. Does this suggest that MCL-1 inhibition might be inducing death? All of this needs to be quantified.
      9. BCL-XL has also been implicated in affecting mitochondrial electron transport chain function (See PMID: 19255249, 21926988, 21987637). Can BCL-XL inhibitors affect any of the phenotypes associated here?
      10. Please carefully avoid using the term "MCL-1 loss" when talking about pharmacological inhibition. Only genetic ablation (e.g. knockout, silencing, etc.) should be termed loss.

      Significance

      The study extends the known impacts of MCL-1 inhibition to human cells. The authors replicate many impacts previously observed in mouse systems and refine analyses to impacts on the MICOS complex, lipid droplet storage, and neuronal differentiation. While these findings are important and would be well received by a wide audience, the study provides almost no mechanistic insight into how these phenotypes are being induced. The only common theme is that blocking caspase activation in many assays fails to block the phenotype.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      Point

      Summary

      Response

      1.1

      Overall, the study lacks well-controlled experiments comparing hypoxia induced by DMOG with hypoxia induced by 1% O₂ for assessing ERα occupancy throughout.

      To assess whether DMOG-induced changes in ERα occupancy reflect bona fide hypoxia, we measured ERα binding by ChIP-qPCR under 1% oxygen over 48 hours, compared to normoxic (21% oxygen) cells and input controls in matched cells at the GREB1 and TFF1 loci. Our findings demonstrate that 1% oxygen treatment recapitulates the ERα binding changes observed with DMOG, at the time points of our RNA-seq experiments.

      We have included these results in Figure 1F of the preliminary revision of the manuscript.

      1.2

      Lack of evidence for the impact of other co-transcription factors, such as HIFs, under hypoxia in Fig. 1.

      We thank the reviewer for this comment. We have clarified that motif enrichment analysis is included to characterise the sequence context of ERα binding sites and to confirm enrichment of known ER-associated motifs (e.g. EREs), rather than to infer functional involvement of additional transcription factors under hypoxia. Corresponding interpretative statements have been removed from the Results and restricted to the Discussion.

      1.3

      Lack of evidence that DMOG induces HIF protein expression in MCF7 cells.

      To confirm that DMOG induces HIF protein expression, we have analysed HIF1α and HIF2α protein levels by western blot. We have included these in Supplementary Figure S1A within the preliminary revision to address this concern.

      1.4

      Figure 1: ATAC-seq was performed under 1% O₂, whereas ChIP-seq was conducted with DMOG treatment, making these conditions not directly comparable.

      We acknowledge that the ERα ChIP-seq (DMOG) and ATAC-seq datasets were generated under different conditions and are therefore not directly comparable. To address this, we have performed ChIP-qPCR under bona fide hypoxia (1% oxygen) at canonical ERα target loci (TFF1 and GREB1), demonstrating that the directionality of ERα binding changes observed with DMOG is recapitulated under physiological hypoxia. These data provide a direct comparison of ERα occupancy across conditions and support the use of DMOG as a proxy for hypoxia in our ChIP-seq experiments.

      If requested, we are willing to perform ATAC-seq at 16 h under 1% oxygen. However, because the original dataset was generated under 0.1% oxygen, and canonical ERα-bound sites show minimal accessibility changes under severe hypoxia, we anticipate limited additional insight from repeating this experiment.

      1.5a

      Figure S1: ERα ChIP lacks estradiol (E2) treatment in MCF7 cells with or without DMOG.

      The statement that the ERα ChIP samples lack estrogen treatment is incorrect. Estradiol was not an experimental variable and cells were intentionally maintained under estrogen-rich conditions to preserve tumour-relevant ERα activity.

      We have now clarified this within the preliminary revision by stating in the methods section that cells were routinely cultured in “estrogen-rich Dulbecco’s Modified Eagle Medium”, and by clarifying the use of estrogen-rich conditions in the Figure S1 legend.

      1.5b

      The single-gene examples of DMOG effects shown in Fig. S1A are not significant.

      The peak illustrated in Figure S1A (now Figure S1D) is intended to provide a visual confirmation of peak calling and enrichment patterns underlying the genome-wide redistribution observed in Figure 1. The peak was called by the MACS2 pipeline (code available from https://doi.org/10.5281/zenodo.17221105) with a log10(q-value) = 268.5, which passes the MACS2 cut-off q

      1.6a

      Fig. S2 lacks 1% O₂ conditions.

      We wish to clarify that Figure S2 (now Figure S4) serves as quality control specifically for the DMOG-treated ChIP-seq dataset presented in Figure 1C. The purpose of the plot is to visualize unfiltered motif enrichment to confirm that the identified peaks represent bona fide ERα binding events within the DMOG condition. Motif enrichment under a 1% oxygen environment would not provide this validation. In all cases the ERE is the most significantly enriched motif.

      With respect to ERα binding under 1% oxygen, we have now assessed this via targeted ChIP-qPCR validation (Figure 1F).

      1.6b

      Fig. S3 lacks DMOG-induced HIF factor assessments.

      The DMOG-induced changes in HIF1α and HIF2α expression are shown in Figure S1 of this revision proposal and have been incorporated into the manuscript as part of the changes described in response 1.3.

      1.7a

      Figure S4: Estradiol (E2) treatment is missing from the controls, and the figure labeling is of poor quality.

      We have substantially improved the labelling of Figure S4, now Figure S6.

      Additionally, we have clarified that all samples were cultured in estrogen-rich media and treated with either vehicle control or 100 nM fulvestrant; thus estrogen is present in all conditions including the controls.

      1.7b

      Hypoxic conditions for assessing ER status and appropriate controls are also lacking.

      We agree that monitoring ERα stability under hypoxic conditions is essential.

      We provided a western blot assessment of ERα protein levels at 0, 8 and 48 hours of treatment with 1% oxygen or DMOG, compared to normoxic controls, included as Supplementary Figures S1B, C in the preliminary revision.

      These demonstrate the cells remain positive for ERα protein expression at 0, 8 and 48h.

      1.8

      Figure S5: The description of fulvestrant treatments under hypoxic conditions is unclear.

      We thank the reviewer for this comment. To clarify the experimental design, we now signpost the reader in the figure legend of Figure S5 (now S7) to the schematic diagram provided in Figure 3B, and provide a summary stating that the experiment employed a factorial design combining a 96-hour fulvestrant treatment with exposure to 1% oxygen for the final 48 hours.

      1.9

      Supplemental legends: These require major revision; they are of poor quality and lack statistical details and references to biological replicates.

      We have extensively revised all supplementary figure legends to ensure clarity and precision.

      1.10

      Overall comparisons throughout the manuscript are weak; the figures appear sloppy and lack sufficient effort in presentation.

      Following this comment, we carefully reviewed the presentation of all figures throughout the manuscript. We improved the organisation and labelling of the Supplementary Figures to facilitate clearer comparison of the data. In particular, full western blots are now clearly annotated and supplementary legends have been expanded to provide sufficient context for each figure to be interpreted independently.

      1.11

      i) In general, the manuscript in its present form does not greatly advance on published work, as the ERα cistrome is well documented and has been studied for its role in regulating gene expression, particularly in ERα-positive breast cancer.

      ii) Additionally, the lack of a thorough comparison between DMOG- and/or 1% oxygen-induced hypoxia in the MCF7 ER+ model diminished initial interest in the manuscript.

      iii) The lack of consideration of estradiol exposure under hypoxic conditions with either 1% oxygen and/or DMOG also limits relevance to patients with ER+ BrCa.

      iv) The ERα epigenomic profile has been extensively studied including work under hypoxic conditions.

      i) We respectfully disagree that the manuscript does not extend prior work. Despite extensive characterisation of ERα, its role in shaping hypoxia-driven transcription in ER+ breast cancer has not been defined. Here, we identify an ERα-dependent hypoxic response (EDHR), demonstrating a reciprocal interaction between hypoxia and ERα activity.

      ii) In revision, we address concerns regarding DMOG by validating ERα binding under 1% oxygen using ChIP-qPCR thereby confirming our result in bona fide hypoxia. Additionally, all RNA-seq and functional assays, including ENaC targeting, were performed under 1% oxygen in the original manuscript.

      iii) All experiments were conducted under estrogen-complete conditions, now explicitly clarified, reflecting tumour-relevant ERα activity.

      iv) Together, these data establish a reciprocal interaction between ERα and hypoxia and uncover a targetable vulnerability in hypoxic ER+ breast cancer, linking transcriptional regulation to therapeutic opportunity.

      Reviewer 2

      No.

      Summary

      Response

      General Comments

      2.1

      ENaC is proposed as a therapeutic vulnerability based on amiloride sensitivity assays. Additional experiments are required, such as western blot validation of ENaC regulation under hypoxia and loss-of-function approaches to assess its contribution to the phenotype.

      We agree that further validation of ENaC involvement would strengthen this observation. We will assess ENaC protein levels under 1% hypoxia ± fulvestrant by western blot and perform siRNA-mediated depletion of ENaC subunits to test their contribution to the hypoxia-specific amiloride-sensitive phenotype by viability assay (see also response 3.3).

      2.2

      Fulvestrant is used to dissect ERα dependency. However, as a SERD, it may alter chromatin and transcription independently of a simple loss of ERα. Additional controls would strengthen the interpretation.

      The experimental design already controls for potential fulvestrant-specific transcriptional effects, as all four conditions (± hypoxia, ± fulvestrant) were included. EDHR genes were defined based on induction under hypoxia, loss of this induction following ERα degradation, and absence of residual hypoxic induction in the presence of fulvestrant. Consistent with this, SCNN1B and SCNN1G do not show significant fulvestrant-responsive changes under normoxia (Figure 5C,D).

      We also note that fulvestrant has been shown to induce minimal global chromatin remodelling (Guan et al., 2019), supporting its use to assess ERα dependency without broadly confounding chromatin accessibility; this reference is now included in the manuscript.

      2.3

      The molecular mechanism by which ERα modulates the hypoxic transcriptome, specifically how the ERα and HIF pathways converge at ENaC loci, should be studied further.

      We further examined the potential convergence of ERα and hypoxic signalling at the ENaC loci (included as Figure 5E in the revision proposal), showing genome browser views of the SCNN1G and SCNN1B loci and highlighting hypoxia-induced HIF1α binding and ERα association at these sites.

      To further support this, we will perform RT-qPCR validation of SCNN1G and SCNN1B expression following treatment ± IOX5 and ± fulvestrant. IOX5 is a selective PHD inhibitor that stabilises HIF proteins, enabling us to assess the contribution of HIF signalling independently of other oxygen-dependent effects associated with hypoxia.

      2.4

      In addition, to assess the relevance of this work for luminal breast cancer and ERα expression, specific validation in TNBC should be performed.

      To assess the clinical relevance of SCNN1B and SCNN1G in ER-positive and ER-negative subgroups, we performed Cox proportional hazards analyses in TCGA and METABRIC cohorts individually, including ER status and stratifying by ER-positive and ER-negative cases (Figure 6C). These analyses support the association of SCNN1G with poorer relapse-free survival specifically in ER-positive patients.

      2.5

      The authors should provide RT-qPCR validation of the key EDHR genes, especially since this signature is later used for downstream analyses.

      We agree that independent validation would strengthen these findings. We will perform RT-qPCR validation of key EDHR genes (including SCNN1B and SCNN1G) under ± hypoxia and ± fulvestrant conditions to confirm ERα-dependent hypoxic induction.

      Limitations

      2.6

      Reprogramming of the ERα cistrome under cellular stress is well documented. The study extends these ideas but does not clearly demonstrate a new mechanistic paradigm, particularly because the EDHR is defined primarily through omics approaches without strong mechanistic validation. In addition, we have to keep in mind that the study uses DMOG to model hypoxia-driven chromatin changes, but DMOG inhibits many 2-oxoglutarate-dependent dioxygenases non-selectively.

      This makes it difficult to attribute ERα cistrome reprogramming specifically to hypoxia, rather than to broad off-target effects. The transcriptomic dataset is more convincing but needs the validation suggested previously.

      While ERα cistrome reprogramming has been described, our study demonstrates a reciprocal interaction in which ERα not only responds to hypoxia but actively shapes hypoxia-driven transcription, defining an ERα-dependent hypoxic response (EDHR).

      We acknowledge the limitations of DMOG and have addressed this by validating key ERα binding events under bona fide hypoxia (1% oxygen) using ChIP–qPCR, confirming our findings under physiological conditions (response 1.1).

      To further strengthen mechanistic insight, we will assess the requirement for HIF stabilisation using the selective PHD inhibitor IOX5, combined with RT-qPCR analysis of SCNN1G and SCNN1B ± IOX5 ± fulvestrant (responses 2.3 and 2.5). In addition, we will validate the functional relevance of ENaC through protein-level analysis and siRNA-mediated depletion, as described in response 2.1.

      Together, these additions address concerns regarding DMOG specificity and provide further support for a functional interaction between ERα and hypoxic signalling.

      Audience

      2.7

      Given its reliance on omics datasets and preliminary functional assays, the paper will likely appeal to a specialized audience in transcriptional regulation, hypoxia signalling, and ER+ breast cancer biology. However, the limited mechanistic depth and uncertain translational relevance, due to the lack of in vivo validation, may reduce its impact for broader oncology or therapeutic-development audiences. Without stronger validation, the findings may be perceived as niche and mainly of interest to researchers focused on ERα chromatin dynamics rather than to the wider cancer research community.

      The study incorporates multiple layers of human relevance, including spatial transcriptomic analyses demonstrating enrichment of EDHR within hypoxic tumour regions and survival analyses linking EDHR and ENaC expression to clinical outcome.

      In revision, we address the reviewer’s concerns through targeted validation (ChIP-qPCR in hypoxia, western blotting, and RT–qPCR). Together, these additions strengthen the mechanistic and translational relevance of the study.

      Reviewer 3

      No.

      Summary

      Response

      Major comments

      3.1

      The DMOG ChIP-seq provides a valuable first look at ERα redistribution. Since DMOG inhibits both HIF hydroxylases and oxygen-dependent demethylases, the driver of the observed changes remains ambiguous. It would help to include either ERα ChIP-seq under bona fide hypoxia or a selective PHD inhibitor condition (for example IOX5, as you discuss) to separate HIF stabilisation from broad demethylase inhibition. If ChIP-seq is not feasible, a brief ATAC validation at a small panel of gained and lost loci would still increase confidence.

      We acknowledge that mimetics of hypoxia can introduce off-target effects. To address this, we have validated our ERα ChIP-seq findings using ChIP-qPCR at representative loci (TFF1 and GREB1), demonstrating consistent changes in ERα binding under bona fide hypoxia (1% oxygen) (now included in Figure 1F).

      As acknowledged by the reviewer, ChIP-seq under these conditions is likely not feasible due to cell number constraints. We are willing to undertake ATAC-seq if required (as stated in response 1.1); however, we do not feel it would directly address ERα occupancy at these loci. We therefore consider our targeted ChIP-qPCR to be the most appropriate approach to validate ERα redistribution under hypoxia.

      3.2a

      The factorial RNA-seq is well designed and the attenuation analyses are clear. The EDHR selection is stringent and reproducible across two ER+ lines.

      To support the claim of ERα dependence mechanistically, a small number of targeted perturbations would go far. For example,

      i) confirm EDHR induction for SCNN1B and SCNN1G in hypoxia with and without fulvestrant by RT-qPCR

      We agree that targeted validation would strengthen the mechanistic support for ERα dependence. We will perform RT-qPCR validation of SCNN1B and SCNN1G under hypoxia ± fulvestrant to confirm ERα-dependent hypoxic induction (see also response 2.5).
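      As an illustration only, the planned RT-qPCR readout could be quantified along the following lines. This is a minimal sketch that assumes the widely used 2^-ΔΔCt approach; the response does not state the quantification method, and the reference gene and all Ct values below are hypothetical placeholders rather than data.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target gene versus a control condition (2^-ddCt).

    Assumption: roughly 100% amplification efficiency for both the target
    and the reference gene, which is what the 2^-ddCt shortcut requires.
    """
    delta_ct_sample = ct_target - ct_reference             # normalise to reference gene
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl  # same for the control condition
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values (not measurements) for one target gene.
normoxia_vehicle = {"target": 30.0, "reference": 18.0}
hypoxia_vehicle = {"target": 27.0, "reference": 18.0}
hypoxia_fulvestrant = {"target": 29.0, "reference": 18.0}

fold_hypoxia = relative_expression(
    hypoxia_vehicle["target"], hypoxia_vehicle["reference"],
    normoxia_vehicle["target"], normoxia_vehicle["reference"],
)
fold_hypoxia_fulv = relative_expression(
    hypoxia_fulvestrant["target"], hypoxia_fulvestrant["reference"],
    normoxia_vehicle["target"], normoxia_vehicle["reference"],
)
print(fold_hypoxia, fold_hypoxia_fulv)
```

      In this toy example, a smaller fold change with fulvestrant than with vehicle is the pattern that would be consistent with ERα-dependent hypoxic induction.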

      3.2b

      ii) test whether short-term ERα knockdown reproduces the effect.

      ERα dependency is already assessed through fulvestrant-mediated degradation within the factorial design, which provides a well-established and direct approach to evaluate ERα function. As EDHR genes are defined by loss of hypoxic induction following ERα degradation, this constitutes a robust assessment of ERα-dependent effects.

      We will therefore focus on orthogonal validation through RT-qPCR (response 2.5), together with additional mechanistic and functional analyses using IOX5 and ENaC perturbation (responses 2.1 and 2.3), rather than introducing an ERα knockdown approach, although we would consider this if required.
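      The logic of defining EDHR genes by loss of hypoxic induction after ERα degradation can be sketched as an interaction contrast in the 2 × 2 factorial design. The snippet below is a simplified illustration, not the pipeline used in the manuscript: the condition names, the log2 expression values and the helper functions are hypothetical, and a real analysis would fit this contrast within a dedicated differential-expression model with appropriate statistics.

```python
from statistics import mean


def hypoxic_induction(hypoxia, normoxia):
    """Mean log2 fold change of hypoxia over normoxia (inputs are log2 values)."""
    return mean(hypoxia) - mean(normoxia)


def attenuation(log2_expr):
    """Return hypoxic induction with vehicle, with fulvestrant, and their difference.

    The difference is the hypoxia x fulvestrant interaction contrast: a large
    positive value means the hypoxic induction is lost when ERα is degraded.
    """
    lfc_vehicle = hypoxic_induction(
        log2_expr["hypoxia_vehicle"], log2_expr["normoxia_vehicle"]
    )
    lfc_fulvestrant = hypoxic_induction(
        log2_expr["hypoxia_fulvestrant"], log2_expr["normoxia_fulvestrant"]
    )
    return lfc_vehicle, lfc_fulvestrant, lfc_vehicle - lfc_fulvestrant


# Hypothetical log2 expression values, n = 3 per condition (placeholders, not data).
example_gene = {
    "normoxia_vehicle": [5.0, 5.5, 4.5],
    "hypoxia_vehicle": [8.0, 8.5, 7.5],
    "normoxia_fulvestrant": [5.0, 4.5, 5.5],
    "hypoxia_fulvestrant": [5.5, 5.0, 6.0],
}

lfc_vehicle, lfc_fulvestrant, interaction = attenuation(example_gene)
print(lfc_vehicle, lfc_fulvestrant, interaction)
```

      Here the gene is induced by hypoxia with vehicle but barely induced once ERα is degraded, which is the attenuation pattern the EDHR definition relies on.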

      3.2c

      iii) A complementary test with a HIF-1α or HIF-2α knockdown at one time point would help position EDHR relative to HIF.

      This request aligns with point 2.3, which addresses the convergence of ERα and HIF signalling. While HIF knockdown under hypoxia would assess necessity, we will instead assess the contribution of HIF signalling using the selective PHD inhibitor IOX5, as this allows us to isolate HIF stabilisation from broader hypoxia-associated effects and avoids additional perturbation associated with transfection-based approaches. We will perform RT-qPCR analysis of SCNN1B and SCNN1G following treatment ± IOX5 ± fulvestrant to determine whether HIF stabilisation is sufficient to support ERα-dependent induction of EDHR genes.

      3.3

      The amiloride result is intriguing and consistent with a hypoxia-specific dependency. Because amiloride is pleiotropic, it would strengthen the conclusion to add one genetic and one pharmacological specificity control. A brief SCNN1B or SCNN1G knockdown in hypoxia should phenocopy the viability effect if ENaC contributes. In parallel, testing benzamil at sub-micromolar doses would provide a more ENaC-selective pharmacological readout. These can be performed in MCF7 and, resources permitting, in T47D.

      To address the reviewer’s concern regarding pleiotropic effects, we propose (aligning with our response to 2.1) to apply siRNA-mediated knockdown of SCNN1B and SCNN1G under hypoxia to determine whether this reproduces our observed viability effect, thereby providing direct evidence for ENaC involvement.

      We agree that additional pharmacological validation could further support specificity, and would consider inclusion of a more ENaC-selective inhibitor if required.

      3.4

      The RFS associations for SCNN1B and SCNN1G are compelling. It would be helpful to report whether the associations persist in a multivariable model that at least includes ER status, grade and nodal status where available, or to state clearly when this is not possible across merged datasets. Even a sensitivity analysis in TCGA with ER+ cases only would contextualise the hazard ratios.

      We have analysed TCGA and METABRIC cohorts individually using Cox proportional hazards models, as this functionality is not available for merged datasets in KMplot. ER status was included in the models, and analyses were additionally stratified by ER-positive and ER-negative subgroups. The number of relapse events per subgroup is approximately 40; therefore, additional covariates such as grade and nodal status were not included given the limited number of events per model.

      Within ER-positive patients, high SCNN1G expression is associated with poorer relapse-free survival (TCGA HR 1.45, p = 0.0027), while SCNN1B shows a similar trend that does not reach statistical significance. These analyses are presented in Figure 6C and in the results section of the preliminary revision, and support the findings from the Kaplan–Meier analysis.

      3.5

      The spatial association of EDHR with EMT hotspots is a nice piece of the story. A short clarification of how spot-level cell type composition was handled will help readers interpret proximity results. If cell type deconvolution scores are available in the source dataset, adding a sentence on whether EDHR enrichment tracks tumour epithelial content would be useful.

      Spatial cell type composition and spot annotations were used as provided in the SpottedPy dataset, based on Cell2location-derived deconvolution scores and STARCH tumour annotations, without additional re-estimation.

      To address the reviewer’s suggestion, we examined the relationship between EDHR enrichment and epithelial content and observed no significant correlation at the neighbourhood level.

      These points have now been clarified in the manuscript.

      3.6

      Data processing for ChIP-seq and RNA-seq is documented and accessions are provided. The RNA-seq includes n=3 per condition, which is appropriate, and the correlation and LFC analyses are clearly presented. For the amiloride assay, the two-way ANOVA with interaction is appropriate; please add the exact n and whether experiments were independently repeated, and include the underlying values in a source table for transparency. These are small presentational edits rather than new experiments.

      In the preliminary revision we have added a statement to the amiloride assay figure (Figure 6D) clarifying that n = 3 independent biological replicates were performed per condition. In addition, we now provide the underlying numerical values for this assay in Table S11.

      3.7

      A small, hypothesis-driven mechanistic link from EDHR to ENaC function would substantially elevate impact without becoming a long project. For example, testing whether hypoxia increases amiloride-sensitive Na⁺ current in MCF7 and whether fulvestrant abrogates that increase would directly connect the transcriptional and functional observations. If available, patch-clamp or a simple SBFI-based Na⁺ imaging readout could suffice.

      We agree that directly linking EDHR to ENaC channel activity would further strengthen the mechanistic connection. We will prioritise genetic validation of ENaC function through siRNA-mediated depletion (response 2.1), which directly tests the requirement for ENaC in the hypoxia-specific viability phenotype.

      We are willing to explore the feasibility of measuring the amiloride-sensitive Na+ currents under normoxia and acute hypoxia (via perfusion of cells with bathing solution bubbled with nitrogen during recording) ± fulvestrant to further connect hypoxic regulation to channel activity.

      Minor Comments

      3.8

      Please show representative ERα ChIP-seq browser snapshots for at least one gained, one conserved and one lost locus alongside input for both conditions.

      We have now included representative ERα ChIP-seq browser snapshots for gained, conserved, and lost loci, together with input controls for both conditions, in Figure S3 of the revised manuscript.

      3.9

      In Figure 1D, the ATAC-seq comparison uses 0.1% O₂ for 48 h while the RNA-seq uses 1% O₂. Briefly justify the choice and discuss any expected differences.

      We thank the reviewer for this point. The ATAC-seq dataset was generated under 0.1% oxygen in the original study, whereas RNA-seq experiments in this work were performed at 1% oxygen to reflect tumour-relevant hypoxic conditions. The more severe hypoxia used for ATAC-seq would be expected to maximise detection of chromatin accessibility changes. Despite this, chromatin accessibility changes were limited, with ERα binding occurring predominantly at pre-accessible regions. This has now been clarified in the manuscript.

      3.10

      In the Methods for spatial analyses, specify the thresholds for hotspot calling and how the neighbourhood radius was chosen.

      The neighbourhood parameter was set to 8, corresponding to the immediate neighbouring spots in Visium data, consistent with package guidance. We have clarified this in the manuscript text.

      3.11

      For the EDHR heatmap, consider marking the 14 consensus genes and indicating which belong to the ENaC module to aid readability.

      We have marked the 14 EDHR consensus genes and indicated the ENaC module in the revised heatmap to aid readability.

      3.12

      Please report exact sample sizes and replicate numbers in all figure legends and provide a single table with all statistical tests, n, and p values.

      We have reported exact sample sizes and replicate numbers in all relevant figure legends and included Table S11 summarising all statistical tests, sample sizes (n), and p values.

      3.13

      A schematic summarising the experimental timelines for ChIP-seq, RNA-seq and viability would help orient readers.

      We have added timelines for these experiments as requested.

      3.14

      Minor copyedits: consistent formatting of O₂, gene symbols and reagent catalogue numbers.

      We have standardised oxygen notation throughout the manuscript to use “oxygen” in the main text and “O2” where appropriate (e.g. figures).

      Reagent catalogue numbers have now been standardised for consistency of presentation in the revised manuscript.

      Gene and protein nomenclature were already formatted according to accepted conventions and were verified for consistency.

      3.15

      The manuscript is well referenced. Where you contrast your findings with long-term CoCl₂ hypoxia, a sentence on why acute DMOG and short-term 1% O₂ may reveal different ERα behaviours would help position the novelty.

      We thank the reviewer for this suggestion. We have expanded the manuscript to clarify that acute hypoxia (1% oxygen) and DMOG treatment capture early, dynamic hypoxic responses, in contrast to chronic CoCl2 exposure, which reflects longer-term adaptation. This distinction is relevant to tumour biology, where hypoxia is often transient due to unstable vascularisation. The following statement has been added to the manuscript:

      “In addition to such chronic hypoxic adaptation, tumour hypoxia can also be dynamic, with cells experiencing acute or transient hypoxic exposure due to unstable vascularisation; an established contributor to tumour progression (Liu et al, 2022a; Koh & Powis, 2012). Thus, in contexts where both signalling pathways remain active, the dependence of the hypoxic response on ERα in ER+ cells has not been previously characterised.”

      Primary Limitations

      3.16

      DMOG vs hypoxia in the cistrome experiment,

      To address concerns regarding the use of DMOG, we have validated key ERα binding events from the ChIP-seq dataset by ChIP–qPCR at the TFF1 and GREB1 loci under bona fide hypoxia (1% oxygen) in biological triplicate (Figure 1F). These data demonstrate consistent changes in ERα binding under hypoxia, supporting that the DMOG-induced redistribution reflects hypoxia-driven changes.

      3.17

      the absence of direct HIF or cofactor perturbations

      We acknowledge the absence of direct HIF perturbation. To address this, we will assess the contribution of HIF signalling through stabilisation approaches, including RT-qPCR analysis of SCNN1B and SCNN1G ± IOX5 ± fulvestrant (response 3.2), to determine whether HIF activation is sufficient to support ERα-dependent induction.

      3.18

      and the pleiotropy of amiloride.

      To address the potential pleiotropy of amiloride, we will perform siRNA-mediated knockdown of SCNN1G and SCNN1B to provide independent validation of ENaC-dependent effects (response 3.3).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This study explores how hypoxia reshapes ERα signalling in ER-positive breast cancer and whether this cross-talk exposes targetable vulnerabilities. The authors first map ERα binding in MCF7 cells after dioxygenase inhibition with DMOG and observe a genome-wide redistribution with enrichment of ERE, FOXA1 and AP-1 motifs at gained sites while chromatin accessibility at these loci appears unchanged in public ATAC-seq after hypoxia. They then perform RNA-seq in MCF7 and T47D using a factorial design that combines fulvestrant-mediated ERα degradation with 1% O₂ to define an ERα-dependent hypoxia response (EDHR). A 14-gene consensus EDHR signature includes ENaC regulatory subunits SCNN1B and SCNN1G, whose higher expression is associated with poorer RFS in ER+ cohorts. Functionally, amiloride increases viability in normoxia but reduces viability under hypoxia in MCF7 across a dose range. Spatial transcriptomics from ER+ tumours shows EDHR expression enriched at the margins of hypoxia and estrogen-hallmark regions and adjacent to EMT hotspots. Raw data and code availability are stated for the central datasets and accessions are provided. Together the results argue that ERα helps organise a distinct hypoxic programme and suggest a context-specific sensitivity to ENaC inhibition.

      Major comments

      The paper addresses a timely question with a clear narrative arc and brings together ChIP-seq, RNA-seq, pharmacology, survival analysis and spatial transcriptomics. The EDHR concept is interesting and the ENaC angle is original. The work is already strong and with a few targeted additions and clarifications it can be made more persuasive without becoming a new project.

      1) The DMOG ChIP-seq provides a valuable first look at ERα redistribution. Since DMOG inhibits both HIF hydroxylases and oxygen-dependent demethylases, the driver of the observed changes remains ambiguous. It would help to include either ERα ChIP-seq under bona fide hypoxia or a selective PHD inhibitor condition (for example IOX5, as you discuss) to separate HIF stabilisation from broad demethylase inhibition. If ChIP-seq is not feasible, a brief ATAC validation at a small panel of gained and lost loci would still increase confidence. Estimated time: 6-8 weeks for a focused follow up with two conditions and biological duplicates/triplicates.

      2) The factorial RNA-seq is well designed and the attenuation analyses are clear. The EDHR selection is stringent and reproducible across two ER+ lines. To support the claim of ERα dependence mechanistically, a small number of targeted perturbations would go far. For example, confirm EDHR induction for SCNN1B and SCNN1G in hypoxia with and without fulvestrant by RT-qPCR and test whether short-term ERα knockdown reproduces the effect. A complementary test with a HIF-1α or HIF-2α knockdown at one time point would help position EDHR relative to HIF. Estimated time: 3-4 weeks for qPCR and siRNA validations.

      3) The amiloride result is intriguing and consistent with a hypoxia-specific dependency. Because amiloride is pleiotropic, it would strengthen the conclusion to add one genetic and one pharmacological specificity control. A brief SCNN1B or SCNN1G knockdown in hypoxia should phenocopy the viability effect if ENaC contributes. In parallel, testing benzamil at sub-micromolar doses would provide a more ENaC-selective pharmacological readout. These can be performed in MCF7 and, resources permitting, in T47D. Estimated time: 4-6 weeks.

      4) The RFS associations for SCNN1B and SCNN1G are compelling. It would be helpful to report whether the associations persist in a multivariable model that at least includes ER status, grade and nodal status where available, or to state clearly when this is not possible across merged datasets. Even a sensitivity analysis in TCGA with ER+ cases only would contextualise the hazard ratios. Estimated time: 1-2 weeks.

      5) The spatial association of EDHR with EMT hotspots is a nice piece of the story. A short clarification of how spot-level cell type composition was handled will help readers interpret proximity results. If cell type deconvolution scores are available in the source dataset, adding a sentence on whether EDHR enrichment tracks tumour epithelial content would be useful. Estimated time: 1 week.

      Reproducibility and statistics

      Data processing for ChIP-seq and RNA-seq is documented and accessions are provided. The RNA-seq includes n=3 per condition, which is appropriate, and the correlation and LFC analyses are clearly presented. For the amiloride assay, the two-way ANOVA with interaction is appropriate; please add the exact n and whether experiments were independently repeated, and include the underlying values in a source table for transparency. These are small presentational edits rather than new experiments.

      Optional

      A small, hypothesis-driven mechanistic link from EDHR to ENaC function would substantially elevate impact without becoming a long project. For example, testing whether hypoxia increases amiloride-sensitive Na⁺ current in MCF7 and whether fulvestrant abrogates that increase would directly connect the transcriptional and functional observations. If available, patch-clamp or a simple SBFI-based Na⁺ imaging readout could suffice. Estimated time: 6-8 weeks.

      Minor comments

      1. Please show representative ERα ChIP-seq browser snapshots for at least one gained, one conserved and one lost locus alongside input for both conditions.
      2. In Figure 1D, the ATAC-seq comparison uses 0.1% O₂ for 48 h while the RNA-seq uses 1% O₂. Briefly justify the choice and discuss any expected differences.
      3. In the Methods for spatial analyses, specify the thresholds for hotspot calling and how the neighbourhood radius was chosen.
      4. For the EDHR heatmap, consider marking the 14 consensus genes and indicating which belong to the ENaC module to aid readability.
      5. Please report exact sample sizes and replicate numbers in all figure legends and provide a single table with all statistical tests, n, and p values.
      6. A schematic summarising the experimental timelines for ChIP-seq, RNA-seq and viability would help orient readers.
      7. Minor copyedits: consistent formatting of O₂, gene symbols and reagent catalogue numbers.

      Prior studies

      The manuscript is well referenced. Where you contrast your findings with long-term CoCl₂ hypoxia, a sentence on why acute DMOG and short-term 1% O₂ may reveal different ERα behaviours would help position the novelty.

      Significance

      General assessment

      The strongest aspects are the carefully designed factorial RNA-seq that cleanly separates ERα and hypoxia effects, the discovery of a concise EDHR signature reproducible across two ER+ lines, and the integration with spatial transcriptomics that places EDHR near EMT-rich tumour regions. The ENaC connection is new and potentially actionable, and the context-dependent amiloride response is a practical lead. Limitations are primarily mechanistic: DMOG vs hypoxia in the cistrome experiment, the absence of direct HIF or cofactor perturbations, and the pleiotropy of amiloride.

      Advance

      To my knowledge, this is the first description of a distinct ERα-dependent hypoxic programme in ER+ breast cancer that includes ENaC regulatory subunits and links to an EMT-adjacent spatial niche. The conceptual advance is the positioning of ERα as a coordinator of a subset of hypoxia-induced genes rather than as a parallel pathway, together with an initial functional readout that suggests a therapeutic angle through ENaC modulation. With the targeted additions outlined above, the study would move from strong association to a more mechanistic and translationally relevant model.

      Audience

      The work will interest a specialised audience in nuclear receptor biology, hypoxia signalling, tumour microenvironment, and ion transport in cancer. It has potential relevance for basic researchers studying ERα cistrome dynamics, for groups using spatial transcriptomics to define micro-niches, and for translational researchers exploring metabolic and ionic vulnerabilities in ER+ disease.

      Expertise disclosure

      Keywords: nuclear receptors, chromatin profiling, transcriptomics, spatial transcriptomics, breast cancer biology.

      I am not a domain expert in ion channel electrophysiology; my comments on ENaC pharmacology focus on specificity and study design rather than detailed channel biophysics.

      Tone

      I find the paper well conceived and already compelling. The suggested experiments are focused, realistic in scope, and primarily aim to turn several strong associations into concise mechanistic statements that would further increase confidence and impact.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      ERα drives most luminal breast cancers. However, how hypoxia reshapes ERα activity and how ERα itself might influence the hypoxic response remain unclear. Understanding this interaction is crucial, as hypoxia is strongly linked to endocrine resistance and poor outcomes. In this study, the authors investigated how hypoxia modifies ERα signalling in ER+ breast cancer and whether ERα contributes to the transcriptional response to low oxygen. Using MCF7 and T47D cells, they combined genome-wide profiling of the ERα cistrome under DMOG, hypoxic transcriptomics with or without ERα degradation, and spatial transcriptomics in tumours. This revealed an ERα-dependent hypoxic response (EDHR), prominently involving regulation of epithelial sodium channel (ENaC) subunits, whose expression requires both hypoxia and active ERα signalling. Functionally, ENaC inhibition with amiloride reduced cell viability under hypoxia. Together, these findings uncover a previously unrecognised ERα-dependent layer of the hypoxic transcriptome and identify ENaC as a potential therapeutic vulnerability in hypoxic ER+ breast cancer. Although the study is interesting, the manuscript lacks several essential functional and experimental validations. ENaC is proposed as a therapeutic vulnerability based on amiloride sensitivity assays. Additional experiments are required, such as western blot validation of ENaC regulation under hypoxia and loss-of-function approaches to assess its contribution to the phenotype. Fulvestrant is used to dissect ERα dependency. However, as a SERD, it may alter chromatin and transcription independently of a simple loss of ERα. Additional controls would strengthen interpretation. The molecular mechanism by which ERα modulates the hypoxic transcriptome, specifically how the ERα and HIF pathways converge at ENaC loci, should be studied further. In addition, to assess the relevance of this work for luminal breast cancer and ERα expression, specific validation in TNBC should be performed. Finally, the authors should provide RT-qPCR validation of the key EDHR genes, especially since this signature is later used for downstream analyses.

      Significance

      General assessment strengths:

      This study uncovers a previously unrecognised ERα-dependent hypoxic response in breast cancer, revealing that ERα actively shapes the hypoxic transcriptome rather than functioning as an isolated pathway. To me, the main strength of this work is the identification of ENaC as a novel hypoxia-specific therapeutic vulnerability in ER+ breast cancer, suggesting that ion-channel regulation may play a broader and underappreciated role in endocrine resistance.

      Limitation:

      Reprogramming of the ERα cistrome under cellular stress is well documented. The study extends these ideas but does not clearly demonstrate a new mechanistic paradigm, particularly because the EDHR is defined primarily through omics approaches without strong mechanistic validation. In addition, we have to keep in mind that the study uses DMOG to model hypoxia-driven chromatin changes, but DMOG inhibits many 2-oxoglutarate-dependent dioxygenases non-selectively. This makes it difficult to attribute ERα cistrome reprogramming specifically to hypoxia, rather than to broad off-target effects. The transcriptomic dataset is more convincing but needs the validation suggested previously.

      Audience:

      Given its reliance on omics datasets and preliminary functional assays, the paper will likely appeal to a specialized audience in transcriptional regulation, hypoxia signalling, and ER+ breast cancer biology. However, the limited mechanistic depth and uncertain translational relevance, due to the lack of in vivo validation, may reduce its impact for broader oncology or therapeutic-development audiences. Without stronger validation, the findings may be perceived as niche and mainly of interest to researchers focused on ERα chromatin dynamics rather than to the wider cancer research community.

      Expertise:

      My evaluation is based on my background in breast cancer, ERα signaling and breast tumorigenesis. However, I have limited expertise in spatial transcriptomic analyses and advanced ChIP-seq bioinformatic analyses, which may affect my assessment of some computational analyses.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Malcom et al. present evidence that, under hypoxic conditions, hypoxia-inducible factors (HIFs) alter the estrogen receptor alpha (ERα) epigenomic landscape in a model of estrogen receptor-positive (ER+) breast cancer (BrCa). The response of ER+ BrCa to estradiol (E2) in MCF7 (ER+) cells, as well as ERα signaling in both primary and metastatic breast cancer, has been well studied, and the epigenomic landscape of ERα+ BrCa is well documented. The differentially expressed genes (DEGs) identified under treatment with the hypoxia mimetic dimethyloxalylglycine (DMOG) revealed a subset of ERα-dependent hypoxic response (EDHR) genes. The outcome was a reprogramming of the basal ERα cistrome, coinciding with sites enriched for estrogen response elements (EREs) and co-transcription factor binding motifs for ERα, including FOXA1 and AP-1. This was demonstrated by ERα ChIP-seq (i.e. DMOG) and ATAC-seq (i.e. 1% O2) performed under different hypoxic conditions. The transcripts identified following DMOG treatment were leveraged and compared to publicly available RNA-seq datasets from various breast cancer subtypes exposed to 1% hypoxic oxygen. Although the comparison methods varied, the results suggested that BrCa cell lines under 1% hypoxic oxygen conditions showed strong similarity to MCF7 cells treated with DMOG. Genes upregulated in response to DMOG correlated with poorer survival outcomes. To demonstrate the requirement for ERα in this model, MCF7 cells were treated with the selective estrogen receptor degrader (SERD) fulvestrant, the only FDA-approved SERD for ER+ BrCa, which dampened the HIF response among EDHR genes. This suggests that ERα is necessary for the expression of DEGs under hypoxic conditions induced by DMOG. Finally, the sodium channel protein ENaC subunits (i.e., SCNN1B and SCNN1G) were further characterized as candidate EDHR genes. Analyses of publicly available datasets indicated that high mRNA expression levels of these subunits were associated with worse survival outcomes, supporting the clinical relevance of the EDHR genes SCNN1B and SCNN1G. To further validate clinical relevance, the authors used spatial transcriptomics in a small subset of ER+ BrCa.

      Major:

      1. Overall, the study lacks well-controlled experiments comparing hypoxia induced by DMOG with hypoxia induced by 1% O₂ for assessing ERα occupancy throughout.
      2. Lack of evidence in Fig. 1 for the impact of other co-transcription factors, such as HIFs, under hypoxia.
      3. Lack of evidence that DMOG induces HIF protein expression in MCF7 cells.
      4. Figure 1: ATAC-seq was performed under 1% O₂, whereas ChIP-seq was conducted with DMOG treatment, making these conditions not directly comparable.
      5. Figure S1: ERα ChIP lacks estradiol (E2) treatment in MCF7 cells with or without DMOG. The single-gene examples of DMOG effects shown in Fig. S1A are not significant.
      6. Figures S2 and S3: Fig. S2 lacks 1% O₂ conditions, and Fig. S3 lacks DMOG-induced HIF factor assessments.
      7. Figure S4: Estradiol (E2) treatment is missing from the controls, and the figure labeling is of poor quality. Hypoxic conditions for assessing ER status and appropriate controls are also lacking.
      8. Figure S5: The description of fulvestrant treatments under hypoxic conditions is unclear.
      9. Supplemental legends: These require major revision; they are of poor quality and lack statistical details and references to biological replicates.

      Minor:

      1. Overall comparisons throughout the manuscript are weak; the figures appear sloppy and lack sufficient effort in presentation.

      Significance

      In general, the manuscript in its present form does not add greatly to published work, as the ERα cistrome is well documented and has been well studied for its role in regulating gene expression, particularly in ERα-positive breast cancer. Additionally, the lack of a thorough comparison between DMOG- and/or 1% O2-induced hypoxia in the MCF7 ER+ model diminished initial interest in the manuscript. The failure to consider estradiol exposure under hypoxic conditions with either 1% O2 and/or DMOG also limits relevance to patients with ER+ BrCa. The ERα epigenomic profile has been extensively studied, including work under hypoxic conditions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We sincerely appreciate the constructive and insightful comments on our manuscript.

      Both reviewers raised important concerns regarding our use of the term lysosome-related organelle. We fully acknowledge this criticism and will revise the terminology throughout the manuscript with greater care, referring to these structures as Rab32/Rab38-positive vacuoles where appropriate, and discussing their possible relationship to lysosome-related organelles in the Discussion.

      We believe that the remaining comments can be adequately addressed through additional experiments, including CLEM and three-dimensional reconstruction analyses. We therefore submit this revision plan and hope that it will be viewed favorably.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript employs overexpression and knockdown experiments in an immortalized hepatocyte cell line to probe roles for RAB32 and RAB38 in lipid metabolism by lysosomes or lysosome-related organelles (LROs). Using these approaches, the authors show that both RAB32 and RAB38 colocalize with LAMP1 on late endosomes/lysosomes, that the appearance of enlarged, round lysosomal structures that they refer to as LROs scales with both RAB32 and RAB38 expression, and they provide some evidence to suggest that material from lipid droplets (LDs) is taken up into these large rounded compartments in a manner that requires RAB32 or RAB38. Additional experiments are interpreted to suggest that macroautophagy is not required for this uptake but that PtdIns3-kinase, PtdIns5-kinase, and ESCRT complexes are required. Analyses of Rab32/Rab38 knockout mice show an accumulation of white fat, and in liver an accumulation of what the authors interpret to be lipofuscin. The authors conclude that lipid droplets are consumed by LROs in an autophagy-independent manner.

      Major Comments:

      While the topic of the paper is interesting, the conclusions of the paper are not supported by the data shown. No evidence is presented in this paper that the structures analyzed are actual LROs rather than lysosomes, other than their content of RAB32 and RAB38, which are not limited in expression to LROs. The fact that lipid accumulates in the white fat and not the livers of double knockout mice and that hepatocytes express very little RAB32 and no RAB38 renders the model cell system studied here artifactual; the paper should start with the in vivo analysis and then progress with an appropriate cell type using a line that mimics the behaviour of the endogenous cells. Moreover, the only experiments documenting partial overlap of lipid droplet (LD) material with these structures (interpreted as uptake of LDs) are in cells that massively overexpress LAMP1-mRFP, RAB32, and/or RAB38; in untransduced cells, only a handful of LAMP1-containing structures are enlarged and there is no evidence that they overlap with LD material. Moreover, the only evidence that colocalization is independent of autophagy is that it is blocked by overexpression of a single dominant-negative autophagy component, ATG4B. Finally, the data quantification throughout the paper lacks sufficient power to support the conclusions. Thus, none of the major conclusions from this paper are well supported, and the physiological significance of the observations for liver function is not at all clear. Altogether, the authors present an interesting idea for which the data are unconvincing.

      Below are detailed concerns throughout the paper.

      1. Abstract:

      i. Please explain why there was a reason to look at the involvement of Rab32/38 in hepatic lipid metabolism.

      ii. It seems rather unlikely that microautophagy can result in the engulfment of an entire lipid droplet in toto; is it more sensible to think of this as a means to transfer the contents of LDs, perhaps piece by piece, into lysosomes?

      2. Introduction:

      i. There is a vast literature on the roles of Rab32 and/or Rab38 in the biogenesis of other LROs besides melanosomes, including platelet granules, lamellar bodies in lung epithelial type II cells, and various non-vertebrate structures that should be cited.

      ii. The authors fail to cite the first papers describing roles of Rab32 or Rab38 in bacterial killing by macrophages (Spano et al 2012, PMID: 23162001 and several additional papers from the Galan/ Spano groups), and papers ascribing roles for Rab32 in mitophagy and perhaps other mitochondrial functions, including ER:mitochondrial contacts, prior to the authors' 2025 paper (various papers).

      iii. There have been quite a few papers addressing Rab32/38 effectors in pigment cells (see papers from the Di Pietro group) and other cell types (see Rab32 in mitochondria papers). These facts and at least some of the papers should be cited in the Introduction to better reflect the depth of understanding - and some of the confusion - surrounding Rab32 and Rab38 function.

      iv. Reference to the definition of LROs should also be cited.

      Results:

      3. In all experiments where quantification was done, the number of structures or cells analyzed is listed but not the number of experiments. Were these experiments repeated at least three times, and are the values and statistics calculated from the experiment-to-experiment variation? If not, the statistical values are inaccurate. In all, the number of structures or cells analyzed appears to be quite small.

      4. Figure 1.

      i. How did the authors validate the specificity of the anti-Rab32 and anti-Rab38 antibodies used in Figure 1 and elsewhere? Data should be shown with individual knockdowns. Additionally, the overlap with LAMP1 seems too good to be true (it looks 100% and with similar labeling intensities in all cases) - were controls done to ensure lack of cross-reactivity of the secondary antibodies?

      ii. If anti-Rab32 and -Rab38 actually labeled all LAMP1-positive compartments, it seems likely that these are classical late endosomes/ lysosomes and not lysosome-related organelles. Rab32 is expressed by many cell types that do not harbor traditional LROs and may have more ubiquitous functions. The larger ring-like structures mentioned in the text only appear when Rab32 or Rab38 are overexpressed as GFP fusion proteins (compare Fig. 1A and B with 1C-F, and note that the scale bars are the same) and fail to overlap with smaller structures only when LAMP1-mRFP is overexpressed (compare Fig. 1A and B with S1A); these structures likely represent earlier endosomal intermediates illuminated by LAMP1 overexpression. The authors need to reconsider their interpretation of these data in light of these overexpression artifacts.

      iii. In Fig. 1C-F and Fig. S1, were cells transfected or infected with recombinant lentiviruses? This should be indicated in the figure legend.

      5. Figure 2. In Fig. 2E-G, cells depleted of Rab32 and/or Rab38 should be compared to cells transduced with a control shRNA, such as a non-coding shRNA, and not to untransduced cells. The quantification of these data "per field" is quite concerning, given that a field could have very different numbers of cells. The data should be normalized to cell number or cell area.

      6. Figure 3.

      i. It should be noted in the text that the Lipi- dyes fluoresce in highly hydrophobic environments, and thus would indicate a cluster of lipid tails within a lysosome and not just an entire LD. Interpreting these spots as LD under lipase inhibitory conditions is a stretch.

      ii. The evidence that the Lipi-Blue labeled structures are actually inside of the lysosomal structures is not convincing. Three-dimensional reconstructions would be needed to make this more convincing.

      7. Suppl. Fig. S2. In panel A, there is no obvious difference in intensity of p62 under any of the conditions, and this reviewer does not see any LC3-II in the gel, only LC3-I with a very slight smear underneath that may or may not be specific. The interpretation that autophagy is increased at higher confluency is thus not well founded. In panel B, I see weak labeling of the interior of the giant Rab38-GFP-containing compartments for LC3-mRFP, as if the mRFP was in the process of degradation. How this correlates with the biochemistry in panel A is unclear.

      8. Fig. 4 and Suppl. Fig. S3.

      i. All of the graphs in Fig. S3 require appropriate statistical analyses.

      ii. The interpretation of the size of the structures in the double DKD sample is complicated by their accumulation in the perinuclear area, which is very dense. If all samples look like the one in Fig. 3A, then it is not possible to measure their size by this technique and that sample should remain unanalyzed. It is misleading to refer to these as large when they appear to be clusters of small puncta.

      iii. The label on the image itself in Fig. 3C should indicate Lysotracker, not "LRO". This is misleading.

      iv. The same concern raised above, that it is not clear whether the Lipi-Blue labeled structures are present within the lysosomal structures, applies here. Indeed, in the untransfected control, many of the LD structures appear to be present adjacent to (on one side of) the Lysotracker-labeled structures, as is also apparent in the shRab32 and shRab38 cells; those where they appear to be inside might simply be above them in these non-super-resolution images. This is a great example of how it is necessary to do 3D reconstructions to fully determine whether the Lipi-Blue structures are engulfed by or adjacent to lysosomes.

      v. Note, the LC3 flux experiment and identification of LC3-II and -I is correct in S4D, unlike the experiment in S2A.

      9. Fig. 5. The data in Figure 5A are incorrectly interpreted. PtdIns3P or PtdIns(3,5)P2 are present only on the cytoplasmic leaflet of endosomes and lysosomes; if those membranes were to be internalized, the phosphate would be removed. Thus, the presence of signal on the inside of the lysosomal structures does not indicate the presence of PtdIns3P or PtdIns(3,5)P2; it represents likely free mCherry, or perhaps the full conjugate with 2XFYVE, that has been engulfed by the lysosome and is no longer bound to its ligand. The observation that the mCherry signal accumulates near the Lipi-Blue signal in orlistat-treated cells thus cannot be interpreted as an interaction of the phosphoinositide with the LD or its content phospholipids or acyl chains. The disappearance of a punctate 2XFYVE signal is expected upon treatment with a PI3kinase inhibitor since it eliminates the ligand, and the failure of Lipi-Blue to accumulate in lysosomes of inhibitor-treated cells could reflect just about any defect in endolysosomal maturation since PtdIns3P is required for the early to late endosome transition as well as for several aspects of late endosome and lysosome biology. All this experiment shows is that uptake of Lipi-Blue labeled structures into lysosomes requires endolysosomal maturation. The same goes for the shVps4 experiments in Fig. 5B, which are also less convincing of any phenotype, and Fig. S5.

      Significance

      Because the conclusions are not supported by the data shown and because the authors exploit an immortalized cell type that does not mimic the behavior of the endogenous cells, the significance of the work as presented is very low. If the conclusions were justified, the advance could potentially be conceptual in showing that RAB32 and RAB38 redundantly functionalize lysosomes in some cell types to metabolize lipids through a mechanism distinct from macroautophagy. Such an advance would be of broad interest to investigators interested in the functions of lysosomes and lysosome-related organelles, as well as membrane trafficking machinery. However, the authors are unfortunately a long way from such an advance.

      My expertise is in the biogenesis of LROs, and I am considered a leading expert in the field. In my opinion, the authors require a functional readout unique to LROs to define the compartments shown as LROs. Otherwise, they might consider altering their language, abandon the LRO designation, and focus on mechanisms of fatty acid uptake promoted by RAB32 and/or RAB38 in appropriate cell types. Unfortunately, their own data show that the cell type used here is not such an appropriate cell type.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This study investigates the roles of Rab32 and Rab38 in hepatic lipid droplet metabolism. The authors propose that Rab32/38-positive lysosome-related organelles (LROs) mediate lipid droplet degradation through a mechanism independent of conventional macroautophagy. While the study addresses an interesting question, several conceptual and technical issues need to be addressed before the conclusions can be fully supported.

      Major Concerns

      1. The authors primarily define the Rab32/38-positive ring-like structures as "lysosome-related organelles (LROs)" based on their morphological characteristics and co-localization with LAMP1. However, this classification lacks biochemical validation. Would it be more appropriate to include a Lyso-IP assay to provide additional supporting evidence?

      2. In hepatocytes, what is the operational definition of LROs? Beyond being "larger in size," how are these structures functionally distinguished from conventional lysosomes? If Rab32/38 defines LRO identity, why does GFP-Rab32/38 not co-localize with all LAMP1-positive structures (Figure S1A)?

      3. In Figure 2A, the dextran pulse-chase experiment shows fluid-phase uptake into large vacuoles; however, dextran can enter any endocytic compartment after prolonged chase periods. What evidence supports that these structures are bona fide LROs rather than enlarged late endosomes or lysosomes resulting from long-term culture? What determines why only certain lysosomes become Rab32/38-positive? This heterogeneity is not explained. Does it imply that pre-existing lysosomes convert into LROs, or that LROs are newly formed under high-density stress? The developmental trajectory of these structures has not been explored.

      4. The authors propose a microautophagy mechanism based on the "invagination-like" structures observed by light microscopy (Figure 3A). However, the resolution of light microscopy is insufficient to distinguish true membrane invaginations from lipid droplets that are closely apposed to, or partially wrapped by, the outer membrane of LROs in three-dimensional space. Would a CLEM experiment be necessary to confirm that lipid droplets are indeed located within the lumen of LROs, rather than in deep invaginations that remain connected to the cytosol? In addition, multilamellar membrane structures were observed after Bafilomycin A1 treatment (Figure 3A). Have these structures been validated by electron microscopy, or could they simply represent complex membrane infoldings within swollen lysosomes? The conclusions drawn from light microscopy alone appear somewhat insufficient.

      5. The authors use ATG4B C74A overexpression to claim macroautophagy independence. However, while this mutant blocks LC3 lipidation, the study still lacks genetic evidence, such as ATG knockouts. In Figure S2B, the authors state that the "majority" of Rab38-positive LRO-associated lipid droplets are LC3-negative, but no quantitative data are provided.

      6. The manuscript does not clearly distinguish the functions of Rab32 and Rab38. Although the authors describe these proteins as paralogs with overlapping roles, multiple data points indicate that they have differential effects on lipid droplet (LD) metabolism. Notably, Rab38, but not Rab32, significantly affects LD delivery to acidic compartments, exerts a stronger influence on LRO size, and responds more robustly to VPS4B perturbation. These observations suggest that Rab32 and Rab38 regulate distinct steps of LD metabolism rather than functioning redundantly. However, the manuscript does not clearly highlight these functional differences and lacks mechanistic validation.

      7. Figure 5A shows that the PI3P probe (2×FYVE) forms ring-like structures inside or near the LRO membrane. However, LROs themselves are Rab5-negative (Figures 1C-E), and PI3P is typically generated by Vps34 on early endosomes. Where do these PI3P signals originate? Are they transported from other organelles, or is there a local PI3P-generating mechanism on the LRO membrane? If the latter, which kinase is responsible, and is Vps34 recruited to the LRO membrane? This issue is not discussed. If PI3P is indeed locally generated on LROs, it could represent a key feature distinguishing LROs from classical lysosomes.

      Minor Concerns

      1. The double-knockout mice exhibit obesity and fatty liver; however, Rab32 and Rab38 are expressed in multiple tissues. A whole-body knockout model cannot distinguish whether these effects are hepatocyte-autonomous or arise from contributions by adipose tissue or macrophages, emphasizing the need for liver-specific knockout animals or cell models. Serum TAG levels were unchanged, and the authors speculate that VLDL secretion may be impaired, but this was not directly tested. Furthermore, the authors do not address the observed sex-specific effects, which appear to be male-specific.

      2. The concentration of Orlistat used is relatively high (50-200 μM) and may cause non-specific effects. Have dose-response experiments been performed, or have other LAL inhibitors (e.g., Lalistat) been tested?

      3. LysoTracker reflects acidity rather than lysosome identity, and reduced acidification in DKD cells may affect co-localization analysis.

      Significance

      Assessment of Significance Overall Assessment

      Strengths:

      Conceptual novelty: Introduces lysosome-related organelles (LROs) into hepatic lipid metabolism, expanding the functional repertoire of Rab32/38 beyond pigment cells and macrophages.

      Mechanistic exploration: Links LD uptake to PI3P/PI(3,5)P2 signaling and VPS4B, providing molecular handles for future studies.

      In vivo validation: DKO mice show age-dependent obesity and HFD sensitivity, establishing physiological relevance.

      Weaknesses:

      Rab32 vs. Rab38 functions remain blurred: Data suggest differential roles (Rab38 in LD delivery, Rab32 in LD size regulation), but authors default to "redundancy" narrative.

      Microautophagy evidence incomplete: Relies on light microscopy; EM/CLEM needed to confirm true internalization.

      Model relevance unclear: High-confluence AML12 vacuoles lack clear physiological correlate in healthy liver.

      Audience

      Primary:

      Lysosome biologists

      Autophagy researchers

      Lipid metabolism researchers

      Secondary:

      Cell biologists

      Metabolic disease researchers

      Geneticists

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      1) Summary

      This study investigates the mechanochemistry of Arp2/3-mediated branched actin networks at the level of individual branch junctions under load. Using microfluidic single-filament/branch force assays (including constant-force flow and open-chamber imaging), the authors quantify debranching, re-nucleation, and mother- vs daughter-interface stability across nucleotide states of Arp2/3 (ADP-Pi, ADP, and an ADP-BeFx proxy for ADP-Pi). They further test the effects of two branch regulators (GMF and cortactin). Key findings include: (i) ADP-Pi and ADP complexes share similar force dependence but differ markedly (~20×) in intrinsic dissociation rate; (ii) phosphate turnover on the Arp2/3 complex is rapid, and affinity for Pi drops when Arp2/3 loses its daughter filament; (iii) quantification from model fits uncovers large stability differences between daughter and mother interfaces of the Arp2/3 complex; (iv) extraordinarily high stability of ADP-Pi-like Arp2/3 on the mother filament; and (v) distinct effects of GMF and cortactin on force-dependent stability. Overall, the work combines technically demanding measurements with mechanistic modeling to probe how nucleotide state and regulatory factors tune branch mechanics.

      2) Major comments:

      1. Low force kinetics and completeness of survival curves (Figure 1). "For all forces, the surviving curves exhibited a clear single exponential behavior...." While the data can be fitted to monoexponential decay curves, data at low forces is clearly incomplete. >90% of branches have not dissociated by the end of the experiment. For the particular data shown in 1C (F00nN, n=60 total branches) it means that the time information is coming from

      Essential; experiment might already be performed. Otherwise straightforward to do (weeks time).

      In figure 1B, we indeed show a survival curve for ADP-Arp2/3 complex branch dissociation at 0 pN up to 900 seconds. As now shown in updated supp figure S2, the data were in fact acquired for at least 5000 seconds for ADP-Arp2/3 and ADP-Pi states (N=2 repeats for each condition, with n = 60 and 90 branches for ADP-Arp2/3 branches, and 90 and 132 branches for ADP-Pi-Arp2/3 branches). The debranching rates reported in the initial submission were already obtained by fitting the survival curves over the whole duration of the experiments.
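
      The point about incomplete survival curves can be made concrete with a small sketch. It assumes single-exponential debranching and right-censoring at the end of the acquisition; the function name and the numbers are hypothetical and are not data from the study. Under these assumptions, the maximum-likelihood rate is the number of observed dissociation events divided by the total time at risk, so branches that never dissociate still contribute to the estimate:

```python
import math

def censored_exponential_rate(event_times, n_censored, t_end):
    """Maximum-likelihood rate for exponential decay with right-censoring.

    event_times: times (s) at which branches were seen to dissociate
    n_censored:  branches still attached when the acquisition stopped
    t_end:       acquisition duration (s)
    Returns (rate, standard_error) in 1/s.
    """
    n_events = len(event_times)
    if n_events == 0:
        raise ValueError("no events observed: only an upper bound is available")
    # Every branch contributes the time during which it was observed attached.
    time_at_risk = sum(event_times) + n_censored * t_end
    rate = n_events / time_at_risk
    # Asymptotic standard error of the estimate: rate / sqrt(number of events)
    return rate, rate / math.sqrt(n_events)

# Hypothetical illustration: 60 branches, 5 of which dissociate within 900 s
events = [120.0, 340.0, 510.0, 700.0, 880.0]
rate, se = censored_exponential_rate(events, n_censored=55, t_end=900.0)
print(f"k = {rate:.2e} /s +/- {se:.2e} /s")
```

      In this sketch the relative uncertainty scales roughly as one over the square root of the number of observed events, which is why a longer acquisition that captures more dissociation events tightens the estimate.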

      1. Stability Analysis (Figure 4). I can follow much of the arguments presented in the stability analysis of the daughter vs mother interfaces, which is in principle extremely interesting! However, there are some concerns here:

      i) The authors emphasize the zero force ratio derived from fits (which is linked to the stability difference of the two interfaces in the absence of force) despite this being only weakly constrained by data. Intuitively in the model, the stability difference should grow to very large values as the re-nucleation ratio approaches 1 at low force. This combined with the noise in the data poses an issue in my opinion. Looking at the data and the error margin, I think that the authors cannot state with high confidence that there is a real difference between the relative stability of the daughter and mother interfaces between the two nucleotide states of the complex.

      Essential; analysis and textual revision only

      We thank the reviewer for this comment. The difference in stability between the two interfaces is strongly constrained by the shape of the branch renucleation ratio versus force curve, and its value at 0 pN. This is illustrated in the figure shown below (new Supp Fig. S8), showing the dissociation rates of the two interfaces (in 'dashed' and 'point-dashed' style) that contribute to the overall debranching rate in each nucleotide condition. Despite the limited force range at which we probed the debranching rate, the branch renucleation ratio curve informs us on which interface is the weakest, and how this evolves with force.
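
      The reasoning in this paragraph can be sketched numerically. The snippet assumes that each interface behaves as an independent Bell-Evans slip bond and that the two dissociation pathways act in parallel, so that the overall debranching rate is the sum of the two interface rates. All parameter values are hypothetical placeholders, not the fitted values of the manuscript, and the fraction computed here is only the share of dissociation events occurring at the daughter interface, not the measured renucleation ratio itself.

```python
import math

KBT = 4.11  # thermal energy in pN*nm, approximate value near room temperature

def slip_rate(force, k0, distance):
    """Bell-Evans slip bond: k(F) = k0 * exp(F * d / kBT)."""
    return k0 * math.exp(force * distance / KBT)

def debranching(force, daughter, mother):
    """Return (total rate, fraction of events at the daughter interface).

    daughter, mother: (k0 in 1/s, distance to the transition state in nm)
    """
    k_d = slip_rate(force, *daughter)
    k_m = slip_rate(force, *mother)
    total = k_d + k_m
    return total, k_d / total

def crossover_force(daughter, mother):
    """Force at which both interfaces dissociate at the same rate."""
    (k0_d, d_d), (k0_m, d_m) = daughter, mother
    if d_m == d_d:
        raise ValueError("equal distances: the rate ratio does not depend on force")
    return KBT * math.log(k0_d / k0_m) / (d_m - d_d)

# Hypothetical parameters: daughter interface 100x weaker at zero force,
# mother interface more force sensitive (larger distance).
daughter = (1e-3, 1.0)
mother = (1e-5, 2.5)

for f in (0.0, 3.0, 6.0):
    total, frac = debranching(f, daughter, mother)
    print(f"F = {f:.0f} pN: k = {total:.2e} /s, daughter fraction = {frac:.3f}")
print(f"crossover force = {crossover_force(daughter, mother):.1f} pN")
```

      In such a model, the zero-force value of the fraction reflects the ratio of the two intrinsic rates, and the way the fraction changes with force reflects the difference between the two distances.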

      We have assessed the confidence intervals of the parameters obtained from the fits, taking into account the error bars on our experimental datapoints. It seems to indicate that the simultaneous fits of the debranching rate and the branch renucleation ratio curves indeed constrain the parameters quite strongly. These confidence intervals are now reported in the main text and in the summarizing table.

      We have repeated branch renucleation experiments for ADP-BeFx- and ADP-Pi-Arp2/3 complex branches (see new figure 4C&D, and our response to the next point). We believe these new measurements allow a better assessment of the relative stability between the two interfaces for Arp2/3 complex branch junctions in the ADP-BeFx state.

      Still, we agree with the reviewer that the dispersion of the experimental data does not allow us to have strong confidence in the crossover force and relative stability difference of the interfaces. Therefore, we have slightly toned down the way we present and discuss the differences in stability when comparing the two nucleotide states.

      ii) For ADP-Pi, the renucleation ratio essentially remains flat over the measured force range. Hence, the data can only provide little leverage to estimate both the zero force ratio and, more importantly, the differential distance to the transition state in the slip-bond model in my opinion, which will show in the crossover force. Consequently, the quoted ">100×" stability difference at F=0 and the crossover force >20pN are driven largely by extrapolation rather than direct constraint by data. Given the high number of free parameters in the model, I would anticipate that several crossover forces and differential distances might explain the data nearly equally well. Instead of loosely reporting exact number from fits, I would have hoped for some sort of sensitivity analysis, for instance relying on profile likelihoods. Also parameter values could be reported as bounds (e.g crossover force≫measured range) rather than precise point estimates. This issue re-occurs (albeit not as drastically) for the cortactin experiments (Figure 6).

      Essential; analysis and textual revision only

      As mentioned in our response to the previous point, we have repeated renucleation experiments for ADP-BeFx-Arp2/3 complex branches (and also for Arp2/3 complex branches in the presence of 50 mM Pi) (see new figure 4C&D) to better characterize the differential distance to the transition state. The crossover force for the ADP-BeFx state is now 13.5 pN and the stability of the two interfaces differs roughly 100-fold.

      We agree with the reviewer that the dispersion of the experimental data does not allow us to have strong confidence in the crossover force and relative stability difference of the interfaces. We have thus toned down the way we report these values. We do believe, though, that the difference we report between the ADP and ADP-BeFx states appears to be significant and needs to be acknowledged.

      As a side note, it has proven to be challenging to pull on branches at forces higher than 7 pN. To apply a large force on the branch junction, we need to have a high flow rate. In this case, the height of the filaments (both mother and daughter filaments) above the surface seems to deviate from what we established in our previous studies (Jegou et al, Nat. Comm. 2013 & Wioland et al, PNAS 2019). This may originate from the fact that branched filaments have a more complex shape than an individual filament. Characterizing accurately the evolution of the branch height as a function of the flow rate and applied force would require quite extensive additional characterization, which, we believe, is beyond the current focus of this study on the stability of Arp2/3 complexes.

      iii) One important expectation from the "two slip bond" model is that branch dissociation rates should not necessarily scale mono-exponentially as they mostly do over the accessible force range of the paper. However, once the "minor" pathway of dissociation from the mother starts to dominate at high forces, rates become more force sensitive. This is nicely recaptured by the model fits in Figure S6 but deserves some explanation in the text. Otherwise, people will simply remember the "ADP-Pi is 20-fold more stable than ADP at all forces" message.

      Essential; textual revision only

      We have now rephrased the key sentences (in the Abstract and Results sections) to more clearly state that the debranching rate is not increasing mono-exponentially with force.

      In the Abstract: "Remarkably, we find that branch junctions are over 30-fold more stable when the Arp2/3 complex is in the ADP-Pi rather than ADP state, and that force accelerates debranching with similar exponential factors in both states."

      In the Results section: "The debranching rate seems to increase exponentially with the applied pulling force, in the range of 0 to 6 pN (Fig. 1F; see more refined analysis below). This behaviour is predicted by the Bell-Evans model for a slip bond."
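
      For readers unfamiliar with the Bell-Evans expression for a slip bond, k(F) = k0·exp(F·d/kBT), the following minimal sketch shows how a zero-force rate and a distance to the transition state can be read off a straight-line fit of ln(k) versus force. The rates are synthetic values generated for illustration only, and the helper name is hypothetical.

```python
import math

KBT = 4.11  # thermal energy in pN*nm, approximate value near room temperature

def fit_bell_evans(forces, rates):
    """Least-squares line through (F, ln k); returns (k0 in 1/s, distance in nm)."""
    n = len(forces)
    logs = [math.log(k) for k in rates]
    mean_f = sum(forces) / n
    mean_l = sum(logs) / n
    sxx = sum((f - mean_f) ** 2 for f in forces)
    sxy = sum((f - mean_f) * (l - mean_l) for f, l in zip(forces, logs))
    slope = sxy / sxx                  # equals d / kBT for a slip bond
    intercept = mean_l - slope * mean_f
    return math.exp(intercept), slope * KBT

# Synthetic rates generated from known (hypothetical) parameters
forces = [0.0, 1.0, 2.0, 4.0, 6.0]    # pN
true_k0, true_d = 2e-3, 1.2           # 1/s, nm
rates = [true_k0 * math.exp(f * true_d / KBT) for f in forces]

k0, d = fit_bell_evans(forces, rates)
print(f"k0 = {k0:.2e} /s, d = {d:.2f} nm")
```

      On such a semi-log plot, a single slip bond gives a straight line, whereas an upward bend at high force would be the signature of a second, more force-sensitive pathway taking over.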

      iv) One important prerequisite for the model is that isolated Arp2/3 complexes (without a daughter filament) should dissociate with equal rates from mother filaments at all flow rates. Since the Arp2/3 complex prefers mother filament curvature, forces experienced by the mother might change its off-rate. It would be good to refer to this assumption in the text and experimentally verify it. I could not find it in the paper nor in Ghasemi et al 2024.

      Essential; simple experiment (a weeks time).

      We thank the reviewer for this important comment.

      First, we investigated whether the viscous drag force, applied on the ADP-Arp2/3 complexes which remain bound to mother filaments could affect their stability. We have performed branch renucleation experiments at different flow rates but with the same pulling force on branch junctions (average force 3.9 pN) by adapting the length of the daughter filament. As shown in new supp. figure S11 (shown below), we did not observe any significant differences between 'low' and 'high' flow rates. If the off-rate of the surviving Arp2/3 was significantly affected by the flow, this would have led to a variation of the renucleation ratio with the flow rate.

      Second, we have investigated the impact of the tension experienced by the mother filament at the location of the branch junction for ADP-Arp2/3 complex branches, with the same pulling force on the branches (average 4.1 pN pulling force on branches). We have quantified the debranching rate from three groups of branches depending on their position along mother filaments. As shown in new supp. figure S12 (shown below), we can observe a small trend, where the debranching rate decreases with the tension on the mother filament at the branching point.

      Doubling the tension on the mother filament from 15 to 30 pN decreases the debranching rate by a third. However, pairwise logrank tests performed between the survival fractions of the three binned groups do not report any statistically significant difference (all p values > 0.05). One possible explanation for this is that the height of the mother filament in the microfluidics flow increases linearly from the anchoring point to the free barbed end. As a consequence, the pulling force on the branches will be higher, as branches experience faster flows.
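
      As an illustration of the kind of comparison described here, the following is a hand-written sketch of a two-sample logrank test using only the Python standard library. It is not the software used for the analysis, and the observation times are hypothetical. Each branch has an observation time and a flag indicating whether it dissociated (1) or was still attached when observation stopped (0).

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample logrank test.

    times_*:  observation time of each branch (s)
    events_*: 1 if the branch dissociated at that time, 0 if censored
    Returns (chi-square statistic, p-value) with one degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Branches still under observation just before time t
        at_risk = [g for (u, _, g) in data if u >= t]
        n = len(at_risk)
        n_a = sum(1 for g in at_risk if g == 0)
        d = sum(1 for (u, e, _) in data if u == t and e)
        d_a = sum(1 for (u, e, g) in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical groups of branches binned by mother-filament tension
low_t, low_e = [50, 120, 200, 300, 300], [1, 1, 1, 0, 0]
high_t, high_e = [80, 150, 260, 300, 300], [1, 1, 0, 0, 0]
chi2, p = logrank_test(low_t, low_e, high_t, high_e)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")
```

      With three binned groups, the test would be run on each pair of groups, and a correction for multiple comparisons may be appropriate.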

      For these same groups, upon branch dissociation, all remaining-bound Arp2/3 complexes are exposed to the same flow rate; the branch renucleation ratios were similar. Thus branch renucleation ratio seems to not significantly depend on the tension experienced by the mother filament at the branching point.

      Similarly, Pandit et al PNAS 2020, Extended figure S1, also reported no detectable impact of the mother filament tension on the debranching rate in their assay.

      v) The force dependence of the branch re-nucleation rate (Fig 3D) has been measured previously by the same group (Ghasemi et al). While the data in the older paper has not been fitted by a model, the trend of the data in the previous paper looks conspicuously different. Are there any explanations for this? I speculate that it might be related to actin and ATP not being saturated (low-force re-nucleation rate rarely exceeds 80%) in Ghasemi et al., but it would be good to know what the authors think about this. Essential; textual revision only

      This is a good point. We have plotted the data of the renucleation ratio from ADP-Arp2/3 complex from figure 1F of Ghasemi et al, Sc. Adv. 2024 (performed at 0.3 and 1 µM actin), together with the data of the current study from figure 4D (performed at 1.5 µM actin). We feel this comparison could be of interest to the readers, and have thus integrated it in the manuscript as new supp. figure S13 (shown below).

      As expected, the branch renucleation ratio is lower with lower concentrations of actin. The experimental data points from Ghasemi et al are similarly well fitted by the branch renucleation function obtained for 1.5 µM multiplied by a scaling parameter, which reflects the fact that the branch renucleation ratio is actin concentration dependent (Fig. 6A in Ghasemi et al). This scaling parameter was the only free parameter of those fits.

      Since the branch renucleation ratio depends on the actin concentration as 0.97 · kon · ([actin] - Cc) / (kon · ([actin] - Cc) + koff ATP-Arp2/3), with kon = 3.4 µM-1.s-1 and koff ATP-Arp2/3 = 0.66 s-1 from (Ghasemi et al. 2024), the scaling parameters obtained by the fits give estimates of the actin concentration in these experiments of 0.6 (±0.05) and 0.9 (±0.2) µM for the experiments performed at 0.3 and 1 µM, respectively, in (Ghasemi et al. 2024).
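
      To make this dependence concrete, the following Python sketch evaluates the renucleation ratio as a function of actin concentration and inverts it to recover an effective concentration from a scaling parameter. The rate constants are the ones quoted above. The critical concentration `CC_UM` is not stated in this reply and is a placeholder assumption, and the function names are illustrative only.

```python
# Constants quoted in the reply (from Ghasemi et al. 2024).
K_ON = 3.4          # µM^-1 s^-1
K_OFF_ATP = 0.66    # s^-1, ATP-Arp2/3 departure rate
PLATEAU = 0.97      # prefactor of the renucleation ratio

# ASSUMPTION: critical concentration in µM; placeholder value, not from the reply.
CC_UM = 0.1


def renucleation_ratio(actin_um, cc_um=CC_UM):
    """Renucleation ratio 0.97 * g / (g + koff), with g = kon * ([actin] - Cc)."""
    growth = K_ON * max(actin_um - cc_um, 0.0)
    return PLATEAU * growth / (growth + K_OFF_ATP)


def effective_actin(scale, reference_um=1.5, cc_um=CC_UM):
    """Actin concentration whose ratio equals scale * ratio(reference_um)."""
    target = scale * renucleation_ratio(reference_um, cc_um)
    if not 0.0 <= target < PLATEAU:
        raise ValueError("scaled ratio must lie in [0, 0.97)")
    # Invert r = P * g / (g + koff)  ->  g = r * koff / (P - r)
    growth = target * K_OFF_ATP / (PLATEAU - target)
    return cc_um + growth / K_ON
```

      With a scaling parameter of 1 the inversion returns the reference concentration of 1.5 µM; smaller scaling parameters map to lower effective concentrations. The numerical result depends on the assumed value of `CC_UM`.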

      1. Stability of the authentic ADP-Pi-Arp2/3 complex on the mother filament. The extraordinary stability of the isolated ADP-BeFx-Arp2/3 complex on mother filaments is surprising, especially considering that both ATP and ADP states are much more labile (Ghasemi et al 2024). I would recommend repeating this experiment in the authentic ADP-Pi state with labelled Arp2/3 complexes as a more direct readout, even if this would require working with very high phosphate concentrations.

      Essential; simple experiment (a week's time).

      We have followed the recommendation of the reviewer and have performed new experiments using fluorescent Arp2/3 complexes for ADP, ADP-BeFx and ADP-Pi states, now displayed in new figure 5C (also shown below).

      For fluorescent Arp2/3 complexes remaining bound to the mother filament, the Arp2/3 complex - mother filament interface is ~ 100 times more stable in the ADP-BeFx state (0.0046 s-1) compared to the ADP state (0.56 s-1). We also assessed the dissociation of surviving ADP-BeFx-Arp2/3 complexes using unlabelled Arp2/3 complexes (previously in figure 4B, repeated experiment shown in new supp. figure S10), which also indicates a remarkable stability.

      The dissociation curve of surviving Arp2/3 complexes in the presence of 50 mM Pi and 200 µM ATP in solution reflects a mixture of Arp2/3 complexes dissociating in the ADP/ATP state and ADP-Pi-Arp2/3 complexes that can either dissociate in the ADP-Pi state or lose their Pi and dissociate in the ATP state. Despite the presence of 50 mM Pi, ADP dissociation and ATP reloading are much faster than Pi binding. Fitting this survival curve with a function that accounts for the two initial populations and the evolution of the ADP-Pi population (see Methods) gives a good estimate of the Pi release rate.
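
      As an illustration of what such a function can look like, the Python sketch below implements one possible two-population survival model. It is not necessarily the function used in the Methods. It assumes that Pi release is irreversible, that ADP/ATP exchange is fast enough for the ADP and ATP states to be lumped into one state with a single departure rate, and that all parameter names are hypothetical.

```python
import math


def surviving_fraction(t, f_pi, k_rel, k_off_pi, k_off):
    """Fraction of Arp2/3 complexes still bound to the mother filament at time t.

    f_pi     : initial fraction of complexes in the ADP-Pi state (assumed parameter)
    k_rel    : Pi release rate from surviving ADP-Pi-Arp2/3 (s^-1)
    k_off_pi : departure rate in the ADP-Pi state (s^-1)
    k_off    : departure rate in the lumped ADP/ATP state (s^-1)
    """
    k_out = k_rel + k_off_pi  # total exit rate from the ADP-Pi state
    still_pi = math.exp(-k_out * t)
    # Complexes that released Pi and have not yet left the mother filament.
    if math.isclose(k_out, k_off):
        converted = k_rel * t * math.exp(-k_off * t)
    else:
        converted = k_rel / (k_off - k_out) * (
            math.exp(-k_out * t) - math.exp(-k_off * t)
        )
    initially_adp = (1.0 - f_pi) * math.exp(-k_off * t)
    return initially_adp + f_pi * (still_pi + converted)
```

      In a fit, `k_off` could be fixed to the independently measured ADP-state departure rate, leaving `f_pi`, `k_rel` and `k_off_pi` as free parameters. Setting `f_pi` to 0 reduces the model to a single exponential.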

      OPTIONAL: Further, but beyond the scope of the present paper, would be titrating phosphate in these experiments, which would even allow the authors to independently verify the reduced Pi affinity for Arp2/3 in the mother filament. Of note, this affinity difference is needed to satisfy detailed balance in the reaction scheme (Fig 4 D)!

      We thank the reviewer for this suggestion. High concentrations of phosphate in the buffer render glass surfaces quite sticky in our assays. We've tried several different passivation strategies (BSA, PLL-PEG, K-casein, ...) but none gave satisfactory results. So titrating phosphate, by going beyond 50 mM phosphate, proved to be quite challenging.

      Detailed balance, considering the two possible routes connecting the ADP-Pi-Arp2/3 complex branch junction state and the surviving ADP-Arp2/3 complex state, can be written as K(Pi rel., branch junction) · K(debranching, ADP-Arp2/3) = K(debranching, ADP-Pi-Arp2/3) · K(Pi rel., surviving Arp2/3). Some of these affinity constants are not known, because of the inability to determine reverse reaction rates such as the rebinding of a daughter filament to a surviving Arp2/3. It is thus hard to determine how the affinity of Pi for Arp2/3 complex changes between Arp2/3 complexes at branch junctions and surviving Arp2/3 complexes on mother filaments.

      While we cannot determine the affinity constant of Pi for a surviving Arp2/3 complex, our data indicate that the dissociation rate of Pi is higher from Arp2/3 complexes at branch junctions (koff = 0.21 s-1) than from surviving Arp2/3 complexes (koff = 0.05 s-1). This unexpected finding indicates that surviving Arp2/3 complexes adopt a conformation where the nucleotides are readily exchanged, but where the 'back door' for Pi release is less open. We now discuss this point in our revised manuscript.

      1. Importance of "surviving" ADP-Pi-Arp2/3 complexes. The authors show a) rapid turnover of Pi on the ADP-Arp2/3 complex in both branch- or mother filament-bound state and b) the lowered Pi affinity of the latter. Nonetheless, they emphasize the importance of long-lived "surviving" ADP-Pi bound complexes on the mother (even stated in the abstract). I understand that this fraction shows under some experimental conditions (BeFx), but unless I am missing something, most complexes should rapidly lose their phosphate and either exchange nucleotide or dissociate from the mother under physiological conditions. Please clarify or tone down.

      Essential; textual revision only

      We thank the reviewer for their remark. We have tried to clarify this aspect in the manuscript.

      The departure rate of fluorescent surviving Arp2/3 complexes, together with the branch renucleation data, now shows that surviving ADP-Pi-Arp2/3 complexes are quite stable on mother filaments: they detach and release their Pi slowly, such that branch regrowth will occur provided there is actin in solution. In the absence of actin monomers, as the reviewer correctly points out, the surviving ADP-Pi-Arp2/3 will predominantly release its Pi and thus become a surviving ADP-Arp2/3 complex. We have modified the text to avoid any confusion.

      1. GMF mechanism. The authors claim that GMF "...accelerates the departure of the surviving Arp2/3 complex from the mother...". I assume that they infer this from decrease in the re-nucleation ratio. However, alternatively GMF could simply dwell on the complex, inhibiting re-nucleation without promoting dissociation from the mother. The authors should either monitor Arp2/3 dwell times directly to discriminate between these possibilities or be more cautious in their conclusions.

      Essential; simple experiment (a week's time) or textual revision.

      In Ghasemi et al. Sci. Adv. 2024, we examined the departure of Arp2/3 from the mother filament after GMF-induced debranching using fluorescent Arp2/3. Most of the fluorescent Arp2/3 dissociated from mother filaments within the same frame as the branch, i.e. within 0.5 seconds after the debranching event, and none were visible after another second. This could be due to Arp2/3 departing with the branch or an accelerated departure after branch dissociation. In any case, this rules out the possibility that GMF would dwell on the surviving complex for a substantial amount of time without promoting dissociation from the mother.

      In the present manuscript, we now show that increasing the ATP concentration 10-fold (from 0.2 to 2 mM) is sufficient to restore the branch renucleation ratio to its level without GMF. This shows that GMF does not cause Arp2/3 to leave with the branch, but rather that it (also) acts on the surviving Arp2/3 complex, in a way that is countered by high concentrations of ATP. More specifically, it suggests that GMF accelerates the departure of the surviving ADP-Arp2/3 complex, either directly or by hindering the reloading of ATP, and that GMF does not affect the surviving Arp2/3 complex once it has reloaded ATP.

      We now discuss these two non-mutually exclusive possibilities for the accelerated dissociation of the surviving ADP-Arp2/3 complex in the manuscript.

      6. Cortactin mechanism and the "leash model". I must say that the cortactin data are the most puzzling part of the paper and hard to reconcile with what we know from structure. I was hoping to find some of this resolved in the discussion. However, I do not understand the "leash model" in the discussion section for cortactin-mediated branch stabilization: "This would explain the observed increase in branch survival compared to the absence of cortactin. As the pulling force is increased, this rebinding mechanism becomes less efficient." According to my understanding of the data, this is opposite to what happens. Cortactin only stabilizes the labile interface at elevated forces! Some re-writing might help here.

      Essential; textual revision.

      We thank the reviewer for having us think more thoroughly about the model we initially proposed. We now believe that our 'leash' mechanism is not able to fully recapitulate our observations in a simple and satisfactory manner.

      We now propose a much simpler model, where the binding of cortactin to the Arp2/3 complex at the branch junction simply changes the energy landscape of the Arp2/3-daughter interface without the need to invoke a rebinding of the daughter filament upon branch departure. We have updated our interpretation of the data in the Discussion section accordingly.

      Overall, our results on the impact of cortactin on branch renucleation highlight a surprising behaviour that would require further investigation to fully decipher the underlying molecular mechanism.

      3) Minor comments

      Organization: - I do not want to impose on how to best tell the story, but I felt that Fig1 A-D and Fig 2 A-B belong to one logical unit (nucleotide dependence), whereas Fig 1 E-F and Fig 2 C belong to the other (Pi binding and exchange). Perhaps consider re-organizing to streamline presentation?

      We thank the reviewer for their suggestion. We agree that it flows more naturally as suggested, and have made the changes! Thank you.

      Semantics/Typos: - Abstract: "... ADP-Pi and ADP-Arp2/3 detach with the same exponential increase as a function of force...". Increase should refer to the dissociation rate, which should be added to the sentence.

      We have corrected this.

      Results page 8: "...and the majority of Arp2/3 complexes detach from the mother filament while remaining bound to the branch at the debranching time." "Branch" should likely be daughter here, as there is no branch after dissociation of either interface.

      We have corrected this, thank you.

      Results page 13: "Exposing ADP-BeFx-Arp2/3 complex branch junctions to a saturating amount of GMF...". It is strange to imply saturation, because GMF likely simply does not bind to the complex in this nucleotide state with appreciable affinity. Suggest to change to "high".

      We have made the changes accordingly.

      Discussion page 18: "Moreover, in mammalian Arp2/3, His80 in Arp3 (corresponding to His73 in mammalian actin) is not methylated, and corresponds to residue N77 in Arp3, which is also not modified." N77 likely belongs to Arp2?

      We have made the changes accordingly.

      Discussion page 19: "We showed that Pi affinity for Arp2/3 complexes at branch junctions is around 3.7 mM (Fig. 1), a value which lies within the reported 1-10 mM Pi concentration measured in the cytosol in different mammalian cell types". Notably, this is not too different from F-actin, which should be mentioned. By this measure alone, free inorganic phosphate could also directly regulate actin filament stability!

      We now mention this and discuss that intracellular Pi can also impact actin filament nucleotide state.

      Future interest (non essential): - It would be utterly exciting (but beyond current scope) to quantify how instantaneous debranching rates evolve for naturally aging branches starting from ATP-Arp2/3 complexes!

      We thank the reviewer for this remark. It is indeed quite beyond the scope of the current study, as this would require a way to probe ATP-Arp2/3 complex branches while daughter filaments are still quite short (so pulling on them is difficult). An interesting alternative could be to use ATP analogs, such as App-NHp (aka AMP-PNP), to stabilize this state. However, some studies have mentioned that App-NHp is not very stable.

      Significance

      General assessment:

      This is a compelling and carefully executed study that delivers a clear mechanistic framework for how Arp2/3 branch junctions fail and re‑form under load. The central strength is the tight integration of state‑of‑the‑art reconstitutions with careful and original kinetic analysis. The experimental design is elegant and experiments have been carried out to a masterful standard. The figures are clear, the statistics are appropriate with some exceptions as detailed above. There are very few labs in the world that could have achieved this feat!

      A few aspects could be further strengthened, most notably the explanation and application of the "two slip bond" model as well as slightly more restraint in speculating around specific regulatory mechanisms. However, these are minor refinements that do not detract from the important contributions of the paper.

      Overall, the work clearly merits publication with high priority after revision; most requested changes are textual/analytical with very few targeted experiments, which would substantially strengthen core claims.

      We thank the reviewer for their positive evaluation of our manuscript. We hope that our responses to the detailed points above, along with the corresponding revisions of the manuscript, will alleviate their concerns.

      Advance relative to prior literature: The major novel findings of the paper are already summarized above. There is some recent work done on the subject of branch mechanics by the authors (Ghasemi et al 2024, PMID: 38277459) and others (Pandit et al 2020 PMID: 32461373), but the focus of the present work is clearly unique and there is plenty of novel insight.

      Audience and impact: Primary audience: specialists in cytoskeleton dynamics, in vitro reconstitution, single-molecule biophysics, and mechanobiochemistry. Secondary: researchers in cell motility, morphogenesis and mechanobiology, physicists working on active matter and modelers studying force-producing and load-bearing biopolymer networks. The results and analysis framework should inform quantitative models of branched network turnover under load and the interpretation of regulatory factor action in vivo and in cells.

      Reviewer expertise: Actin dynamics; biochemical reconstitution; single molecule approaches; biophysics.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Xiao et al examine the molecular events occurring when Arp2/3 complex-mediated actin filament branches are removed from mother actin filaments. They do this using a microfluidics assay with purified proteins combined with single filament TIRF imaging of branched actin filaments with distinct fluorescent labels. The contributions of different nucleotide states of Arp2/3 complex are tested in relation to the force exerted on the branches and to the involvement of the regulatory proteins GMF and cortactin. The data seem comprehensive and highly quantified in response to concentration, force, fraction of branches and survival times and branching rates. They find that ADP-BeFx and high phosphate concentrations (leading to the ADP-Pi state) lead to a slower debranching rate at a given level of force applied. The ability to rapidly switch the buffer gives powerful information about response times of debranching compared with other actin remodelling events. They use renucleation experiments to determine that the previous debranching event most often occurs at the Arp2/3 complex/daughter interface, showing that filaments will be ready to re-branch in the stable ADP-Pi bound state. GMF addition allows debranching of the ADP state to occur at a lower force. Cortactin acts similarly to the ADP-Pi state to increase branch stability.

      Specific comments

      The pulling force on the branches seems to arise from different flow rates in the microfluidics. Viscous drag is mentioned and I can see there is methylcellulose in the buffer. It would be helpful to have the explanation of the conversion between flow and force, even if it has been standard in previous work.

      We apologize if this was unclear: in microfluidics experiments, the buffer does not contain methylcellulose. Methylcellulose is only used for 'open chamber' experiments, where no force is applied to Arp2/3 branches, to maintain them in the TIRF field of excitation (Figure S2).

      To better clarify the conversion between flow and force, we have rephrased and extended the Methods section to explain how the force on the branch junction is computed based on the local flow velocity and the length of the daughter filament.
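
      As a rough illustration of this conversion, the Python sketch below assumes that the viscous drag on a daughter filament is proportional to the local flow velocity and to the filament length. This is a simplification and the calibration described in the Methods may differ. The friction coefficient is a placeholder assumption, not a value taken from this reply, and the function names are hypothetical.

```python
# Friction coefficient per unit filament length, in pN·s/µm².
# ASSUMPTION: placeholder value for illustration only.
FRICTION_PN_S_PER_UM2 = 6e-4


def branch_force_pn(local_velocity_um_s, daughter_length_um,
                    friction=FRICTION_PN_S_PER_UM2):
    """Viscous drag on the daughter filament, transmitted to the branch junction."""
    return friction * local_velocity_um_s * daughter_length_um


def length_for_force_um(target_force_pn, local_velocity_um_s,
                        friction=FRICTION_PN_S_PER_UM2):
    """Daughter filament length that yields the target force at a given velocity."""
    return target_force_pn / (friction * local_velocity_um_s)


# Same pulling force at two flow velocities, obtained by adapting the length.
for velocity in (500.0, 2000.0):
    length = length_for_force_um(3.9, velocity)
    print(velocity, length, branch_force_pn(velocity, length))
```

      Under this linear assumption, a four-fold higher local velocity requires a four-fold shorter daughter filament to keep the force on the junction unchanged, which is the principle behind comparing flow rates at constant force.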

      Pg 5 - what was the motivation to titrate phosphate? It seems a stretch that intracellular Pi levels are tuning branching inside cells more than protein-mediated control (GMF or cortactin) - can the authors evidence this at all?

      We are not claiming that the level of Pi plays a stronger regulatory role than proteins. We show that inorganic phosphate tunes the state of the Arp2/3 complex, which in turn modulates the action of regulatory proteins, such as GMF and cortactin.

      Nonetheless, we do show that the contribution of inorganic phosphate is quite central as it can (1) strongly stabilize branch junctions (~30-fold decrease in the dissociation rate), and (2) tune the activity of GMF and cortactin on Arp2/3 complexes at branch junctions as well as on the 'surviving' Arp2/3 complexes that remain bound to mother filaments.

      We thus titrated phosphate and found that its impact on Arp2/3 complex stability is significant in the range of Pi concentration that is explored in cells. For the sake of completeness, and following a comment from reviewer #1, we now also mention the affinity of Pi for actin subunits in filaments in the Discussion, and discuss the impact of intracellular Pi on actin itself.

      Minor comments

      • In the introduction, while the structural and mutagenesis evidence is clearly stated, in other cases a bit more detail would be helpful e.g. 'biochemical studies', which referred to the measurement of hydrolysis rates using radiolabelling

      We have made changes to more precisely define which biochemical assays were used in previous studies.

      • Page 3 Figures shouldn't be referenced in the introduction

      We have removed the references to the figures from the introduction.

      • Page 3 slip bond behaviour needs explanation

      We now explain the concept where it is first used in the manuscript, as follows: "The debranching rate seems to increase exponentially with the applied pulling force, in the range of 0 to 6 pN (Fig. 1F; see more refined analysis below). This behaviour of accelerated debranching with the increase of the applied force is similar to the 'slip bond' concept, as predicted by the Bell-Evans model of the force-dependent lifetime of the interaction between two proteins".
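
      For readers unfamiliar with the slip bond picture, the Python sketch below evaluates the standard Bell form of a force-dependent dissociation rate, k(F) = k0 · exp(F · x / kBT). It is a generic illustration rather than the refined analysis of the manuscript. The zero-force rate `k0` and the distance to the transition state `distance_nm` used in the example are hypothetical values.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value
JOULE_PER_PN_NM = 1e-21           # 1 pN·nm expressed in joules


def thermal_energy_pn_nm(temperature_k=298.0):
    """Thermal energy kB*T in pN·nm."""
    return BOLTZMANN_J_PER_K * temperature_k / JOULE_PER_PN_NM


def slip_bond_rate(force_pn, k0, distance_nm, temperature_k=298.0):
    """Bell model: the dissociation rate grows exponentially with applied force."""
    return k0 * math.exp(force_pn * distance_nm / thermal_energy_pn_nm(temperature_k))


# Hypothetical parameters: k0 = 0.01 s^-1, distance to transition state = 1 nm.
for force in (0.0, 2.0, 4.0, 6.0):
    print(force, slip_bond_rate(force, k0=0.01, distance_nm=1.0))
```

      In this form, each additional piconewton multiplies the rate by the same factor, so the rate appears as a straight line on a semi-logarithmic plot against force.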

      • Figure 1B seems to be a theoretical schematic which is superfluous

      We suppose that the reviewer is actually referring to figure 3B of the initial manuscript, describing the energy potential of a molecular interaction as a function of the reaction coordinate. We agree with the reviewer that it is not absolutely required and we have removed it.

      • Figure 4D is helpful, different weight lines might help even more to explain the dominant pathways

      We have made modifications to the biochemical reaction scheme in this figure (now figure 5F in the revised version). We hope we succeeded in improving its readability. Since the different paths depend on mechano-chemical parameters, there is no real dominant pathway per se.

      **Referee cross-commenting**

      Rev1 sounds like the specialist here. I can't comment on their requests. Some similar points arise between the reviewers which need addressing.

      Reviewer #2 (Significance (Required)):

      Significance

      Taking a look at references 16 and 19, I do not find it clear what is achieved differently in the current work compared to these papers and what agrees and what disagrees. If it's a species difference I might expect the two species would be analysed side-by-side in this paper.

      We thank the reviewer for this important comment. The goal of our study was not to compare the behaviour of mammalian and yeast Arp2/3 complexes.

      We now try to better explain that the motivation of the present work is to address how the nucleotide state of the Arp2/3 complex tunes actin branch mechanosensitive stability, and regulates interactions with well known Arp2/3 complex binding proteins. Most of the reactions are quantified here for the first time. Moreover, the experiments with branch junctions in different nucleotide states are done under controlled mechanical conditions, providing the first direct measurements of the force-dependence of the debranching reactions. Our detailed kinetic analysis of the full reaction scheme allows us to model the different binding interfaces of the Arp2/3 complex.

      In addition, it is worth noting that:

      1. Species matter and this is why ref 16 and 19 can give the impression of disagreeing on the ability to renucleate branches thanks to the stability of surviving Arp2/3 complexes on mother filaments.
      2. In ref 16 (Pandit et al, PNAS 2020) species are mixed (yeast Arp2/3 and mammalian alpha actin from skeletal muscle), likely leading to a different behaviour compared to the all-mammalian protein situation we examine in our current work. In particular, with mixed species one misses the ability to renucleate, as shown in our previous study Ghasemi et al (ref 19). However, since mixing species does not correspond to anything physiological, we do not think it is worth repeating these conditions alongside our experiments.
      3. Further, the analysis carried out in ref 16 suffers from important limitations: the force was unknown (not calibrated) and the data was fitted by a model that compounded several reactions, providing only an indirect estimation of the rates, in particular at zero force. In contrast, we have worked with calibrated forces (including dedicated experiments at zero force) and we have carried out specific experiments to directly measure several rates.
      4. In ref 19 (our earlier work) we did not investigate the impact of the nucleotide state of the branch junction at all, and we did not systematically measure the dissociation rates as a function of force. Contrary to Pandit et al, we directly measure the difference in branch stability at zero force between ADP and ADP-Pi states and show that the ~ 30 fold difference holds true at all probed forces. Last, the force dependence of the branch renucleation success rate gives us crucial information on which of the two Arp2/3 complex interfaces ruptures first.

      I'm not understanding how the authors can distinguish effects of adding phosphate and BeFx on Arp 2 and 3 compared to effects on actin. Importantly, are possible accompanying changes in the actin filament a confounding factor?

      We have checked that the nucleotide state (ADP-BeFx and ADP-Pi versus ADP) of the mother and daughter filaments has no impact on branch stability:

      • In the experiments shown in figure 2F, where the buffer condition to which branches are exposed is quickly changed from phosphate buffer to buffer without phosphate, we observe a rapid change of branch stability. Actin subunits at the branch junction are in F-actin conformation according to recent cryoEM observations (ref. Chavali et al, Nat Comm. 2024; Liu et al, NSMB 2024). These actin subunits, initially in the ADP-Pi state, are expected to age and become ADP with a rate of ~ 0.007 s-1 (i.e. half-time of 100 s; ref. Jegou et al, PLoS Biology 2011, Oosterheert et al, NSMB 2023), a much lower rate than the observed change of the debranching rate (0.21 s-1). This means that the debranching rate is independent of the nucleotide state of daughter and mother filaments.

      • In new supp. Figure S4, we show that the debranching rate is similar for ADP-Arp2/3 complex branch junctions initiated from ADP- or ADP-BeFx-actin mother filaments.

      • In new supp. Figure S9, we initially exposed branch junctions to a BeFx solution then monitored debranching and branch renucleation in our standard buffer (i.e. without BeFx or Pi). We observed multiple rounds of branch renucleation, the first with ADP-BeFx-actin daughter filaments, and the following with daughter filaments never exposed to BeFx. They all had the same debranching rates and renucleation success rates.
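
      The timescale argument in the first point above can be checked with a few lines of Python. The two rates are the ones quoted there; treating both processes as simple first-order decays is an assumption made for this illustration.

```python
import math

ACTIN_PI_RELEASE_RATE = 0.007   # s^-1, ageing of ADP-Pi actin subunits (quoted above)
BRANCH_SWITCH_RATE = 0.21       # s^-1, observed change of the debranching rate


def half_time(rate_per_s):
    """Half-time of a first-order process, ln(2) / rate."""
    return math.log(2) / rate_per_s


actin_half_time = half_time(ACTIN_PI_RELEASE_RATE)
branch_half_time = half_time(BRANCH_SWITCH_RATE)

# Fraction of actin subunits still in the ADP-Pi state after one
# half-time of the branch stability switch.
actin_still_pi = math.exp(-ACTIN_PI_RELEASE_RATE * branch_half_time)

print(actin_half_time, branch_half_time, actin_still_pi)
```

      The actin half-time comes out close to 100 s, whereas the branch stability switch is 30 times faster, so nearly all actin subunits are still in the ADP-Pi state when the debranching rate has already changed.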

      The paper is quite specialist to read and the advance appears to be incremental. My expertise is in molecular pathways to actin regulation outside the main area of the paper.

      The results we present in this study are often unexpected, and some run counter to long-standing assumptions. The regulation of Arp2/3-nucleated branches is of importance for the stability and the force-generating capabilities of many actin networks in cells. Last, most of the measurements that we present had never been done, mainly because experiments are difficult to achieve, and require specific tools to monitor several events while controlling the applied force.

      We believe our results are of broad interest as they run counter to long-standing assumptions. We have rewritten the text in several instances to convey our message more clearly.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Please find enclosed the review of the manuscript "Inorganic phosphate in Arp2/3 complex acts as a rapid switch for the stability of actin filament branches" by Xiao et al.

      The authors provide a detailed investigation of how the nucleotide bound to the Arp2/3 complex affects branch stability under flow force. From a kinetic perspective, this is an elegant study with generally high-quality data, although some conclusions rest on assumptions rather than direct experimental evidence.

      We thank the reviewer for their positive feedback. We have improved our manuscript and performed important additional experiments to provide more direct experimental evidence of our conclusions.

      A key question concerns the physiological relevance of these findings. For instance, the concept of branch regrowth may not be applicable in cellular contexts, since forces by actin polymerization would displace existing branches away from sites where they generate these active forces. The authors should clarify the relevance of regrowth during active force generation by branched networks.

      We thank the reviewer for this comment. Our in vitro results indeed point to a previously unreported property of branched actin networks, i.e. the ability of Arp2/3 complexes to readily renucleate branches in the ADP-Pi state and that it does require reloading ATP within Arp2/3.

      Branched actin networks, especially the lamellipodia or endocytotic patches, do exert active force thanks to actin polymerization of the individual branches at the forefront. However, the whole actin network is exposed to stress, and the architecture of the network (inter-branch distance, crosslink between branches, ...) presumably strongly impacts its mechanical properties.

      In the case of other types of branched actin networks, such as the actin cortex, myosin motors put the whole network under tension. Such pulling forces on actin branches, depending on the amplitude of the pulling force, can lead to branch regrowth, and network self-repair.

      We have modified the text to make the physiological relevance clearer.

      Additionally, all experiments employ flow conditions that branches would probably not experience in cells; notably, the flow direction in the cellular context would be reversed. Altering the flow direction relative to the branches could affect not only the relationship between flow rate and branch stability, but potentially other system properties as well.

      We agree with the reviewer that in cells branches will not experience flow conditions similar to the ones we use in our in vitro assay. Nonetheless, in cells we expect mechanical stress on the branch junction to be applied in all directions. In lamellipodia, the compressive force applied at the leading edge is expected to result in diverse local orientations of the force on individual branch junctions within the network (as explained in Lappalainen et al. Nat Rev MBC 2022). Also, branch junctions are found in the cell cortex, where they are exposed to pulling forces resulting from the action of myosin motors and crosslinkers on mother and daughter filaments.

      The impact of the direction of the flow was addressed in our previous publication (Ghasemi et al, Sci. Adv. 2024, figure 2) and, to a lesser extent, by the lab of Enrique de la Cruz in Pandit et al, PNAS 2020 (ref. 16). We reported that flow direction has a minimal effect, if any, on branch dissociation rate and renucleation ratio.

      Reviewer #3 (Significance (Required)):

      Furthermore, the study appears not to account for the mother filament (particularly its nucleotide state) or the actin subunit bound to the Arp2/3 complex. The authors should discuss why their interpretation focuses exclusively on the Arp2/3 complex rather than on the actin filaments or Arp2/3-bound actin subunit.

      We have checked that the nucleotide state (ADP-BeFx and ADP-Pi versus ADP) of the mother and daughter filaments has no impact on branch stability:

      • In the experiments shown in figure 2F, where the buffer condition to which branches are exposed is quickly changed from phosphate buffer to buffer without phosphate, we observe a rapid change of branch stability. Actin subunits at the branch junction are in F-actin conformation according to recent cryoEM observations (ref. Chavali et al, Nat Comm. 2024; Liu et al, NSMB 2024). These actin subunits, initially in the ADP-Pi state, are expected to age and become ADP with a rate of ~ 0.007 s-1 (i.e. half-time of 100 s; ref. Jegou et al, PLoS Biology 2011, Oosterheert et al, NSMB 2023), a rate much lower than the observed change of the debranching rate (0.21 s-1). This means that the debranching rate is independent of the nucleotide state of daughter and mother filaments.

      • In new supp. Figure S4, we show that the debranching rate is similar for ADP-Arp2/3 complex branch junctions initiated from ADP- or ADP-BeFx-actin mother filaments.

      • In new supp. Figure S9, we initially exposed branch junctions to a BeFx solution then monitored debranching and branch renucleation in a regular buffer. We observed multiple rounds of branch renucleation, the first with ADP-BeFx-actin daughter filaments, and the following with daughter filaments never exposed to BeFx. They all had the same debranching rates and renucleation success rates.
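
      As a quick check of the arithmetic in the first point above, using only the two rates quoted there and assuming simple first-order (single-exponential) kinetics, the half-time and the ratio of the two rates work out as follows:

```latex
t_{1/2} = \frac{\ln 2}{k} = \frac{0.693}{0.007\ \mathrm{s^{-1}}} \approx 99\ \mathrm{s},
\qquad
\frac{0.21\ \mathrm{s^{-1}}}{0.007\ \mathrm{s^{-1}}} = 30
```

      Under that assumption, the change in branch stability is about 30 times faster than the expected aging of the actin subunits.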

      An important concern involves the use of KPi (inorganic phosphate). Based on our experience, KPi appears to have effects beyond simply impacting nucleotide state: actin filaments seem to assemble differently in the presence of KPi. The authors should exercise caution in their interpretation of KPi-based experiments.

      KPi concentrations of up to 50 mM did not slow the barbed end elongation rate in our experiments.

      Overall, while the technical quality and kinetic analyses are state-of-the-art, relating this work to physiological contexts remains challenging, and some conclusions appear overstated.

      We have made changes in the discussion to try to more clearly relate our in vitro observations and conclusions with the cellular context where branch renucleation could have a strong impact on the architecture and mechanics of actin networks.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Please find enclosed the review of the manuscript "Inorganic phosphate in Arp2/3 complex acts as a rapid switch for the stability of actin filament branches" by Xiao et al.

      The authors provide a detailed investigation of how the nucleotide bound to the Arp2/3 complex affects branch stability under flow force. From a kinetic perspective, this is an elegant study with generally high-quality data, although some conclusions rest on assumptions rather than direct experimental evidence.

      A key question concerns the physiological relevance of these findings. For instance, the concept of branch regrowth may not be applicable in cellular contexts, since forces generated by actin polymerization would displace existing branches away from the sites where they generate these active forces. The authors should clarify the relevance of regrowth during active force generation by branched networks.

      Additionally, all experiments employ flow conditions that branches would probably not experience in cells; notably, the flow direction in the cellular context would be reversed. Altering the flow direction relative to the branches could affect not only the relationship between flow rate and branch stability, but potentially other system properties as well.

      Significance

      Furthermore, the study appears not to account for the mother filament (particularly its nucleotide state) or the actin subunit bound to the Arp2/3 complex. The authors should discuss why their interpretation focuses exclusively on the Arp2/3 complex rather than on the actin filaments or Arp2/3-bound actin subunit.

      An important concern involves the use of KPi (inorganic phosphate). Based on our experience, KPi appears to have effects beyond simply impacting nucleotide state: actin filaments seem to assemble differently in the presence of KPi. The authors should exercise caution in their interpretation of KPi-based experiments.

      Overall, while the technical quality and kinetic analyses are state-of-the-art, relating this work to physiological contexts remains challenging, and some conclusions appear overstated.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Xiao et al examine the molecular events occurring when Arp2/3 complex-mediated actin filament branches are removed from mother actin filaments. They do this using a microfluidics assay with purified proteins combined with single filament TIRF imaging of branched actin filaments with distinct fluorescent labels. The contribution of different nucleotide states of the Arp2/3 complex is tested in conjunction with the force exerted on the branches and the involvement of the regulatory proteins GMF and cortactin. The data seem comprehensive and highly quantified in response to concentration, force, fraction of branches and survival times and branching rates. They find that ADP-BeFx and high phosphate concentrations (leading to the ADP-Pi state) lead to a slower debranching rate at a given level of applied force. The ability to rapidly switch the buffer gives powerful information about response times of debranching compared with other actin remodelling events. They use renucleation experiments to determine that the previous debranching event most often occurs at the Arp2/3 complex/daughter interface, showing that filaments will be ready to re-branch in the stable ADP-Pi bound state. GMF addition allows debranching of the ADP state to occur at a lower force. Cortactin acts similarly to the ADP-Pi state to increase branch stability.

      Specific comments

      The pulling force on the branches seems to arise from different flow rates in the microfluidics. Viscous drag is mentioned and I can see there is methylcellulose in the buffer. It would be helpful to have the explanation of the conversion between flow and force, even if it has been standard in previous work.

      Pg 5 - what was the motivation to titrate phosphate? It seems a stretch that intracellular Pi levels are tuning branching inside cells more than protein-mediated control (GMF or cortactin) - can the authors evidence this at all?

      Minor comments

      • In the introduction, while the structural and mutagenesis evidence is clearly stated, in other cases a bit more detail would be helpful, e.g. 'biochemical studies', which referred to measurement of hydrolysis rates using radiolabelling
      • Page 3 Figures shouldn't be referenced in the introduction
      • Page 3 slip bond behaviour needs explanation
      • Figure 1B seems to be a theoretical schematic which is superfluous
      • Figure 4D is helpful, different weight lines might help even more to explain the dominant pathways

      Referee cross-commenting

      Rev1 sounds like the specialist here. I can't comment on their requests. Some similar points arise between the reviewers which need addressing.

      Significance

      Taking a look at references 16 and 19, I do not find it clear what is achieved differently in the current work compared to these papers and what agrees and what disagrees. If it's a species difference I might expect the two species would be analysed side-by-side in this paper.

      I'm not understanding how the authors can distinguish effects of adding phosphate and BeFx on Arp 2 and 3 compared to effects on actin. Importantly, are possible accompanying changes in the actin filament a confounding factor?

      The paper is quite specialist to read and the advance appears to be incremental. My expertise is in molecular pathways to actin regulation outside the main area of the paper.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This study investigates the mechanochemistry of Arp2/3-mediated branched actin networks at the level of individual branch junctions under load. Using microfluidic single-filament/branch force assays (including constant-force flow and open-chamber imaging) the authors quantify debranching, re‑nucleation, and mother- vs daughter‑interface stability across nucleotide states of Arp2/3 (ADP-Pi, ADP, and an ADP-BeFx proxy for ADP-Pi). They further test the effects of two branch regulators (GMF and cortactin). Key findings include: (i) ADP-Pi and ADP complexes share similar force dependence but differ markedly (~20×) in intrinsic dissociation rate; (ii) phosphate turnover on the Arp2/3 complex is rapid; (iii) affinity for Pi drops when Arp2/3 loses its daughter filament; (iv) quantification from model fits uncovers large stability differences between daughter and mother interfaces of the Arp2/3 complex; (v) extraordinarily high stability of ADP-Pi-like Arp2/3 on the mother filament; and (vi) distinct effects of GMF and cortactin on force‑dependent stability. Overall, the work combines technically demanding measurements with mechanistic modeling to probe how nucleotide state and regulatory factors tune branch mechanics.

      Major comments:

      1. Low force kinetics and completeness of survival curves (Figure 1). "For all forces, the surviving curves exhibited a clear single exponential behavior...." While the data can be fitted to monoexponential decay curves, data at low forces is clearly incomplete. >90% of branches have not dissociated by the end of the experiment. For the particular data shown in 1C (F00nN, n=60 total branches) it means that the time information is coming from <6 observations, which is rather low for the single molecule field. I am slightly worried by this point, since the debranching rates under ADP-Pi conditions at zero force are slower still, by one order of magnitude. Yet, no raw data is shown. Given that the dissociation rate at low forces is a contentious point, the authors should show the raw data and the corresponding fits. At present, they only show an experimental scheme and images for this "open chamber" assay (Fig S2). Ideally, they would image for much longer than 900s with lower sampling time in those assays, to firmly establish that the 20-fold difference also holds at 0 force.

      Essential; experiment might already be performed. Otherwise straightforward to do (weeks time).

      2. Stability Analysis (Figure 4). I can follow much of the arguments presented in the stability analysis of the daughter vs mother interfaces, which is in principle extremely interesting! However, there are some concerns here:

      i) The authors emphasize the zero force ratio derived from fits (which is linked to the stability difference of the two interfaces in the absence of force) despite this being only weakly constrained by data. Intuitively in the model, the stability difference should grow to very large values as the re-nucleation ratio approaches 1 at low force. This combined with the noise in the data poses an issue in my opinion. Looking at the data and the error margin, I think that the authors cannot state with high confidence that there is a real difference between the relative stability of the daughter and mother interfaces between the two nucleotide states of the complex.

      Essential; analysis and textual revision only

      ii) For ADP-Pi, the renucleation ratio essentially remains flat over the measured force range. Hence, in my opinion, the data provide little leverage to estimate either the zero force ratio or, more importantly, the differential distance to the transition state in the slip-bond model, which will show in the crossover force. Consequently, the quoted ">100×" stability difference at F=0 and the crossover force >20pN are driven largely by extrapolation rather than direct constraint by data. Given the high number of free parameters in the model, I would anticipate that several crossover forces and differential distances might explain the data nearly equally well. Instead of loosely reporting exact numbers from fits, I would have hoped for some sort of sensitivity analysis, for instance relying on profile likelihoods. Also, parameter values could be reported as bounds (e.g. crossover force ≫ measured range) rather than precise point estimates. This issue re-occurs (albeit not as drastically) for the cortactin experiments (Figure 6).

      Essential; analysis and textual revision only

      iii) One important expectation from the "two slip bond" model is that branch dissociation rates should not necessarily scale mono-exponentially as they mostly do over the accessible force range of the paper. However, once the "minor" pathway of dissociation from the mother starts to dominate at high forces, rates become more force sensitive. This is nicely recaptured by the model fits in Figure S6 but deserves some explanation in the text. Otherwise, people will simply remember the "ADP-Pi is 20-fold more stable than ADP at all forces" message.

      Essential; textual revision only

      iv) One important prerequisite for the model is that isolated Arp2/3 complexes (without a daughter filament) should dissociate with equal rates from mother filaments at all flow rates. Since the Arp2/3 complex prefers mother filament curvature, forces experienced by the mother might change its off-rate. It would be good to refer to this assumption in the text and experimentally verify it. I could not find it in the paper nor in Ghasemi et al 2024.

      Essential; simple experiment (a week's time).

      v) The force dependence of the branch re-nucleation rate (Fig 3D) has been measured previously by the same group (Ghasemi et al). While the data in the older paper has not been fitted by a model, the trend of the data in the previous paper looks conspicuously different. Are there any explanations for this? I speculate that it might be related to actin and ATP not being saturated (low-force re-nucleation rate rarely exceeds 80%) in Ghasemi et al., but it would be good to know what the authors think about this.

      Essential; textual revision only

      3. Stability of the authentic ADP-Pi-Arp2/3 complex on the mother filament. The extraordinary stability of the isolated ADP-BeFx-Arp2/3 complex on mother filaments is surprising, especially considering that both ATP and ADP states are much more labile (Ghasemi et al 2024). I would recommend repeating this experiment in the authentic ADP-Pi state with labelled Arp2/3 complexes as a more direct readout, even if this would require working with very high phosphate concentrations.

      Essential; simple experiment (a week's time).

      OPTIONAL: Further, but beyond the scope of the present paper, would be titrating phosphate in these experiments, which would even allow the authors to independently verify the reduced Pi affinity for Arp2/3 in the mother filament. Of note, this affinity difference is needed to satisfy detailed balance in the reaction scheme (Fig 4 D)!

      4. Importance of "surviving" ADP-Pi-Arp2/3 complexes. The authors show a) rapid turnover of Pi on the ADP-Arp2/3 complex in both the branch- and mother filament-bound states and b) the lowered Pi affinity of the latter. Nonetheless, they emphasize the importance of long-lived "surviving" ADP-Pi bound complexes on the mother (even stated in the abstract). I understand that this fraction shows under some experimental conditions (BeFx), but unless I am missing something, most complexes should rapidly lose their phosphate and either exchange nucleotide or dissociate from the mother under physiological conditions. Please clarify or tone down.

      Essential; textual revision only

      5. GMF mechanism. The authors claim that GMF "...accelerates the departure of the surviving Arp2/3 complex from the mother...". I assume that they infer this from the decrease in the re-nucleation ratio. However, alternatively GMF could simply dwell on the complex, inhibiting re-nucleation without promoting dissociation from the mother. The authors should either monitor Arp2/3 dwell times directly to discriminate between these possibilities or be more cautious in their conclusions.

      Essential; simple experiment (a week's time) or textual revision.

      6. Cortactin mechanism and the "leash model". I must say that the cortactin data are the most puzzling part of the paper and hard to reconcile with what we know from structure. I was hoping to find some of this resolved in the discussion. However, I do not understand the "leash model" in the discussion section for cortactin-mediated branch stabilization: "This would explain the observed increase in branch survival compared to the absence of cortactin. As the pulling force is increased, this rebinding mechanism becomes less efficient." According to my understanding of the data, this is opposite to what happens. Cortactin only stabilizes the labile interface at elevated forces! Some re-writing might help here.

      Essential; textual revision.
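
      To make the expectation in point 2 iii) concrete, here is a minimal sketch of two competing slip bonds. It is not the authors' fitting procedure: it assumes each interface follows a Bell-type exponential force dependence, and every name and parameter value below (`k0_d`, `x_d`, `k0_m`, `x_m`) is hypothetical and chosen only for illustration.

```python
import math

KT = 4.11  # thermal energy in pN*nm near room temperature (assumed)

def bell_rate(k0, x, force):
    """Bell-type slip bond: the rate grows exponentially with force (pN)."""
    return k0 * math.exp(force * x / KT)

def two_pathway(force, k0_d, x_d, k0_m, x_m):
    """Total debranching rate and the fraction of events going through
    the daughter interface, for two competing slip bonds."""
    k_d = bell_rate(k0_d, x_d, force)  # rupture at the daughter interface
    k_m = bell_rate(k0_m, x_m, force)  # rupture at the mother interface
    total = k_d + k_m
    return total, k_d / total

def crossover_force(k0_d, x_d, k0_m, x_m):
    """Force at which both pathways are equally fast (requires x_m != x_d)."""
    return KT * math.log(k0_d / k0_m) / (x_m - x_d)

if __name__ == "__main__":
    params = dict(k0_d=1e-3, x_d=1.0, k0_m=1e-5, x_m=2.0)  # hypothetical
    for force in (0.0, 5.0, 10.0, 20.0, 30.0):
        total, frac = two_pathway(force, **params)
        print(f"F={force:5.1f} pN  k_total={total:.3e} 1/s  daughter fraction={frac:.3f}")
    print("crossover force (pN):", crossover_force(**params))
```

      With these illustrative numbers the total rate looks mono-exponential well below the crossover force and becomes more force sensitive above it, because the pathway with the larger distance to the transition state takes over.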

      Minor comments

      Organization:

      • I do not want to impose on how to best tell the story, but I felt that Fig1 A-D and Fig 2 A-B belong to one logical unit (nucleotide dependence), whereas Fig 1 E-F and Fig 2 C belong to the other (Pi binding and exchange). Perhaps consider re-organizing to streamline presentation?

      Semantics/Typos:

      • Abstract: "... ADP-Pi and ADP-Arp2/3 detach with the same exponential increase as a function of force...". Increase should refer to the dissociation rate, which should be added to the sentence.
      • Results page 8: "...and the majority of Arp2/3 complexes detach from the mother filament while remaining bound to the branch at the debranching time." "Branch" should likely be daughter here, as there is no branch after dissociation of either interface.
      • Results page 13: "Exposing ADP-BeFx-Arp2/3 complex branch junctions to a saturating amount of GMF...". It is strange to imply saturation, because GMF likely simply does not bind to the complex in this nucleotide state with appreciable affinity. Suggest to change to "high".
      • Discussion page 18: "Moreover, in mammalian Arp2/3, His80 in Arp3 (corresponding to His73 in mammalian actin) is not methylated, and corresponds to residue N77 in Arp3, which is also not modified." N77 likely belongs to Arp2?
      • Discussion page 19: "We showed that Pi affinity for Arp2/3 complexes at branch junctions is around 3.7 mM (Fig. 1), a value which lies within the reported 1-10 mM Pi concentration measured in the cytosol in different mammalian cell types". Notably, this is not too different from F-actin, which should be mentioned. By this measure alone, free inorganic phosphate could also directly regulate actin filament stability!

      Future interest (non essential):

      • It would be utterly exciting (but beyond current scope) to quantify how instantaneous debranching rates evolve for naturally aging branches starting from ATP-Arp2/3 complexes!

      Significance

      General assessment:

      This is a compelling and carefully executed study that delivers a clear mechanistic framework for how Arp2/3 branch junctions fail and re‑form under load. The central strength is the tight integration of state‑of‑the‑art reconstitutions with careful and original kinetic analysis. The experimental design is elegant and experiments have been carried out to a masterful standard. The figures are clear, the statistics are appropriate with some exceptions as detailed above. There are very few labs in the world that could have achieved this feat!

      A few aspects could be further strengthened, most notably the explanation and application of the "two slip bond" model as well as slightly more restraint in speculating around specific regulatory mechanisms. However, these are minor refinements that do not detract from the important contributions of the paper.

      Overall, the work clearly merits publication with high priority after revision; most requested changes are textual/analytical with very few targeted experiments, which would substantially strengthen core claims.

      Advance relative to prior literature:

      The major novel findings of the paper are already summarized above. There is some recent work done on the subject of branch mechanics by the authors (Ghasemi et al 2024, PMID: 38277459) and others (Pandit et al 2020 PMID: 32461373), but the focus of the present work is clearly unique and there is plenty of novel insight.

      Audience and impact:

      Primary audience: specialists in cytoskeleton dynamics, in vitro reconstitution single molecule biophysics, and mechanobiochemistry. Secondary: researchers in cell motility, morphogenesis and mechanobiology, physicists working on active matter and modelers studying force producing and load-bearing biopolymer networks. The results and analysis framework should inform quantitative models of branched network turnover under load and the interpretation of regulatory factor action in vivo and in cells.

      Reviewer expertise:

      Actin dynamics; biochemical reconstitution; single molecule approaches; biophysics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Background and unknown in the field:

      This study investigates how fibroblast alignment influences the migration of intestinal epithelial cells, contributing to tissue integrity and repair. It is well established that intestinal fibroblasts are important regulators in the tissue through their ability to secrete essential paracrine factors for the epithelium. However, it is less well understood if they also play additional structural, tissue architecture instructing role and how the communication between the fibroblasts and the epithelia is regulated.

      Advance over state of the art:

      Here the authors have set up an elegant three-component system to investigate this. They have gone beyond the recent advances of culturing intestinal and colonic organoids in 2D (in a manner that preserves crypt- and villus-like organization) and bioengineered an epithelial-stromal model comprising organoid-derived intestinal epithelial cells (IECs), primary intestinal fibroblasts, and a basement membrane matrix. Using this model, they have uncovered that fibroblasts enhance the directed and persistent migration of IECs. They used scRNAseq to carefully analyse the stromal cell populations present in their co-cultures of primary mouse intestinal subepithelial fibroblasts and organoid-derived intestinal mouse epithelial cells. They observed that this reflected well the stromal cell-type composition as well as the paracrine activity previously reported for these cells in tissue. Using a clever system with Matrigel and an elastomeric barrier, the authors were able to induce non-epithelial gaps in different scenarios (IECs alone or with fibroblasts or with conditioned media) and observe the wound-closure as well as the presence of specific cell types. They observed that the epithelial monolayers showed significant gap closure when in direct contact with fibroblasts compared to controls. Interestingly, the enhanced efficiency of epithelial migration and gap closure, in the presence of fibroblasts, was independent of PGE2-EP4 signaling and was not due to differences in cell proliferation. Instead, the imaging revealed that the fibroblasts were in direct contact with the epithelium. The authors observed that in the absence of fibroblasts the migration properties of cells in the villus and the crypt regions were dramatically different and the fibroblast presence was necessary to efficiently synchronize these to support gap closure. In addition, the presence of fibroblasts enhanced the directionality of the epithelial cell migration. Detailed imaging and image analyses revealed that gap closure involved activation of the fibroblasts and co-ordinated coalignment of IECs and fibroblasts. They also explored matrix deposition of the fibroblasts during the process and found that they deposited aligned ECM fibers that guide epithelial migration. Mere cell-derived matrix (devoid of live fibroblasts) was able to partially recapitulate the fibroblast-coordinated epithelial migration, indicating that the fibroblast-generated matrix and its alignment are key contributors to the phenotype.

      Comments:

      This is overall a very interesting and well-written study. The imaging and the image analysis are state-of-the-art and the bioengineered model is an exciting advancement over current methods developed by these researchers and others. This study meets all the criteria for publication in the sense that all the experiments seem to be carefully conducted, with appropriate controls and sufficient quantifications and statistics. The claims made by the authors are supported by the data. This is currently suitable to be published as a method/protocol and as a descriptive study uncovering interesting cross-talk and co-dependencies of epithelial and stromal cells during injury repair. There are of course aspects that could improve the study further, like more mechanistic insight into the underpinnings of the direct epithelia-fibroblast interaction and its involvement in the directed IEC migration. However, these may be topics to investigate in a future study.

      Reviewer #1 (Significance (Required)):

      The strengths of the study are the highly in vivo relevant model system that is amenable to imaging and detailed image analysis of distinct cell populations. This may be adapted by others in the field and has the potential to transform the way cell dynamics in the intestinal epithelium are visualized and investigated in vitro.

      We thank the reviewer for their thoughtful and positive assessment of our work, and their recognition of the relevance of the bioengineered epithelial-stromal model and its potential for quantitative imaging and analysis of epithelial and fibroblast dynamics.

      We agree that further mechanistic insight into epithelial-fibroblast crosstalk would strengthen the study. While the current manuscript establishes this tractable system and identifies a role for fibroblast organization and matrix alignment in coordinating epithelial migration, we also aim to deepen the mechanistic understanding in the revision. As outlined in our response to Reviewer 2, we will perform additional experiments to further investigate the epithelial-fibroblast crosstalk and force-dependent interactions underlying this process.

      We believe that these additions will complement the current findings and strengthen the conceptual contribution of the study beyond its methodological advances.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Please find enclosed my review comments on the manuscript entitled "Fibroblast alignment coordinates epithelial migration and maintains intestinal tissue integrity" by Jordi Comelles et al.

      In this manuscript, the authors use a bioengineered epithelial-stromal system composed of organoid-derived intestinal epithelial cells, primary intestinal fibroblasts, and a basement membrane matrix to show that direct physical interactions between fibroblasts and epithelial cells drive a large-scale organization of the fibroblast network. This spatial reorganization, in turn, promotes persistent and oriented migration of epithelial cells, ultimately enabling restoration of the intestinal epithelium in an in vitro gap-closure assay. Overall, while the authors use an elegant in vitro model to study intestinal wound closure, and more specifically the role of fibroblasts in this context, I find this manuscript not suitable for publication in its present form. The data are overinterpreted, the novelty is limited, and the molecular mechanisms underlying WAE-fibroblast interactions are insufficiently addressed.

      We thank the reviewer for their contribution to the revision process with their valuable assessments. We will address their specific points below.

      Figure 1 - What are the units of the "fraction gap closure" shown in panels d and e? Is it expressed as a percentage?

      We thank the reviewer for pointing this out. The "fraction of gap closed" was calculated as (A(t = 0h)-A(t))/A(t = 0h), where A(t = 0h) corresponds to the initial gap area and A(t) is the area of the gap measured at time point t. With this definition, the fraction of gap closed is dimensionless: it is 0 at the initial time point, reaches 1 if the gap is fully closed, and takes negative values if the gap area increases beyond the initial size, as observed in some replicates of the control condition. To avoid misinterpretation, we will express this quantity as a percentage (i.e., multiplied by 100), as suggested by the reviewer. Moreover, we realized it was ill defined in the methods section. This will be corrected as well in the revised version.
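
      For illustration only, the definition above can be written as a short function. This is a sketch of the stated formula, not the analysis code used in the study; the function and argument names are hypothetical.

```python
def fraction_gap_closed(initial_area, area_t):
    """Return (A(0) - A(t)) / A(0); dimensionless.

    0 at the initial time point, 1 for a fully closed gap, and
    negative if the gap has grown beyond its initial size.
    """
    if initial_area <= 0:
        raise ValueError("initial gap area must be positive")
    return (initial_area - area_t) / initial_area

def percent_gap_closed(initial_area, area_t):
    """Same quantity expressed as a percentage."""
    return 100.0 * fraction_gap_closed(initial_area, area_t)

if __name__ == "__main__":
    # Hypothetical areas in arbitrary units
    print(fraction_gap_closed(200.0, 50.0))   # gap partly closed
    print(fraction_gap_closed(200.0, 250.0))  # gap widened
```

      Both areas only need to be in the same units, since the ratio is dimensionless.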

      "Actually, epithelial monolayers achieved the most effective gap closure when cultured in direct physical contact with fibroblasts (Figure 1e and Movies 2 and 3)." From the data shown in panels c, d, and e, it appears that fibroblast-conditioned medium alone promotes efficient gap closure, comparable to the + fibroblast condition.

      We agree with the reviewer that the original closing sentence overstated the effect. While both fibroblast-conditioned medium and direct fibroblast contact promote efficient gap closure compared to control conditions, the data do not support a consistent difference between these two conditions. We will therefore remove this statement in the revised version to more accurately reflect the results.

      Figure 2 - The use of a cell proliferation inhibitor during the gap-closure assay would help determine the contribution of cell proliferation at the migration front.

      We agree with the reviewer that inhibiting proliferation would help assess the contribution of cell proliferation to gap closure. However, in the 2D gap-closure assay, our Ki67 immunostaining showed no significant differences in the proportion of proliferative cells between conditions, either within the monolayer or at the migration front. This suggests that differential proliferation is unlikely to account for the differences in gap closure observed between control and fibroblast-containing conditions.

      We note that, in a separate 3D organoid assay, fibroblast-derived signals induced a WAE-like transcriptional program associated with reduced Ki67 mRNA expression, indicating that fibroblasts can promote a more migratory epithelial state without increasing proliferation. Thus, while proliferation may contribute to epithelial homeostasis and repair, our data do not point to it as the main determinant of the differences observed in the 2D gap-closure phenotypes.

      In addition, pharmacological inhibition of proliferation would likely perturb the homeostasis of the organoid-derived epithelial monolayers, in which proliferative crypt compartments are essential, and would be difficult to restrict to epithelial cells without also affecting fibroblasts in co-culture. For these reasons, although such experiments could inform the general contribution of proliferation to gap closure, we do not think they would directly clarify the differences observed between conditions in our system.

      Figure 2f and 2g - Has a dose-dependent effect of PGE2 been tested?

      We thank the reviewer for pointing this out. We did not perform a dose-response analysis of PGE2 in this study, as our aim was to assess the involvement of the PGE2-EP4 axis rather than to characterize its quantitative dynamics. We therefore selected a concentration based on previous work demonstrating dose-dependent induction of the WAE program in 3D organoid systems (Miyoshi et al., 2017). In that study, 1 µM PGE2 was sufficient to induce a significant increase in the WAE marker Cldn4, and we used this concentration as a biologically relevant reference condition. We will clarify this in the methods section.

      Figure 2i - The + fibroblast + EP4i condition (pink) is missing.

      We thank the reviewer for pointing this out. The + fibroblast + EP4i condition is present in the plot but not visually distinguishable because it overlaps with the + fibroblast condition and is therefore masked by it. As shown in Figure S4e, the + fibroblast + EP4i condition falls within the variability range of the + fibroblast condition. To improve clarity, we will revise the figure to ensure that this condition is visually identifiable.

      "This suggests a mechanical or contact-mediated role for fibroblasts in preserving epithelial integrity and promoting coordinated migration beyond their paracrine signaling." While PGE2-EP4 signaling does not appear to be involved in the fibroblast-mediated enhancement of gap-closure efficiency, the conclusion that physical interactions are more important than paracrine effects is overstated. For instance, an experimental condition in which fibroblast-conditioned medium is inactivated (boiling for 5 minutes) would strengthen this conclusion. In addition, inhibition of actomyosin contractility in fibroblasts would be informative.

      Figure 3 - The data presented here do not convincingly support the dismissal of conditioned medium as a contributing factor. The differences between the + fibroblast-conditioned medium and + fibroblast conditions are modest. In both cases, epithelial cells migrate and gaps close.

      We agree with the reviewer that inhibition of actomyosin contractility in fibroblasts would provide valuable insight into the role of force-dependent interactions in epithelial-stromal coupling. However, pharmacological inhibitors of the Rho-ROCK-myosin pathway (e.g., blebbistatin, ML-7, or the ROCK inhibitor Y-27632) would also affect epithelial contractility in our co-culture system, making it difficult to specifically attribute any observed effects to fibroblast mechanics.

      We also agree that paracrine signaling plays an important role in epithelial gap closure. Indeed, supplementation of control media with PGE2 improves gap closure compared to control conditions, although it does not reach the levels observed with fibroblast-conditioned medium, suggesting that additional soluble factors contribute beyond the PGE2-EP4 axis. However, time-lapse imaging revealed direct and dynamic interactions between fibroblasts and epithelial cells (Movie 6; Figure S5a-d; Movie 7), which prompted us to further investigate the contribution of physical interactions, as addressed in Figure 3.

      In Figure 3, we analyzed migration at the single-cell level, in contrast to the tissue-level measurements used for gap closure quantification. In organoid-derived intestinal monolayers, two distinct compartments can be identified: crypt-like and villus-like regions. In vivo, these compartments exhibit different migration behaviors: cells in the crypt are primarily displaced due to crowding, whereas cells in the villus actively migrate, as suggested by the presence of cryptic lamellipodia (Krndija et al., 2019). Consistent with this, tracking individual cells revealed that crypt cells are largely static, while villus cells migrate toward the gap. This compartmentalized behavior was observed in both control and fibroblast-conditioned medium conditions. Strikingly, in the presence of fibroblasts, this differential behavior was reduced, resulting in coordinated migration of both crypt and villus regions.

      This mismatch between compartments in control conditions may contribute to the appearance of discontinuities ("holes") within the epithelial layer during migration. In control experiments, these defects failed to close, whereas in conditioned medium they closed slowly or incompletely. In contrast, in the presence of fibroblasts, these disruptions were rapidly and efficiently resolved, indicating improved tissue integrity.

      Additionally, analysis of individual trajectories near the migration front showed that cells exhibit significantly increased directional persistence (i.e., movement aligned with the direction of gap closure) in the presence of fibroblasts compared to conditioned medium alone.
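To make the notion of directional persistence concrete, the following is a minimal sketch of one way such a score could be computed from a tracked trajectory: the mean cosine of the angle between each step and the direction of gap closure. This is an illustrative assumption, not necessarily the metric used in the manuscript; the function name `directional_persistence`, the `closure_direction` parameter, and the example tracks are hypothetical.

```python
import math

def directional_persistence(track, closure_direction=(1.0, 0.0)):
    """Mean cosine of the angle between each step and the closure direction.

    track: list of (x, y) positions at consecutive time points.
    closure_direction: vector pointing toward the gap (assumed known).
    Returns a value in [-1, 1]; 1 means every step points toward the gap.
    """
    ux, uy = closure_direction
    norm = math.hypot(ux, uy)
    ux, uy = ux / norm, uy / norm
    cosines = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        step = math.hypot(dx, dy)
        if step == 0:
            continue  # stationary frames carry no direction
        cosines.append((dx * ux + dy * uy) / step)
    return sum(cosines) / len(cosines) if cosines else float("nan")

# Hypothetical tracks: one moving straight toward the gap, one zigzagging.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]
print(directional_persistence(straight))
print(directional_persistence(zigzag))
```

A per-step cosine is used here rather than a net-displacement ratio so that the score reflects alignment with the gap specifically, not just straightness of the path.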

      Taken together, while paracrine signaling from fibroblasts contributes to epithelial migration and gap closure, the physical presence of fibroblasts induces qualitative changes in epithelial behavior, including coordinated migration across compartments, improved hole closure, and enhanced directional persistence.

      Figure 4a - "Upon removal of the barrier (t = 0 h), fibroblasts at the epithelial front were small and evenly distributed, with no prominent α-SMA fibers present." Here, fibroblasts are α-SMA positive but not elongated. α-SMA may therefore not be the most appropriate marker. What are the levels of phosphorylated MLC2? These may increase during wound closure. Also, fibroblasts culture promotes aSMA expression, therefore, it may be possible that the fibroblasts used in this assay may not represent the healthy fibroblasts found in vivo.

      We agree with the reviewer that fibroblasts are α-SMA positive at early time points but are not yet elongated. In our system, we observe that α-SMA is already present at t = 0 h, while fibroblasts progressively elongate and reorganize α-SMA into prominent fiber structures over time. This suggests that changes in α-SMA organization, rather than its initial presence, are associated with fibroblast activation during gap closure.

      We note that baseline α-SMA expression may be influenced by in vitro culture conditions prior to the assay, which could differ from the state of fibroblasts in vivo. We will clarify this point in the Discussion to better contextualize our observations relative to native fibroblast populations.

      In addition, we agree that assessing phosphorylated myosin light chain 2 (pMLC2) levels would provide complementary information on contractile activity. We will therefore perform pMLC2 staining, as suggested, to further evaluate force generation by fibroblasts during the wound closure process.

      Figure 5 - Fibroblast alignment could also result from paracrine signals secreted by epithelial cells. This possibility should be tested.

      We thank the reviewer for this suggestion. To test whether fibroblast alignment could be driven by epithelial-derived paracrine signals, we will culture fibroblasts in conditioned medium collected from epithelial monolayers undergoing gap closure (control condition without fibroblasts) and quantify their alignment over time. This will be compared to fibroblasts maintained in standard fibroblast medium.

      This experiment will directly assess whether epithelial-derived soluble factors are sufficient to induce fibroblast alignment, or whether direct physical interactions are required.

      In summary, this manuscript demonstrates that epithelial cells migrate more efficiently on extracellular matrix proteins deposited and oriented by fibroblasts. This concept is not novel. Identifying the molecular mechanisms governing interactions between WAE and subepithelial fibroblasts would significantly enhance the novelty and impact of this study.

      Reviewer #2 (Significance (Required)):

      In this manuscript, the authors use a bioengineered epithelial-stromal system composed of organoid-derived intestinal epithelial cells, primary intestinal fibroblasts, and a basement membrane matrix to show that direct physical interactions between fibroblasts and epithelial cells drive a large-scale organization of the fibroblast network. This spatial reorganization, in turn, promotes persistent and oriented migration of epithelial cells, ultimately enabling restoration of the intestinal epithelium in an in vitro gap-closure assay. Overall, while the authors use an elegant in vitro model to study intestinal wound closure, and more specifically the role of fibroblasts in this context, I find this manuscript not suitable for publication in its present form. The data are overinterpreted, the novelty is limited, and the molecular mechanisms underlying WAE-fibroblast interactions are insufficiently addressed.

      We thank the reviewer for this thorough and critical assessment. We have clarified the overstatements in the rebuttal and we will modify the text to address concerns regarding overinterpretation and clearly acknowledge the limitations of our approach. In particular, we will refine the framing of the study to better distinguish between the contributions of paracrine signaling and physical epithelial-stromal interactions.

      To address the reviewer's concerns regarding mechanism and novelty, we will perform additional experiments aimed at further characterizing epithelial-stromal cross-talk, and experiments to assess fibroblast contractility and its contribution to epithelial coordination.

      We believe that these revisions and proposed experiments will strengthen the manuscript and clarify its conceptual contribution.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      - Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The study by Comelles et al. focuses on how primary intestinal fibroblasts contribute to organoid-derived intestinal epithelial migration in wound healing assays. Using fibroblast-epithelial co-cultures in a 2D in vitro gap closure system, the authors found that direct interaction with fibroblasts drives cohesive and directed migration of intestinal epithelia toward the gap. They further propose that long-range fibroblast alignment promotes the deposition of extracellular matrix (ECM) proteins in an oriented fashion, contributing to directed epithelial migration.

      Major comments:

      - Are the key conclusions convincing?

      Some of the key conclusions of this manuscript are not entirely convincing given the available data. The manuscript would benefit from additional evidence and/or clarifications to support their conclusions. See comments below.

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      (Fig 4a) The authors claim that fibroblasts become activated during gap closure as evidenced by the enhanced assembly of a-SMA fibers 24 hours following barrier removal. Yet, long a-SMA fibers are also observed when fibroblasts are cultured in the absence of epithelial cells or barrier removal (Fig. S1b). To support this conclusion, the authors should consider including additional controls to account for potential time-dependent assembly of a-SMA fibers (e.g., fibroblast-only control).

      We thank the reviewer for pointing this out. We agree that a fibroblast-only control would be important to account for potential time-dependent assembly of α-SMA fibers. We will therefore perform additional experiments monitoring α-SMA organization in fibroblasts cultured alone over time, which will allow us to better interpret the dynamics observed in the co-culture conditions.

      (Fig. 5a) The authors conclude that fibroblasts align parallel to the direction of epithelial migration during gap closure. While quantifications are convincing, again, a fibroblast-only control accounting for time-dependent spreading and elongation (as seen in Fig. S1) is missing. Including such a control would strengthen their claim that alignment is specific to the gap closure context rather than a time-dependent phenotype.

      We agree with the reviewer that, given the intrinsic ability of fibroblasts to form ordered domains with long-range alignment, this control would be highly informative. We will therefore quantify fibroblast alignment over time in fibroblast-only cultures, which will allow us to determine to what extent the long-range organization observed in co-culture is specific to the gap closure context.
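As an illustration of how long-range alignment in fibroblast-only and co-culture conditions could be compared on the same scale, the sketch below computes a 2D nematic order parameter from per-cell orientation angles. It is a generic approach offered under stated assumptions, not the analysis pipeline of the manuscript; the function name `nematic_order` and the example angles are hypothetical.

```python
import math

def nematic_order(angles_deg):
    """Return (S, director_deg) for a list of cell orientation angles in degrees.

    Orientations are axial (0 and 180 degrees are equivalent), so angles are
    doubled before averaging. S is 1 for perfect alignment and near 0 for
    isotropic orientations; director_deg is the mean axis in [0, 180).
    """
    if not angles_deg:
        raise ValueError("need at least one angle")
    n = len(angles_deg)
    c = sum(math.cos(2 * math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(2 * math.radians(a)) for a in angles_deg) / n
    order = math.hypot(c, s)
    director = (0.5 * math.degrees(math.atan2(s, c))) % 180.0
    return order, director

# Hypothetical examples: aligned cells versus uniformly spread orientations.
print(nematic_order([10, 190, 10]))
print(nematic_order([0, 45, 90, 135]))
```

Tracking S over time in both conditions would indicate whether alignment emerges in fibroblasts alone, and comparing the director with the closure direction would indicate whether it is oriented toward the gap.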

      (Fig 6) The authors claim that fibroblast-derived aligned ECM drives directional epithelial migration. While fibronectin fibers appear scarce and weakly aligned with the direction of migration, laminin and type IV collagen fibers are barely detectable (Fig. 6f). This may reflect a defect in ECM deposition rather than fiber alignment, which contrasts with Fig. S1, where fibroblasts are shown to deposit and assemble laminin and type IV collagen fibers. One possible explanation is that primary fibroblasts were not cultured long enough to allow robust ECM deposition. Alternatively, the observed effect may be specific to fibronectin, which is consistent with fibroblasts being its major source. The authors should revise their interpretation or provide additional evidence to support their current claim.

      We thank the reviewer for this important point. We agree that differences in ECM signal within the gap may reflect not only fiber alignment but also differences in the amount of protein deposited. In the +fibroblast condition, fibroblasts in the gap have more time to secrete ECM compared to the "empty gap" condition, where fibroblasts remain confined beneath the epithelium.

      In addition, the presence of Matrigel likely masks the contribution of certain ECM components, making laminin or type IV collagen more apparent than fibronectin. We will therefore revise the interpretation of these results to explicitly acknowledge the contribution of ECM abundance in addition to alignment.

      (Fig 6i) The authors propose that the presence of ECM alone within the gap enhances epithelial gap closure compared to empty gap conditions, although gap closure remains less effective than in the presence of primary fibroblasts. From the figure legend and methods, it seems that the decellularized ECM condition is generated using NIH-3T3 fibroblasts cultured for 8 days, whereas the other conditions used primary fibroblasts cultured for 1 day (Fig. 6a-h). This comparison is confounded by differences in cell source and ECM deposition time. If I am misunderstanding this, please clarify, otherwise consider repeating the decellularized ECM condition using primary fibroblasts and matching culture times for a fair comparison. Along these lines, please include images showing that ECM fibers remain intact following decellularization.

      We thank the reviewer for this suggestion. We will include additional staining to confirm that ECM fibers remain intact after decellularization in the revised version.

      Regarding the use of NIH-3T3 fibroblasts for CDM generation, this choice was made to minimize potential residual paracrine signaling from primary intestinal fibroblasts after decellularization. We acknowledge that this introduces differences in cell source.

      Concerning culture time, we followed established protocols for CDM formation, which recommend extended culture periods (≥8 days) to allow robust ECM deposition (Cukierman et al., 2001; Franco-Barraza et al., 2016; Godeau et al., 2020). We will clarify these points in the revised manuscript and discuss the limitations associated with these differences.

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Yes. The additional experiments outlined above would help support the current conclusions of the manuscript, rather than to explore new directions beyond its scope.

      - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes, the additional experiments primarily involve the inclusion of controls and additional immunofluorescence imaging to their existing experimental setups. They should be relatively straightforward to implement (~2-3 months).

      - Are the data and the methods presented in such a way that they can be reproduced?

      Yes.

      - Are the experiments adequately replicated and statistical analysis adequate?

      Overall, yes. But some plot legends should specify the number of replicates analyzed (e.g. Fig. 2b, Fig. 2d, Fig. 3h).

      We will review and correct these issues.

      Minor comments:

      - Specific experimental issues that are easily addressable.

      (Fig. 1c-e) The authors state that intestinal epithelial monolayers exhibit the most effective gap closure when in direct contact with fibroblasts. However, fibroblast-conditioned media and co-cultures show comparable gap closure efficiencies (Fig. 1e). The authors should consider revising this interpretation based on the provided data.

      We thank the reviewer for pointing this out, which was also raised by Reviewer 2. As discussed above, we agree that the original statement overstated the effect. Both fibroblast-conditioned medium and direct fibroblast contact promote efficient gap closure compared to control conditions, and we will revise the text accordingly to reflect that no consistent quantitative difference is observed between these two conditions.

      (Fig. 3b) The authors suggest that crypt-like epithelial cells undergo migration when grown on fibroblasts, but not in conditioned media alone. This is interesting, but it is not clear how they identify crypt-like cells for tracking. The authors should clarify if crypt-like cells are defined based on markers or inferred from their morphology.

      We thank the reviewer for this comment. In these tracking analyses, crypt-like cells were identified based on morphology. As shown in Figure S3 and in Larrañaga et al., 2025, crypt-like cells, defined by specific molecular markers, are significantly smaller than villus-like cells and form high-density regions. These features allow their identification based on morphology in fluorescently labeled monolayers. We will clarify this criterion in the Methods section of the revised manuscript.

      (Fig 3f-h) The authors conclude that fibroblasts promote directed epithelial cell motility based on cell trajectory analysis. Although they state that this analysis is performed on epithelial monolayers, their tdTomato epithelial population appears sparse in some conditions (control and conditioned media; Fig. S6a). Such variability in cell density may bias measurements of migration directionality at the cell-level, unless a mixed population is being used for tracking. The authors should clarify whether this analysis was indeed conducted on confluent monolayers.

      We thank the reviewer for this comment. For trajectory analysis, we used a mixed population of tdTomato-positive and non-fluorescent epithelial cells in some experiments to facilitate individual cell tracking. Importantly, epithelial monolayers were confluent in all conditions analyzed. We will clarify this in the Methods section.

      (Fig 6b) Their gap closure experimental setup indicates that fibroblasts are cultured on a Matrigel-coated surface, which should already contain abundant laminin and type IV collagen. Thus, it is unclear why type IV collagen is not detected underneath fibroblasts. The authors should explain why this is the case for clarity.

      We thank the reviewer for pointing out this observation. Indeed, fibroblasts are cultured on a Matrigel-coated surface, which contains laminin and type IV collagen among many other components. We observed thick collagen-rich structures between the fibroblasts and the epithelium, which we attributed not only to collagen secreted by the fibroblasts but also to a rearrangement of the collagen already present in the coated surface. We will clarify this in the Discussion of the revised version.

      - Are prior studies referenced appropriately?

      Yes

      - Are the text and figures clear and accurate?

      Mostly. Figures 6d and 6g seem to be duplicated by mistake.

      We thank the reviewer for noting this. We will correct this mistake.

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      There are some missing frames in Movie 2. If they are not available, it's okay to include black frames, so that the sequence remains consistent with the timestamps.

      The authors may consider using asterisks as significance indicators instead of reporting precise p-values directly on their plots. Having this format would facilitate visual comparison of statistical significance across conditions.

      Displaying single channels of experiments where co-cultures are used would help to better interpret their data.

      We thank the reviewer for pointing out these issues and for their valuable suggestions. We will correct the errors in the movie and improve the presentation as suggested where possible.

      Reviewer #3 (Significance (Required)):

      - Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This study provides a valuable contribution to understanding how fibroblasts influence intestinal epithelial migration. The main advance lies in the use of a co-culture system combining organoid-derived intestinal epithelial cells that assemble into a crypt-villus organization with primary intestinal fibroblasts in a 2D gap closure system. This approach allows the authors to examine epithelial-fibroblast interactions in a more physiologically relevant context compared to prior work.

      We thank the reviewer for their positive assessment of the significance of our work.

      - Place the work in the context of the existing literature (provide references, where appropriate).

      Addressed above.

      - State what audience might be interested in and influenced by the reported findings.

      Cell and developmental biology, extracellular matrix biology, tissue regeneration.

      - Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Tissue morphogenesis, cell motility, extracellular matrix dynamics.

      We thank the reviewer for their positive assessment and for their suggestions to improve the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      • Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The study by Comelles et al. focuses on how primary intestinal fibroblasts contribute to organoid-derived intestinal epithelial migration in wound healing assays. Using fibroblast-epithelial co-cultures in a 2D in vitro gap closure system, the authors found that direct interaction with fibroblasts drives cohesive and directed migration of intestinal epithelia toward the gap. They further propose that long-range fibroblast alignment promotes the deposition of extracellular matrix (ECM) proteins in an oriented fashion, contributing to directed epithelial migration.

      Major comments:

      • Are the key conclusions convincing?

      Some of the key conclusions of this manuscript are not entirely convincing given the available data. The manuscript would benefit from additional evidence and/or clarifications to support their conclusions. See comments below.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      (Fig 4a) The authors claim that fibroblasts become activated during gap closure as evidenced by the enhanced assembly of a-SMA fibers 24 hours following barrier removal. Yet, long a-SMA fibers are also observed when fibroblasts are cultured in the absence of epithelial cells or barrier removal (Fig. S1b). To support this conclusion, the authors should consider including additional controls to account for potential time-dependent assembly of a-SMA fibers (e.g., fibroblast-only control).

      (Fig. 5a) The authors conclude that fibroblasts align parallel to the direction of epithelial migration during gap closure. While quantifications are convincing, again, a fibroblast-only control accounting for time-dependent spreading and elongation (as seen in Fig. S1) is missing. Including such a control would strengthen their claim that alignment is specific to the gap closure context rather than a time-dependent phenotype.

      (Fig 6) The authors claim that fibroblast-derived aligned ECM drives directional epithelial migration. While fibronectin fibers appear scarce and weakly aligned with the direction of migration, laminin and type IV collagen fibers are barely detectable (Fig. 6f). This may reflect a defect in ECM deposition rather than fiber alignment, which contrasts with Fig. S1, where fibroblasts are shown to deposit and assemble laminin and type IV collagen fibers. One possible explanation is that primary fibroblasts were not cultured long enough to allow robust ECM deposition. Alternatively, the observed effect may be specific to fibronectin, which is consistent with fibroblasts being its major source. The authors should revise their interpretation or provide additional evidence to support their current claim.

      (Fig 6i) The authors propose that the presence of ECM alone within the gap enhances epithelial gap closure compared to empty gap conditions, although gap closure remains less effective than in the presence of primary fibroblasts. From the figure legend and methods, it seems that the decellularized ECM condition is generated using NIH-3T3 fibroblasts cultured for 8 days, whereas the other conditions used primary fibroblasts cultured for 1 day (Fig. 6a-h). This comparison is confounded by differences in cell source and ECM deposition time. If I am misunderstanding this, please clarify, otherwise consider repeating the decellularized ECM condition using primary fibroblasts and matching culture times for a fair comparison. Along these lines, please include images showing that ECM fibers remain intact following decellularization.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Yes. The additional experiments outlined above would help support the current conclusions of the manuscript, rather than to explore new directions beyond its scope.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes, the additional experiments primarily involve the inclusion of controls and additional immunofluorescence imaging to their existing experimental setups. They should be relatively straightforward to implement (~2-3 months).

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes.

      • Are the experiments adequately replicated and statistical analysis adequate?

      Overall, yes. But some plot legends should specify the number of replicates analyzed (e.g. Fig. 2b, Fig. 2d, Fig. 3h).

      Minor comments:

      • Specific experimental issues that are easily addressable.

      (Fig. 1c-e) The authors state that intestinal epithelial monolayers exhibit the most effective gap closure when in direct contact with fibroblasts. However, fibroblast-conditioned media and co-cultures show comparable gap closure efficiencies (Fig. 1e). The authors should consider revising this interpretation based on the provided data.

      (Fig. 3b) The authors suggest that crypt-like epithelial cells undergo migration when grown on fibroblasts, but not in conditioned media alone. This is interesting, but it is not clear how they identify crypt-like cells for tracking. The authors should clarify if crypt-like cells are defined based on markers or inferred from their morphology.

      (Fig 3f-h) The authors conclude that fibroblasts promote directed epithelial cell motility based on cell trajectory analysis. Although they state that this analysis is performed on epithelial monolayers, their tdTomato epithelial population appears sparse in some conditions (control and conditioned media; Fig. S6a). Such variability in cell density may bias measurements of migration directionality at the cell-level, unless a mixed population is being used for tracking. The authors should clarify whether this analysis was indeed conducted on confluent monolayers.

      (Fig 6b) Their gap closure experimental setup indicates that fibroblasts are cultured on a Matrigel-coated surface, which should already contain abundant laminin and type IV collagen. Thus, it is unclear why type IV collagen is not detected underneath fibroblasts. The authors should explain why this is the case for clarity.

      • Are prior studies referenced appropriately?

      Yes

      • Are the text and figures clear and accurate?

      Mostly. Figures 6d and 6g seem to be duplicated by mistake.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      There are some missing frames in Movie 2. If they are not available, it's okay to include black frames, so that the sequence remains consistent with the timestamps. The authors may consider using asterisks as significance indicators instead of reporting precise p-values directly on their plots. Having this format would facilitate visual comparison of statistical significance across conditions. Displaying single channels of experiments where co-cultures are used would help to better interpret their data.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This study provides a valuable contribution to understanding how fibroblasts influence intestinal epithelial migration. The main advance lies in the use of a co-culture system combining organoid-derived intestinal epithelial cells that assemble into a crypt-villus organization with primary intestinal fibroblasts in a 2D gap closure system. This approach allows the authors to examine epithelial-fibroblast interactions in a more physiologically relevant context compared to prior work.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      Addressed above.

      • State what audience might be interested in and influenced by the reported findings.

      Cell and developmental biology, extracellular matrix biology, tissue regeneration.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Tissue morphogenesis, cell motility, extracellular matrix dynamics.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Please find enclosed my review comments on the manuscript entitled "Fibroblast alignment coordinates epithelial migration and maintains intestinal tissue integrity" by Jordi Comelles et al. In this manuscript, the authors use a bioengineered epithelial-stromal system composed of organoid-derived intestinal epithelial cells, primary intestinal fibroblasts, and a basement membrane matrix to show that direct physical interactions between fibroblasts and epithelial cells drive a large-scale organization of the fibroblast network. This spatial reorganization, in turn, promotes persistent and oriented migration of epithelial cells, ultimately enabling restoration of the intestinal epithelium in an in vitro gap-closure assay. Overall, while the authors use an elegant in vitro model to study intestinal wound closure, and more specifically the role of fibroblasts in this context, I find this manuscript not suitable for publication in its present form. The data are overinterpreted, the novelty is limited, and the molecular mechanisms underlying WAE-fibroblast interactions are insufficiently addressed.

      Figure 1 - What are the units of the "fraction gap closure" shown in panels d and e? Is it expressed as a percentage? "Actually, epithelial monolayers achieved the most effective gap closure when cultured in direct physical contact with fibroblasts (Figure 1e and Movies 2 and 3)." From the data shown in panels c, d, and e, it appears that fibroblast-conditioned medium alone promotes efficient gap closure, comparable to the + fibroblast condition.

      Figure 2 - The use of a cell proliferation inhibitor during the gap-closure assay would help determine the contribution of cell proliferation at the migration front. Figure 2f and 2g - Has a dose-dependent effect of PGE2 been tested? Figure 2i - The + fibroblast + EP4i condition (pink) is missing. "This suggests a mechanical or contact-mediated role for fibroblasts in preserving epithelial integrity and promoting coordinated migration beyond their paracrine signaling." While PGE2-EP4 signaling does not appear to be involved in the fibroblast-mediated enhancement of gap-closure efficiency, the conclusion that physical interactions are more important than paracrine effects is overstated. For instance, an experimental condition in which fibroblast-conditioned medium is inactivated (boiling for 5 minutes) would strengthen this conclusion. In addition, inhibition of actomyosin contractility in fibroblasts would be informative.

      Figure 3 - The data presented here do not convincingly support the dismissal of conditioned medium as a contributing factor. The differences between the + fibroblast-conditioned medium and + fibroblast conditions are modest. In both cases, epithelial cells migrate and gaps close.

      Figure 4a - "Upon removal of the barrier (t = 0 h), fibroblasts at the epithelial front were small and evenly distributed, with no prominent α-SMA fibers present." Here, fibroblasts are α-SMA positive but not elongated. α-SMA may therefore not be the most appropriate marker. What are the levels of phosphorylated MLC2? These may increase during wound closure. Also, culturing fibroblasts promotes α-SMA expression, so the fibroblasts used in this assay may not represent the healthy fibroblasts found in vivo.

      Figure 5 - Fibroblast alignment could also result from paracrine signals secreted by epithelial cells. This possibility should be tested.

      In summary, this manuscript demonstrates that epithelial cells migrate more efficiently on extracellular matrix proteins deposited and oriented by fibroblasts. This concept is not novel. Identifying the molecular mechanisms governing interactions between WAE and subepithelial fibroblasts would significantly enhance the novelty and impact of this study.

      Significance

      In this manuscript, the authors use a bioengineered epithelial-stromal system composed of organoid-derived intestinal epithelial cells, primary intestinal fibroblasts, and a basement membrane matrix to show that direct physical interactions between fibroblasts and epithelial cells drive a large-scale organization of the fibroblast network. This spatial reorganization, in turn, promotes persistent and oriented migration of epithelial cells, ultimately enabling restoration of the intestinal epithelium in an in vitro gap-closure assay. Overall, while the authors use an elegant in vitro model to study intestinal wound closure, and more specifically the role of fibroblasts in this context, I find this manuscript not suitable for publication in its present form. The data are overinterpreted, the novelty is limited, and the molecular mechanisms underlying WAE-fibroblast interactions are insufficiently addressed.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Background and unknown in the field:

      This study investigates how fibroblast alignment influences the migration of intestinal epithelial cells, contributing to tissue integrity and repair. It is well established that intestinal fibroblasts are important regulators in the tissue through their ability to secrete paracrine factors essential for the epithelium. However, it is less well understood whether they also play an additional structural role in instructing tissue architecture, and how communication between the fibroblasts and the epithelium is regulated.

      Advance over state of the art:

      Here the authors have set up an elegant three-component system to investigate this. They have gone beyond the recent advances of culturing intestinal and colonic organoids in 2D (in a manner that preserves crypt- and villus-like organization) and bioengineered an epithelial-stromal model comprising organoid-derived intestinal epithelial cells (IECs), primary intestinal fibroblasts, and a basement membrane matrix. Using this model, they have uncovered that fibroblasts enhance the directed and persistent migration of IECs. They used scRNAseq to carefully analyse the stromal cell populations present in their co-cultures of primary mouse intestinal subepithelial fibroblasts and organoid-derived mouse intestinal epithelial cells. They observed that these co-cultures reflected well both the stromal cell-type composition and the paracrine activity previously reported for these cells in tissue.

      Using a clever system with Matrigel and an elastomeric barrier, the authors were able to induce non-epithelial gaps in different scenarios (IECs alone, with fibroblasts, or with conditioned media) and observe wound closure as well as the presence of specific cell types. They observed that the epithelial monolayers showed significant gap closure when in direct contact with fibroblasts compared to controls. Interestingly, the enhanced efficiency of epithelial migration and gap closure in the presence of fibroblasts was independent of PGE₂-EP4 signaling and was not due to differences in cell proliferation. Instead, the imaging revealed that the fibroblasts were in direct contact with the epithelium. The authors observed that, in the absence of fibroblasts, the migration properties of cells in the villus and crypt regions were dramatically different, and that the presence of fibroblasts was necessary to efficiently synchronize these to support gap closure. In addition, the presence of fibroblasts enhanced the directionality of epithelial cell migration. Detailed imaging and image analyses revealed that gap closure involved activation of the fibroblasts and coordinated co-alignment of IECs and fibroblasts. They also explored matrix deposition by the fibroblasts during the process and found that they deposited aligned ECM fibers that guide epithelial migration. Cell-derived matrix alone (devoid of live fibroblasts) was able to partially recapitulate the fibroblast-coordinated epithelial migration, indicating that the fibroblast-generated matrix and its alignment are key contributors to the phenotype.

      Comments:

      This is overall a very interesting and well-written study. The imaging and the image analysis are state-of-the-art, and the bioengineered model is an exciting advancement over current methods developed by these researchers and others. This study meets all the criteria for publication in the sense that all the experiments seem to be carefully conducted, with appropriate controls and sufficient quantifications and statistics. The claims made by the authors are supported by the data. This is currently suitable to be published as a method/protocol and as a descriptive study uncovering interesting cross-talk and co-dependencies of epithelial and stromal cells during injury repair. There are of course aspects that could improve the study further, such as more mechanistic insight into the underpinnings of the direct epithelium-fibroblast interaction and its involvement in directed IEC migration. However, these may be topics to investigate in a future study.

      Significance

      The strengths of the study are the highly in vivo relevant model system that is amenable to imaging and detailed image analysis of distinct cell populations. This may be adopted by others in the field and has the potential to transform the way cell dynamics in the intestinal epithelium are visualized and investigated in vitro.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2025-03319

      Corresponding author(s): Pedro Milanez-Almeida

      1. General Statements

      We thank the reviewers for their careful and constructive evaluation. We agree that the original manuscript did not communicate the conceptual scope, novelty, or limitations of the framework as clearly as it should have. In response, we substantially revised both the presentation and the supporting analyses.

      The central revision reframes tinydenseR around a common landmark-by-sample fuzzy density matrix derived from UMAP-based cell–landmark connection strengths. This representation supports four downstream analysis modes: (1) landmark-level differential density modeling, (2) supervised quantitative sample embedding via partial-effect principal component projection (pePC), (3) density-contrast-aligned feature exploration via graph-diffused partial least squares decomposition (plsD), and (4) connection-strength-weighted pseudobulk differential expression. A dedicated algorithm overview ("The tinydenseR Algorithm") and revised Figure 1 make this structure explicit, and the revised manuscript clarifies which components are novel and which remain conventional once subsets are defined.

      To strengthen empirical support, we expanded synthetic and permutation benchmarks, added a landmark- versus cell-level integration comparison, updated miloR benchmarking to use graph refinement, and included analysis of the publicly available COMBAT COVID-19 PBMC dataset. The Methods now correct the terminology for UMAP-derived weights (connection strengths, not probabilities), make the landmark allocation rule explicit, and clarify assumptions underlying the density matrix construction.

      We do not claim that the revision resolves every benchmarking or sensitivity question raised. The goal was to make the claims more precise, the framework more transparent, and the empirical evidence more directly aligned with the revised conceptual framing.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Overall, this manuscript presents a landmark-based strategy for scaling sample-level differential abundance and differential expression analyses to atlas-sized single-cell datasets by summarizing cells through fuzzy cell-landmark memberships and then applying standard sample-level modeling. The approach is promising for computational efficiency, but several key methodological claims and design choices would benefit from clearer justification and stronger empirical validation.

      We appreciate the reviewer's careful summary and recognition of the promise of landmark-based strategies for atlas-scale analysis. The specific concerns are addressed point by point below; the revision addresses them through expanded benchmarking, clearer methodological justification, and more precise scoping of the framework's contributions and limitations.

      The manuscript's main methodological novelty appears to be computational/scalable representation learning via landmarking and fuzzy cell-landmark densities. However, for differential expression within cell states, the inferential procedure is standard pseudo-bulk + limma/voom and still relies on discrete subset definitions (clusters/cell types, or landmark bins) and downstream annotation for biological interpretation. The authors should clarify that "clustering-independent" primarily applies to the DA representation/testing, and strengthen evidence that landmark-level subsets provide materially different biological resolution than simply clustering a subsample.

      Response:

      This comment raises a central question about the scope of "clustering independence" in the framework. We want to be direct: the revised manuscript does not include a dedicated head-to-head benchmark comparing tinydenseR landmark-level subsets with a "cluster a subsample" workflow. We recognize this as a limitation of the current revision.

      What the revision does instead is twofold. First, it sharpens the conceptual distinction. A subsampled clustering still imposes a hard partition and remains sensitive to the choice of clustering algorithm, graph construction, and resolution parameter. The landmark-density framework avoids that dependence: the same continuous representation supports differential density modeling, pePC, plsD, and pseudobulk DE without requiring a rigid partition at any stage. We added a sentence in the Introduction noting that structure-aware approaches can better preserve data geometry, rare populations, and local neighborhood structure than naive random subsampling.

      Second, the revision strengthens empirical support within the landmark-density framework itself. The synthetic benchmarks demonstrate recovery of simulated trajectory-associated density shifts, abundance perturbations, and activation-associated expression changes directly in the landmark-density space. The real-data applications now present findings through landmark-level density contrasts, pePC, and plsD, with results interpreted as consistent with prior biological knowledge and published reports.

      We also clarify in the revised manuscript that the pseudobulk aggregation itself now derives subsets from the fuzzy landmark topology rather than from a rigid partition, even though the formal gene-level testing remains conventional once a subset has been specified. Thus, clustering independence applies not only to density inference but also to effect-specific sample embedding and exploratory feature interpretation.

      We therefore frame the contribution as a conceptual and empirical argument for the clustering-free workflow, while acknowledging that a direct empirical comparison with a cluster-a-subsample alternative has not been performed in this revision.
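The idea of deriving pseudobulk profiles from fuzzy landmark affinities rather than from a hard partition can be illustrated with a minimal Python sketch. It is not the tinydenseR implementation: the function name `weighted_pseudobulk`, the input layout, and the choice to weight each cell by its summed connection strength to a set of landmarks of interest are all assumptions made for illustration.

```python
def weighted_pseudobulk(counts, cell_sample, cell_weight):
    """Aggregate per-cell gene counts into per-sample profiles, weighting each cell.

    counts:      dict cell_id -> list of gene counts
    cell_sample: dict cell_id -> sample name
    cell_weight: dict cell_id -> summed connection strength to the landmarks
                 of interest (hypothetical input; 0 means no affinity)
    """
    profiles = {}
    for cell, genes in counts.items():
        w = cell_weight.get(cell, 0.0)
        if w == 0.0:
            continue  # cell has no affinity to the selected landmarks
        row = profiles.setdefault(cell_sample[cell], [0.0] * len(genes))
        for g, value in enumerate(genes):
            row[g] += w * value  # soft contribution instead of hard membership
    return profiles


# Toy example: two genes, three cells, two samples.
counts = {"c1": [4, 0], "c2": [2, 2], "c3": [0, 6]}
cell_sample = {"c1": "A", "c2": "A", "c3": "B"}
cell_weight = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
profiles = weighted_pseudobulk(counts, cell_sample, cell_weight)
```

In this sketch a cell with partial affinity contributes a fraction of its counts, so no cluster boundary has to be drawn before aggregation; the gene-level testing that follows would remain conventional.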

      The method treats UMAP fuzzy graph weights as "connection probabilities" and uses their sums to estimate sample-level abundance around each landmark. Please clarify (i) how sensitive the density matrix is to the UMAP membership construction (e.g., choice of parameters, the per-cell membership mass constraint), and (ii) why this is an appropriate abundance surrogate in the context of sample-level inference.

      Response:

      This is an important comment about both the interpretation and parameter-dependence of the UMAP-derived fuzzy graph. We revised the manuscript on both fronts.

      First, we no longer refer to the UMAP fuzzy graph weights as "connection probabilities." The revised Methods (Step 4) now define the directed neighborhood memberships, the local connectivity and scaling terms, and the symmetrized UMAP fuzzy-union weight explicitly, describing the resulting quantities as fuzzy graph connection strengths / affinities rather than calibrated probabilities.

      Second, we now state explicitly that the cell–landmark weights depend on the choice of nearest-landmark count, the per-cell membership-mass constraint, and the local scaling parameters. These choices are held fixed within each analysis, and the resulting landmark-by-sample matrix should be interpreted as a relative measure of enrichment/depletion around landmark neighborhoods rather than as a physical cell count. We did not add a dedicated sensitivity benchmark across UMAP hyperparameters in this revision; given the scope of the other benchmarking additions, we judged this to be better addressed as a focused follow-up study and have noted the dependence on these construction choices explicitly in the Methods. We also note that the consistent behavior of the framework across the diverse datasets analyzed in this work—spanning synthetic, cytometry, and multiple scRNA-seq settings—provides empirical evidence of robustness to parameter choice, as stated in the revised Methods.

      Third, we clarify why summing connection strengths is an appropriate abundance surrogate for sample-level inference. In the revised Methods, the unnormalized quantity around each landmark is defined as the sum of cell–landmark fuzzy weights within a sample, followed by sample-size normalization and log-transformation. The text now explains this both operationally (a continuous measure of how much cell mass from a given sample concentrates near a landmark neighborhood) and conceptually (a soft-assignment abundance surrogate internally consistent with the manifold-based representation used throughout the framework).

      These revisions address the concern by making two points explicit: (i) the UMAP-derived quantities are fuzzy graph connection strengths rather than calibrated probabilities, and (ii) their sums serve as a continuous, size-normalized abundance surrogate aligned with the landmark-based representation on which downstream inference is performed.
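The abundance surrogate described in this response (sum of cell-landmark fuzzy weights within a sample, then sample-size normalization and log-transformation) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `landmark_density`, the input layout, and the pseudocount added before the logarithm are assumptions, not details taken from the manuscript.

```python
import math
from collections import defaultdict


def landmark_density(weights, cell_sample, n_landmarks, pseudocount=1e-6):
    """Return {sample: [log density per landmark]} from fuzzy cell-landmark weights.

    weights:     dict (cell_id, landmark_index) -> connection strength
    cell_sample: dict cell_id -> sample name
    pseudocount: assumed small constant that keeps log() finite for empty landmarks
    """
    # Cells per sample, used for sample-size normalization.
    sample_sizes = defaultdict(int)
    for sample in cell_sample.values():
        sample_sizes[sample] += 1

    # Unnormalized mass: sum of cell-landmark weights within each sample.
    mass = {s: [0.0] * n_landmarks for s in sample_sizes}
    for (cell, landmark), w in weights.items():
        mass[cell_sample[cell]][landmark] += w

    # Normalize by sample size, then log-transform.
    return {
        s: [math.log(m / sample_sizes[s] + pseudocount) for m in mass[s]]
        for s in mass
    }


# Toy example: three cells, two landmarks, two samples.
weights = {("c1", 0): 0.9, ("c1", 1): 0.1, ("c2", 0): 0.5, ("c3", 1): 1.0}
cell_sample = {"c1": "A", "c2": "A", "c3": "B"}
density = landmark_density(weights, cell_sample, n_landmarks=2)
```

The resulting values are relative measures of enrichment or depletion around each landmark, not cell counts, which matches the interpretation given in the response.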

      The landmark selection step caps landmarks at 10% of cells per sample (and an overall cap of 5,000 landmarks by default), but the rationale for these thresholds is not clearly justified. Additionally, interaction between the per-sample cap and the global 5,000 landmark cap: when the global cap is binding, how are landmarks redistributed across samples, and does this induce unequal representation across samples with different cell yields?

      Response:

      We agree that the rationale for the default landmark caps and the interaction between the per-sample and global limits should be stated more explicitly. We revised Methods Step 2 (Landmark Selection) accordingly.

      The revised manuscript states that the default caps balance approximate proportional representation across samples with bounded computation. The target number of landmarks is defined from the per-sample sampling proportion (p = 0.1 by default) together with the overall cap of 5,000 landmarks, so that landmark sampling remains approximately proportional to sample cell yield unless the global budget becomes limiting.

      We also clarify how the per-sample and global caps interact. When the global cap is binding, each sample is subject to the same per-sample upper bound given by an equal-share maximum of the global target, while smaller samples remain limited by their own proportional cap. The rule is expressed in terms of the number of cells in each sample, the sampling proportion, the global landmark target, and the number of samples. This means that no single high-yield sample can dominate the landmark set, while smaller samples contribute only up to their own proportional allocation.

      The revised manuscript also notes the consequence for unequal sample yields: because small samples remain limited by , the total number of selected landmarks can be less than the nominal global budget if many samples are small. We therefore frame the rule as a transparent compromise that limits over-representation of high-yield samples while preserving simple approximately proportional design. For additional transparency, is recorded in the metadata for inspection, users can adjust p to match replicate structure and heterogeneity, and the software warns when sample cell yields differ by more than 10-fold.
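One way to read the allocation rule described in this response is sketched below in Python. The exact formula is not reproduced here, so the sketch rests on stated assumptions: the proportional cap is taken as the floor of p times the sample's cell count, the cap is treated as binding when the proportional total exceeds the global budget, and the equal-share bound is the global budget divided by the number of samples, rounded up. The function names are hypothetical.

```python
def allocate_landmarks(cells_per_sample, p=0.1, global_cap=5000):
    """Return {sample: number of landmarks} under an assumed equal-share rule."""
    # Proportional cap: at most a fraction p of each sample's cells.
    proportional = {s: int(p * n) for s, n in cells_per_sample.items()}

    # Global budget not limiting: sampling stays proportional to cell yield.
    if sum(proportional.values()) <= global_cap:
        return proportional

    # Global budget binding: every sample gets the same upper bound
    # (integer ceiling of global_cap / number of samples, assumed form).
    equal_share = -(-global_cap // len(cells_per_sample))
    return {s: min(k, equal_share) for s, k in proportional.items()}


def yields_differ_over(cells_per_sample, fold=10):
    """True when the largest and smallest sample yields differ by more than `fold`."""
    counts = list(cells_per_sample.values())
    return max(counts) > fold * min(counts)


# Toy example: one very large, one large, and one small sample.
cells = {"A": 100000, "B": 20000, "C": 500}
allocation = allocate_landmarks(cells)
total = sum(allocation.values())
needs_warning = yields_differ_over(cells)
```

Under these assumptions the two large samples are each held to the equal share while the small sample stays at its proportional cap, so the total falls below the nominal budget, which is the consequence for unequal yields noted in the response.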

      Reviewer #1 (Significance (Required)):

      This manuscript introduces a practical landmark-based framework that makes sample-level differential abundance and expression analyses feasible at atlas scale by summarizing large cell collections through a reduced set of representative cells and fuzzy cell-landmark memberships, then leveraging well-established sample-level linear modeling. The main strengths are its emphasis on scalability, a clear engineering workflow that can be applied across large studies, and a unified representation that can support multiple downstream analyses. The main limitations are that key inferential components remain conventional once subsets are defined, biological interpretation still relies on clustering/label transfer and downstream annotation, and several core design choices (e.g., treating UMAP fuzzy weights as "probabilities" and the default landmark caps) would benefit from stronger justification and sensitivity analyses to support the paper's broader claims about cluster-independence and biological resolution.

      Response:

      We appreciate this balanced assessment. The revised manuscript addresses these concerns through the changes detailed in our preceding responses: the framework is now presented around the common landmark-density representation, with explicit scoping of which components are novel (the density matrix, pePC) and which remain conventional (pseudobulk DE after subset selection). The UMAP-derived weights are described as connection strengths rather than probabilities, the landmark caps are justified more explicitly, and the empirical support has been expanded through synthetic, permutation, integration, and real-data analyses.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors describe a new computational framework for the analysis of single-cell RNA-seq data in multiple domains that uses probabilistic cluster assignment for downstream differential expression and abundance analysis. They show that their method outperforms similar methods in certain tasks, and provide computational performance studies to show that this is tractable on large scale RNA-seq datasets.

      I congratulate the authors on a well written and interesting manuscript, and the development of what seems like quite a comprehensive analysis package. I particularly appreciate the effort to demonstrate computational performance on large datasets (and of course the great name!). Nevertheless, I have a few minor comments that I think would enhance the presentation and impact of the paper.

      We are grateful for Reviewer 2's generous assessment and constructive comments, which helped us improve the presentation, terminology, and readability of the manuscript. The responses below address each point in detail.

      From the introduction, it's a little hard to determine what exactly tinydenseR is doing, what the underlying philosophy is, and what the key innovations are. This is especially apparent as the paper then launches straight into the results. The methods section is very comprehensive, however without a 'guide' through this that bridges the gap between the very high level description and the implementation details, a reader could end up getting stuck. So my suggestion would be to add a couple of paragraphs to the introduction that provide this, such that the reader would only need to consult the methods for the real implementation details.

      Response:

      We agree that the original manuscript moved too quickly from high-level motivation into application results without providing enough conceptual guidance.

      We addressed this in two complementary ways. First, we sharpened the framing in the final paragraph of the Introduction, which now describes tinydenseR as a "fuzzy set-based, technology-agnostic framework for sample-level modeling and quantitative embedding" built around a shared landmark-based representation. Second, we added a dedicated Results subsection, "The tinydenseR Algorithm," that walks the reader through the core elements of the framework—the fuzzy density matrix, differential density analysis, pePC, plsD and pseudobulk DE—before the application sections. This overview was not present in the original version and now functions as the conceptual bridge between the high-level framing and the full Methods that the reviewer requested. We also revised Figure 1 so that the workflow schematic makes these components visually explicit, with the density matrix shown as the common representation from which all four downstream analysis modes are derived.

      We thus addressed the reviewer's underlying point somewhat differently from the exact suggestion of adding multiple new Introduction paragraphs: the revised manuscript combines a more explicit Introduction paragraph with a dedicated algorithm overview at the start of the Results and an expanded workflow schematic.

      It seems that the probabilities that you get from the fuzzy cluster assignment aren't necessarily well calibrated. After the Benjamini-Hochberg correction this might end up not being an issue, so it would be good to see that the FDR that you mention in line 431 is indeed roughly how many discoveries are made under the null hypothesis for some indicative configurations, and whether this deviates significantly from the expectation.

      Response:

      We agree that the UMAP-derived fuzzy weights should not be interpreted as calibrated probabilities for statistical inference. The revised Methods (Step 4) now describe these quantities as affinities / connection strengths used to construct a continuous abundance surrogate, not calibrated probabilities for statistical modeling.

      To address the concern empirically, we added permutation-based null-distribution analyses across the synthetic benchmark settings (Fig. S4). These analyses report the number of discoveries at q We note that the swfdr-based q-values are plug-in estimates of per-feature FDR, and do not provide a formal guarantee that the full rejection set at q ≤ α is globally FDR-controlled in the BH sense (this is now stated in Methods Step 5). The permutation results are therefore presented as evidence that, in the benchmark settings considered, the method does not produce substantial spurious discovery under arbitrary relabeling, rather than as a proof of exact calibration under all null configurations.
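The kind of null check discussed here can be outlined with a short Python sketch: shuffle the sample labels, recompute p-values with whatever test is in use, adjust them, and count discoveries. The sketch uses the Benjamini-Hochberg adjustment named by the reviewer, not the swfdr-based q-values used in the manuscript, and the function names, example p-values, and thresholds are illustrative assumptions.

```python
import random


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_discoveries(p_values, alpha):
    """Number of features whose adjusted p-value is at or below alpha."""
    return sum(q <= alpha for q in benjamini_hochberg(p_values))


def permute_labels(labels, seed):
    """Return a shuffled copy of the sample labels (arbitrary relabeling)."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


# Toy p-values standing in for one analysis run.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
strict = count_discoveries(p_values, alpha=0.05)
lenient = count_discoveries(p_values, alpha=0.1)
null_labels = permute_labels(["case", "case", "ctrl", "ctrl"], seed=0)
```

Repeating the count over many permuted label sets gives an empirical picture of how many discoveries arise when no true association exists, which can then be compared with the counts from the real labels.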

      The rest of my comments are minor and only related to the presentation:

      l71 (and elsewhere): A bit more detail about these datasets would be nice (what the trajectories are, etc, rather than just the configuration).

      The revised manuscript adds substantially more detail for the synthetic trajectory, DA, and DE datasets in both the Results and the dedicated Synthetic Data Methods subsections.

      l73: The language is a little different between this description and the figures, so it's a little hard to keep track of the correspondence.

      The revised manuscript aligns the synthetic sections more explicitly with the figures: trajectory is now Fig. 1, simulated DA is Fig. S1, and simulated DE is Fig. S2, with corresponding figure legends rewritten in the same language.

      Fig S2 (and others): I know that these are supplemental figures, but they're quite dense and hard to read (both conceptually and in terms of the font size). For something supporting a direct claim in the paper, it would be nice to have one or two plots in the main body of the paper that summarise all of the evidence in these denser supplementary plots.

      The revised manuscript adds a stronger main-text overview in Figure 1 and the new "The tinydenseR Algorithm" subsection. The DA/DE benchmark panels remain in the Supplementary Figures, but the revised main-text overview and workflow schematic now guide the reader more explicitly through the supporting evidence.

      l106: This is a little confusing to read (not clear whether these are single and double knock outs or just wild type and double knock out).

      The revised text now states explicitly: WT for HRAS and NRAS versus H/NRAS double knockout (KO).

      l162: In general it would be nice to link this back to some of the choices made in tinydenseR compared to other techniques.

      The revised Discussion now includes explicit comparison to MetaCell / SEACells, to unsupervised sample-representation methods such as PILOT, MrVI, and scPoli, and to the cell type and feature interpretation tools Augur and TRADE.

      l175: 'Fuzzy' density matrix is not really defined anywhere

      The new Results overview explicitly defines the landmark-by-sample density matrix and explains that entries reflect cell-landmark affinity rather than hard membership.

      l427: BH -> Benjamini-Hochberg

      Corrected; thank you.

      Eq 5: Missing a bracket I think

      Corrected; thank you.

      Fig 2b: A log x axis might be better here.

      Figure 2 has been substantially revised, and percentages of cells per cluster are no longer presented, so the original presentation concern is no longer directly applicable.

      Fig 2c,d,e: Some of the y axis ticks here in particular look a little broken

      Corrected; thank you.

      Fig S5: DA -> differential abundance, DE -> Differential expression. A guide to what exactly to look at in this figure would also be nice.

      We revised Figure S4 (previously Fig. S5) and its legend to make the figure easier to interpret. DA is now defined as differential abundance and DE as differential expression, and we clarify that the key features to inspect are the number of discoveries at q

      Figure S7: A brief description of the cell types would be nice somewhere.

      Figure S10 (previously Figure S7) has been substantially revised. Cell types are no longer included in the comparison, because the revised manuscript is structured to be independent of clustering and cell type labels, so the original presentation concern is no longer directly applicable.

      Reviewer #2 (Significance (Required)):

      Recently there has been a movement to try and go beyond simple cell type annotation, and here the authors present a consistent and scalable framework for this in R. This work develops new strategies to incorporate existing techniques into scRNA-seq analysis pipelines, whilst maintaining computational scalability.

      This work will be particularly interesting to those performing analyses dependent on cell type identification in the R ecosystem, and is demonstrated to be capable of handling the large datasets of modern scRNA-Seq studies.

      Although there is not much that is conceptually novel, the implementation appears to be robust, and the scope of applicability is very broad. As such this work will be of broad interest, from those doing basic science on cell lines, to those analysing data from in vivo or human clinical trials.

      My background is in bioinformatics, physics, and statistics.

      Response:

      We appreciate this assessment and agree that the original version did not make the conceptual contributions sufficiently visible. The revised manuscript now distinguishes more clearly between the novel components—the common landmark-by-sample density representation and the contrast-specific supervised embedding (pePC)—and the conventional components (pseudobulk DE after subset selection). We hope the revision better conveys both the practical scalability and the specific methodological contributions, while being transparent about which downstream steps remain standard.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors present an R package for landmark-based analysis of multi-sample single-cell genomics datasets, demonstrating its utility through applications to both simulated and clinical data.

      Major:

      The current benchmarking does not sufficiently establish the advantages of the proposed method over existing alternatives. Most notably, comparisons with metacell-based approaches are absent, despite these representing the most conceptually similar class of methods. The presented differential abundance benchmark lacks a scenario with known ground truth, making it impossible to assess false discovery rate control. Additionally, the comparison with miloR should employ graph-based distance approximations, which substantially improve runtime and memory usage (https://www.biorxiv.org/content/10.1101/2023.11.08.566176v1.full). We recognize that exhaustive benchmarking across all possible scenarios is neither feasible nor necessary. However, the current comparisons should be expanded to clarify the specific strengths and limitations of this approach relative to existing methods. This would help readers assess when the proposed tool offers a meaningful advantage over established alternatives.

      Response:

      We appreciate this careful benchmarking critique. We want to address the metacell comparison directly: we did not add a head-to-head benchmark against metacell-based approaches in this revision. We recognize this as a limitation, and we explain below why we focused the empirical revision on other comparisons.

      Metacell methods (MetaCell, SEACells) aggregate similar cells into group-level representations that serve as compressed versions of the single-cell data. tinydenseR addresses a related but distinct modeling objective: it constructs a per-sample density distribution over a shared set of landmarks rather than a single compressed cell-by-gene representation. The downstream analysis modes (differential density, pePC, plsD) operate on this sample-level density matrix, not on aggregated expression profiles. A meaningful comparison would therefore require defining a shared evaluation task, and the most natural such tasks for tinydenseR—sample-level density modeling, contrast-specific embedding—are not standard outputs of metacell frameworks. We have added this conceptual positioning to the Discussion so that readers can assess the relationship between the approaches.

      For comparisons more directly aligned with the inferential claims of the framework, the revised manuscript includes synthetic DA and DE scenarios with known ground truth, permutation-based null analyses (Fig. S4), and expanded computational and methodological benchmarking against diffcyt, miloR, Seurat, and cell-level Harmony integration. The miloR benchmarking workflow was updated to use refinement_scheme = "graph", as stated in the Methods and the Figure 3 legend. Even with graph refinement, miloR remained substantially slower than tinydenseR in our benchmarking setting.

      We agree that broader benchmarking across all neighboring method classes remains valuable future work, and we have stated this in the revised manuscript.
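The null-calibration check discussed in this response can be illustrated with a minimal sketch. It is not the authors' permutation analysis: it assumes independent, uniformly distributed null p-values instead of permuted sample labels, and the function names `benjamini_hochberg` and `null_false_positive_rate` are hypothetical. Under this simplified global null, the fraction of runs with at least one Benjamini-Hochberg discovery should stay close to the nominal level.

```python
import random


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the sorted indices rejected by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    # Step-up rule: reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])


def null_false_positive_rate(n_tests=200, n_runs=500, alpha=0.05, seed=0):
    """Fraction of simulated null runs with at least one BH discovery.

    Assumption: null p-values are independent and uniform on [0, 1).
    """
    rng = random.Random(seed)
    runs_with_discovery = 0
    for _ in range(n_runs):
        pvals = [rng.random() for _ in range(n_tests)]
        if benjamini_hochberg(pvals, alpha):
            runs_with_discovery += 1
    return runs_with_discovery / n_runs


if __name__ == "__main__":
    print(null_false_positive_rate())
```

A rate well above `alpha` in such a check would point to miscalibrated p-values, which is the concern a permutation-based null analysis is meant to address.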

      It is unclear whether performing integration at the landmark level yields representations comparable to those obtained through cell-level integration. A systematic comparison between landmark-level and cell-level integration would help establish that the compression step does not introduce meaningful loss of biological signal.

      Response:

      In the revised manuscript, we added a dedicated landmark-level versus cell-level integration benchmark using the Luecken et al. Immune_ALL_human dataset (33,506 cells; 10 batches; 16 annotated cell types). We directly compared tinydenseR landmark-level integration (Harmony on landmarks followed by Symphony projection) against cell-level Harmony on all cells, evaluating integration quality using six metrics: three biological-conservation metrics (MCC for cell-type label transfer, cell-type ASW, graph connectivity) and three batch-mixing metrics (ASW by batch, kBET, batch entropy) (Fig. S5).

      Landmark-level integration matched or slightly exceeded cell-level Harmony on four of six metrics, including all three biological-conservation metrics. Cell-level Harmony performed better on the two neighborhood-based batch-mixing metrics, indicating that compression to landmarks did not measurably impair preservation of major biological structure in this dataset, although some reduction in local batch mixing remained.
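As an illustration of one of the biological-conservation metrics named above, the following is a minimal sketch of a multiclass Matthews correlation coefficient computed from true and transferred cell-type labels. It uses the standard multiclass generalization of MCC; whether the benchmark used exactly this variant is an assumption, and the function name `multiclass_mcc` and the example labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(true_labels, predicted_labels):
    """Multiclass Matthews correlation coefficient from two label lists."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    n = len(true_labels)
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    classes = set(true_counts) | set(pred_counts)
    # Covariance-style terms of the multiclass MCC definition.
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in classes)
    cov_tt = n * n - sum(true_counts[k] ** 2 for k in classes)
    cov_pp = n * n - sum(pred_counts[k] ** 2 for k in classes)
    denom = sqrt(cov_tt * cov_pp)
    # Convention: return 0.0 when either labeling contains a single class.
    return cov_tp / denom if denom else 0.0


if __name__ == "__main__":
    truth = ["T cell", "T cell", "B cell", "B cell", "NK", "NK"]
    transferred = ["T cell", "T cell", "B cell", "NK", "NK", "NK"]
    print(round(multiclass_mcc(truth, transferred), 3))
```

Unlike plain accuracy, this score stays near zero when a label-transfer step collapses most cells onto one dominant cell type, which makes it a reasonable choice for datasets with imbalanced annotations.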

      The landmark-based representation provides a per-sample summarized view that could lend itself to broader applications beyond cell state identification. For example, computing sample-level embeddings from the landmark matrix could enable systematic exploration of similarities and differences across large cohorts. Demonstrating such applications would strengthen the case for this framework's utility and help distinguish it from existing alternatives.

      Response:

      The revised manuscript demonstrates this broader utility across multiple datasets. The landmark-by-sample density matrix now supports both unsupervised sample embedding (PCA and diffusion-map trajectory) and supervised quantitative embedding via pePC, as described in the new "The tinydenseR Algorithm" subsection and Methods Step 6.

      Sample-level embeddings are shown in the synthetic DA/DE benchmarks (Figs. S1–S2), the xenograft treatment comparison (Fig. S6c), the COMBAT COVID-19 analysis (Fig. S7d), the longitudinal NIZ985 clinical-trial dataset (Fig. S10d), and multiple compartments and time points of the PHE885 study (Figs. S12c, S13c, Fig. 2c). Across these examples, the embeddings quantify cohort structure, separate effects of interest from nuisance variation, and summarize the fraction of total variance attributable to modeled contrasts. We agree that this broader utility helps distinguish the framework from alternatives, and we have revised the manuscript accordingly.

      Minor:

      The authors repeatedly describe their method as "agnostic to technology," but it is unclear what specific advantage this confers. In practice, most methods operating on embeddings and k-nearest neighbor graphs are similarly technology-agnostic. This claim would benefit from either clarification of the specific sense in which this property is distinctive, or a demonstration involving joint analysis of data generated across different technologies.

      Response:

      The reviewer is right that most methods operating on embeddings and kNN graphs are, in principle, technology-agnostic. Our use of the term is more specific and practical: the same end-to-end framework—from landmark construction through density modeling, embedding, and feature interpretation—is implemented out of the box in tinydenseR for both scRNA-seq and flow/mass/spectral cytometry, with modality-appropriate preprocessing but a shared downstream workflow. The revised Methods now list the supported input formats (Seurat, SingleCellExperiment, H5AnnData, BPCells, dgCMatrix, DelayedMatrix, cytoset) to make this concrete.

      We do not demonstrate joint analysis across different technologies in this manuscript, and we have therefore added a note in the Discussion that extending toward joint cross-technology analysis is a natural next step.

      The method description in the main text is insufficient to guide the reader through the subsequent applications. Starting with a more detailed overview of the key steps and assumptions underlying the approach would help readers interpret the results presented in later sections.

      Response:

      We agree that the original version did not provide enough methodological guidance before the applications. The revised manuscript adds a dedicated Results subsection, "The tinydenseR Algorithm," that introduces the key steps and assumptions of the framework before the application sections. We also revised Figure 1 and expanded the relevant Methods sections so that the conceptual overview in the main text is clearly connected to the implementation details used in the later applications.

      Reviewer #3 (Significance (Required)):

      While the tool addresses a relevant need in the field, the underlying summarization approach is conceptually similar to established methods for multi-sample single-cell analysis, including those based on metacells (https://pubmed.ncbi.nlm.nih.gov/31604482/, https://pubmed.ncbi.nlm.nih.gov/35440087/, https://pubmed.ncbi.nlm.nih.gov/35963997/, https://pubmed.ncbi.nlm.nih.gov/36973557/) or data downsampling/sketching (https://pubmed.ncbi.nlm.nih.gov/31176620/). Given the limited conceptual novelty relative to these existing frameworks, the significance of this contribution is difficult to assess without more extensive benchmarking against comparable methods.

      Response:

      We appreciate this perspective. As discussed in our response to the benchmarking comment above, the revised manuscript positions tinydenseR explicitly as a sample-level landmark-density modeling framework rather than a compression or sketching strategy. The key distinction is that the downstream analyses operate on a per-sample density distribution over landmarks, not on aggregated expression profiles, which is why the empirical revision focused on benchmarks directly aligned with these inferential and computational claims. The Discussion now includes explicit positioning relative to metacell and sketching approaches, and we agree that broader cross-method benchmarking is valuable future work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors present an R package for landmark-based analysis of multi-sample single-cell genomics datasets, demonstrating its utility through applications to both simulated and clinical data.

      Major:

      The current benchmarking does not sufficiently establish the advantages of the proposed method over existing alternatives. Most notably, comparisons with metacell-based approaches are absent, despite these representing the most conceptually similar class of methods. The presented differential abundance benchmark lacks a scenario with known ground truth, making it impossible to assess false discovery rate control. Additionally, the comparison with miloR should employ graph-based distance approximations, which substantially improve runtime and memory usage (https://www.biorxiv.org/content/10.1101/2023.11.08.566176v1.full). We recognize that exhaustive benchmarking across all possible scenarios is neither feasible nor necessary. However, the current comparisons should be expanded to clarify the specific strengths and limitations of this approach relative to existing methods. This would help readers assess when the proposed tool offers a meaningful advantage over established alternatives.

      It is unclear whether performing integration at the landmark level yields representations comparable to those obtained through cell-level integration. A systematic comparison between landmark-level and cell-level integration would help establish that the compression step does not introduce meaningful loss of biological signal.

      The landmark-based representation provides a per-sample summarized view that could lend itself to broader applications beyond cell state identification. For example, computing sample-level embeddings from the landmark matrix could enable systematic exploration of similarities and differences across large cohorts. Demonstrating such applications would strengthen the case for this framework's utility and help distinguish it from existing alternatives.

      Minor:

      The authors repeatedly describe their method as "agnostic to technology," but it is unclear what specific advantage this confers. In practice, most methods operating on embeddings and k-nearest neighbor graphs are similarly technology-agnostic. This claim would benefit from either clarification of the specific sense in which this property is distinctive, or a demonstration involving joint analysis of data generated across different technologies.

      The method description in the main text is insufficient to guide the reader through the subsequent applications. Starting with a more detailed overview of the key steps and assumptions underlying the approach would help readers interpret the results presented in later sections.

      Significance

      While the tool addresses a relevant need in the field, the underlying summarization approach is conceptually similar to established methods for multi-sample single-cell analysis, including those based on metacells (https://pubmed.ncbi.nlm.nih.gov/31604482/, https://pubmed.ncbi.nlm.nih.gov/35440087/, https://pubmed.ncbi.nlm.nih.gov/35963997/, https://pubmed.ncbi.nlm.nih.gov/36973557/) or data downsampling/sketching (https://pubmed.ncbi.nlm.nih.gov/31176620/). Given the limited conceptual novelty relative to these existing frameworks, the significance of this contribution is difficult to assess without more extensive benchmarking against comparable methods.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors describe a new computational framework for the analysis of single-cell RNA-seq data in multiple domains that uses probabilistic cluster assignment for downstream differential expression and abundance analysis. They show that their method outperforms similar methods in certain tasks, and provide computational performance studies to show that this is tractable on large scale RNA-seq datasets.

      I congratulate the authors on a well written and interesting manuscript, and the development of what seems like quite a comprehensive analysis package. I particularly appreciate the effort to demonstrate computational performance on large datasets (and of course the great name!). Nevertheless, I have a few minor comments that I think would enhance the presentation and impact of the paper.

      From the introduction, it's a little hard to determine what exactly tinydenseR is doing, what the underlying philosophy is, and what the key innovations are. This is especially apparent as the paper then launches straight into the results. The methods section is very comprehensive, however without a 'guide' through this that bridges the gap between the very high level description and the implementation details, a reader could end up getting stuck. So my suggestion would be to add a couple of paragraphs to the introduction that provide this, such that the reader would only need to consult the methods for the real implementation details.

      It seems that the probabilities that you get from the fuzzy cluster assignment aren't necessarily well calibrated. After the Benjamini-Hochberg correction this might end up not being an issue, so it would be good to see that the FDR that you mention in line 431 is indeed roughly how many discoveries are made under the null hypothesis for some indicative configurations, and whether this deviates significantly from the expectation.

      The rest of my comments are minor and only related to the presentation:

      • l71 (and elsewhere): A bit more detail about these datasets would be nice (what the trajectories are, etc, rather than just the configuration).

      • l73: The language is a little different between this description and the figures, so it's a little hard to keep track of the correspondence.

      • Fig S2 (and others): I know that these are supplemental figures, but they're quite dense and hard to read (both conceptually and in terms of the font size). For something supporting a direct claim in the paper, it would be nice to have one or two plots in the main body of the paper that summarise all of the evidence in these denser supplementary plots.

      • l106: This is a little confusing to read (not clear whether these are single and double knock outs or just wild type and double knock out).

      • l162: In general it would be nice to link this back to some of the choices made in tinydenseR compared to other techniques.

      • l175: 'Fuzzy' density matrix is not really defined anywhere

      • l427: BH -> Benjamini-Hochberg

      • Eq 5: Missing a bracket I think

      • Fig 2b: A log x axis might be better here.

      • Fig 2c,d,e: Some of the y axis ticks here in particular look a little broken

      • Fig S5: DA -> differential abundance, DE -> Differential expression. A guide to what exactly to look at in this figure would also be nice.

      • Figure S7: A brief description of the cell types would be nice somewhere.

      Significance

      Recently there has been a movement to try and go beyond simple cell type annotation, and here the authors present a consistent and scalable framework for this in R. This work develops new strategies to incorporate existing techniques into scRNA-seq analysis pipelines, whilst maintaining computational scalability.

      This work will be particularly interesting to those performing analyses dependent on cell type identification in the R ecosystem, and is demonstrated to be capable of handling the large datasets of modern scRNA-Seq studies.

      Although there is not much that is conceptually novel, the implementation appears to be robust, and the scope of applicability is very broad. As such this work will be of broad interest, from those doing basic science on cell lines, to those analysing data from in vivo or human clinical trials.

      My background is in bioinformatics, physics, and statistics.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      1) Overall, this manuscript presents a landmark-based strategy for scaling sample-level differential abundance and differential expression analyses to atlas-sized single-cell datasets by summarizing cells through fuzzy cell-landmark memberships and then applying standard sample-level modeling. The approach is promising for computational efficiency, but several key methodological claims and design choices would benefit from clearer justification and stronger empirical validation.

      2) The manuscript's main methodological novelty appears to be computational/scalable representation learning via landmarking and fuzzy cell-landmark densities. However, for differential expression within cell states, the inferential procedure is standard pseudo-bulk + limma/voom and still relies on discrete subset definitions (clusters/cell types, or landmark bins) and downstream annotation for biological interpretation. The authors should clarify that "clustering-independent" primarily applies to the DA representation/testing, and strengthen evidence that landmark-level subsets provide materially different biological resolution than simply clustering a subsample.

      3) The method treats UMAP fuzzy graph weights as "connection probabilities" and uses their sums to estimate sample-level abundance around each landmark. Please clarify (i) how sensitive the density matrix is to the UMAP membership construction (e.g., choice of parameters, the per-cell membership mass constraint), and (ii) why this is an appropriate abundance surrogate in the context of sample-level inference.

      4) The landmark selection step caps landmarks at 10% of cells per sample (and an overall cap of 5,000 landmarks by default), but the rationale for these thresholds is not clearly justified. Additionally, please clarify the interaction between the per-sample cap and the global 5,000 landmark cap: when the global cap is binding, how are landmarks redistributed across samples, and does this induce unequal representation across samples with different cell yields?
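To make the quantity discussed in point 3 concrete, the following is a minimal sketch of how a landmark-by-sample density matrix could be assembled by summing fuzzy cell-landmark weights within each sample. It is not the package's implementation: rescaling each cell to unit membership mass is an assumption standing in for the per-cell mass constraint mentioned above, and the function name `landmark_density` and its input layout are hypothetical.

```python
def landmark_density(memberships, cell_samples, n_landmarks):
    """Sum fuzzy cell-landmark weights per sample.

    memberships: one dict per cell, mapping landmark index -> weight.
    cell_samples: sample identifier for each cell, in the same order.
    Returns a dict mapping sample -> list of per-landmark densities.
    """
    if len(memberships) != len(cell_samples):
        raise ValueError("one sample identifier is required per cell")
    density = {s: [0.0] * n_landmarks for s in sorted(set(cell_samples))}
    for weights, sample in zip(memberships, cell_samples):
        total = sum(weights.values())
        if total == 0:
            continue  # a cell with no landmark connection adds no mass
        for landmark, weight in weights.items():
            # Assumption: each cell is rescaled to contribute unit mass.
            density[sample][landmark] += weight / total
    return density


if __name__ == "__main__":
    cells = [{0: 0.5, 1: 0.5}, {0: 1.0}, {1: 2.0, 2: 2.0}]
    samples = ["A", "A", "B"]
    print(landmark_density(cells, samples, n_landmarks=3))
```

With unit-mass rescaling, each sample's row sums to its number of connected cells, so the sensitivity question in point 3 reduces to how the membership construction distributes that fixed mass across landmarks.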

      Significance

      This manuscript introduces a practical landmark-based framework that makes sample-level differential abundance and expression analyses feasible at atlas scale by summarizing large cell collections through a reduced set of representative cells and fuzzy cell-landmark memberships, then leveraging well-established sample-level linear modeling. The main strengths are its emphasis on scalability, a clear engineering workflow that can be applied across large studies, and a unified representation that can support multiple downstream analyses. The main limitations are that key inferential components remain conventional once subsets are defined, biological interpretation still relies on clustering/label transfer and downstream annotation, and several core design choices (e.g., treating UMAP fuzzy weights as "probabilities" and the default landmark caps) would benefit from stronger justification and sensitivity analyses to support the paper's broader claims about cluster-independence and biological resolution.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Decker et al present an interesting study of the order of events in tau seeding in a biosensor HEK293 derived cell line. This is a critical unresolved question in the field about subcellular compartment contributions to tau aggregation. This exploration of nuclear tau aggregate deposition and seeding in HEK293T tau biosensor cells uses a variety of imaging-based methods. They show that nuclear aggregates only form in cells with cytosolic aggregates; nuclear aggregates cannot form in the absence of cytoplasmic tau aggregates. The original tau seeds do not persist. Also, aggregates in the nucleus are dependent on VCP and SRRM2 for formation. The paper is limited in scope by use of only HEK293T cells and seems to overstate the generalizability of the findings to neuronal mechanisms of neurodegeneration. Please see to all tauopathies. In particular, the manuscript does not outline the overstatement of some of the conclusions.

      Key Points to address: 1. The manuscript does not detail limitations of the study in the discussion. Please address the concern that HEK293T biosensor cells are not neurons. In particular, in the clear animations, the transformation from cytoplasmic to nuclear aggregates appears to require cell division and nuclear breakdown.

              We agree with the reviewer that a limitation of this manuscript is we only used HEK293 cells.  We have added text to emphasize this point in a "Limitations of this study" section at the end of the discussion.  However, as a starting point we believe understanding the cell biology of protein aggregation even in non-neuronal cells can be of value.
      
              Moreover, we clearly see cases of nuclear tau aggregates forming without cell division and nuclear breakdown (Figure 1 and Movies).  We have added text to emphasize this point since it is relevant to the potential formation of nuclear aggregates in neurons and the reviewer must have missed this point.
      

      The introduction sets this up as Alzheimer's disease relevant but all studies are done with P301S tau which is a distinct and particularly aggressive form of tauopathy (FTLD-Tau). There is no amyloid beta component to any of these studies.

              This is a good point, and we have clarified our use of an FTLD model.  We do note that since seeds from post-mortem tissues in several different tauopathies can give nuclear tau aggregates (Sanders et al 2014), we anticipate that this process is general to multiple tauopathy contexts.
      

      The study does not address the peculiar structure of P301S aggregates, which while disease relevant are clearly distinct from AD or most forms of familial FTLD. The authors should limit the generalizability of the findings to their particular form of tauopathy unless they plan to use multiple tau fibril conformations in their studies.

              The reviewer points out that we have only used one model system, and presumably only one tau fibril structure and therefore we should be cautious about the generality of our results.  This is a valid point, and we now point out this limitation in the manuscript.
      

      The authors do not address the potential impact of fusing a natively unfolded protein like tau to a highly structured beta barrel like GFP. Please present this potential confound.

              We have added text pointing out that using GFP fusion proteins has the potential to alter tau function. We note this is an issue in the use of any fusion proteins, which have nevertheless proven useful tools.
      
      1. Inhibition of VCP can cause proteinopathies in the absence of other seeding. For instance, familial mutations in human VCP can cause either tau or TDP-43 proteinopathy depending on the specific human disease causing mutation. Thus, critical controls are missing from figure 3. For instance, the consequence of VCP inhibition on unseeded biosensor cells is a missing control. Second all panels should evaluate TDP-43 aggregation to ascertain whether or not the secondary nuclear seeding involves TDP-43.
            In this comment, the reviewer asks that we show the effects of VCP inhibition on unseeded cells.  We will add this control, and we observe no appreciable tau aggregation without tau seeding.
        
            We will also assess whether TDP-43 aggregates in the HEK293 biosensor cells with or without VCP inhibition and/or tau seeding.  However, we note that it is clear from many studies that tau aggregation can occur independently of TDP-43 aggregation.
        

      Minor concerns: A. Line 635 - In line 380, they discuss that aggregation of tau does not lead to perturbations in nuclear transport. In line 390, they discuss that aggregation of tau does not affect nuclear envelope integrity or nuclear import. However, the discussion states that aggregation alters nuclear RNA export. These statements could use clarifying that protein export is not perturbed but RNA export and import may be.

      We have clarified this point.

      B. Line 564: "This observation suggests that tau aggregation in the cytoplasm may lead to increased expression of some RNAs." This could also be that cytoplasmic tau alters RNA export. These experiments don't differentiate between these options.

      This comment is related to other comments about the relative abundance of specific RNAs in the nucleus or cytoplasm. We will add new data to the manuscript where we examine the numbers of specific RNAs in cells with and without nuclear or cytoplasmic tau aggregates. This will allow us to determine if there is simply a retention of RNAs in the nucleus or if, in some cases, there is also an increase in RNA levels.

      1. In Figure 1, the authors show large aggregates overlapping the nucleus. It is unclear whether these aggregates have a portion both within and outside the nucleus or if they are deforming the nucleus and are wholly external to the nuclear compartment. Clarity on this issue is important. If the nucleus is deformed, the observed aggregates seem reminiscent of aggresome formation. Please clarify.

      We assume the reviewer asks us to clarify why the large cytoplasmic tau aggregates are localized near the nucleus. Indeed, we suspect these are accumulating in aggresomes over time and have added this point to the text. Importantly, we do not observe a general defect in the integrity of the nucleus, suggesting that even though these assemblies are close to the nucleus, they are not altering the nuclear envelope. We have added text to explain this issue.

      Reviewer #1 (Significance (Required)):

      Decker et al present an interesting study of the order of events in tau seeding in a biosensor HEK293 derived cell line. This is a critical unresolved question in the field about subcellular compartment contributions to tau aggregation. This exploration of nuclear tau aggregate deposition and seeding in HEK293T tau biosensor cells uses a variety of imaging-based methods. They show that nuclear aggregates only form in cells with cytosolic aggregates; nuclear aggregates cannot form in the absence of cytoplasmic tau aggregates.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Decker et al. examines the formation of nuclear tau aggregates and their functional consequences using a HEK293T tau biosensor system. The authors propose that nuclear tau aggregates arise through VCP dependent secondary seeding from cytoplasmic aggregates and that these nuclear aggregates impair RNA export. The study addresses an important and relatively unexplored aspect of tau biology. However, several conclusions extend beyond what the data directly support, and several essential controls are missing.

      Major Comments

      • The introduction is generally clear and well organized. However, it would benefit from additional mechanistic context explaining how polyserine domains might promote tau aggregation and why this feature is biologically relevant.

              We have added text expanding what is known about how polyserine domains can increase tau fibrillization.
      
      • The live cell imaging convincingly demonstrates the temporal sequence of cytoplasmic followed by nuclear aggregation. However, the manuscript lacks controls assessing whether nuclear aggregation correlates with higher tau expression levels. Such controls are necessary to exclude expression driven artifacts.

            We will add an analysis of the relationship between tau expression levels and cells with nuclear tau aggregates.  We observed that tau aggregates were independent of the tau expression levels, ruling out that nuclear tau aggregates are solely an artifact of extremely high tau expression levels.
        
      • The authors conclude that nuclear envelope integrity is preserved, but only import assays were performed. To validate the sensitivity and specificity of the assay, export assays or positive controls for nuclear transport disruption are required.

            We had already shown that in cells with nuclear tau aggregates the nuclear export of mRNAs is perturbed. We will add additional analyses of whether nuclear export of proteins is altered.
        
      • The Cy3/Cy5 seed experiments support the claim that exogenous seeds do not enter the nucleus. However, the conclusion that VCP generates secondary seeds is overstated. For example, the manuscript states: "VCP is responsible for the formation of secondary seeds..." (lines 418-439), yet the data demonstrate correlation rather than direct evidence of seed generation.

            This is a valid point. We have rephrased the manuscript to note that VCP is required for nuclear tau aggregation, possibly through the formation of secondary tau seeds, which is consistent with earlier work suggesting VCP can generate new tau seeds (Saha et al., 2023, Nature Communications; Batra et al., 2025, Molecular Neurodegeneration).
        

      To substantiate this conclusion, the authors should: directly quantify seed abundance. The current interpretation assumes uniform cytoplasmic uptake of seeds but does not measure it; Include controls addressing VCP inhibitor specificity, as these compounds have pleiotropic effects (e.g., ER stress, proteostasis collapse). No data is provided on whether VCP inhibition alters tau ubiquitination, which could have major implications on tau aggregation.

                    This comment addresses the issue of whether VCP can generate new seeds from tau fibers.  This is a conclusion already reached by prior work (Saha et al., 2023, Nature Communications; Batra et al., 2025, Molecular Neurodegeneration).  The point of our manuscript that this comment addresses is whether the nuclear aggregates are forming from a secondary seeding event, for which we have already provided several lines of evidence.  First, we have shown that nuclear aggregates only form after the formation of a prior cytoplasmic tau aggregate (Figure 1). Second, we have shown that nuclear aggregates do not contain exogenous seeds, while all cytoplasmic tau aggregates do (Figure 2).  Finally, we have shown that nuclear tau aggregates are dependent on VCP, which is consistent with the prior work showing VCP can generate tau seeds.  It is beyond the scope of this manuscript to determine in more detail how VCP affects tau aggregates generally.  For this reason, and since we have robustly demonstrated our conclusion, we have chosen not to pursue these additional suggested experiments.
      
      • The authors observed an increase in nuclear poly(A)+ RNA and specific transcripts. However, the current data do not distinguish between several possible mechanisms that may account for this increase, including impaired export, increased transcription, enhanced RNA stability, or nuclear retention due to speckle reorganization.

            To address this comment, we will quantify the levels of individual RNAs in the nucleus, cytoplasm and whole cell. This will allow us to determine whether there is an increase in RNA levels (possibly due to increased transcription or reduced decay), or whether the increased nuclear RNA levels are due to a block in mRNA export. We will also assess transcription rate by measuring the intensity of the transcription loci, which will allow us to distinguish whether any changes in mRNA levels are due to transcription or to changes in RNA decay.
        
      • The discussion occasionally overinterprets the data. Several statements should be reframed as hypotheses rather than conclusions:

      • "VCP can generate tau seeds capable of additional seeding within a cell." (lines 572-594) This has not been directly demonstrated and should be softened accordingly.

      We have done so.

      • Active import via SRRM2 is proposed, but no supporting data are presented. This should be clearly framed as a speculative model.

      We have done so.

      • "Tau aggregates in the nucleus alter the function of nuclear speckles..." (lines 616-637). While plausible, this is not directly shown. Alternative explanations such as transcriptional upregulation or stress induced changes should be acknowledged.

      We have altered this text to be more accurate.

      • The statement "It is possible that such nuclear aggregates could alter nuclear RNA export and contribute to pathology." (lines 637-655) is reasonable, but the authors should emphasize that nuclear tau aggregates are not consistently observed across tauopathies and that the HEK293T biosensor system may not fully recapitulate neuronal biology.

      We agree with this point and have rephrased the text accordingly.

      Reviewer #2 (Significance (Required)):

      The manuscript by Decker et al. examines the formation of nuclear tau aggregates and their functional consequences using a HEK293T tau biosensor system. The authors propose that nuclear tau aggregates arise through VCP dependent secondary seeding from cytoplasmic aggregates and that these nuclear aggregates impair RNA export. The study addresses an important and relatively unexplored aspect of tau biology. However, several conclusions extend beyond what the data directly supports, and several essential controls are missing.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In their manuscript, Decker et al., investigate the relationship between cytoplasmic and nuclear tau aggregation using a HEK293T biosensor system and propose a multistep model in which cytoplasmic aggregates give rise to nuclear tau aggregates, potentially via VCP-dependent secondary seed generation and involvement of nuclear speckle components. The study further explores functional consequences of nuclear tau aggregation on RNA metabolism. Overall, the work is interesting and potentially impactful. The combination of live-cell imaging, seed-labeling strategies, genetic perturbations (SRRM2/PNN), and RNA imaging represents a thoughtful experimental approach. However, I have some minor concerns and feel the authors should address these -

      1. Poly(A)+ FISH intensity is not a direct measure of export efficiency. The authors claim that tau aggregation within nuclear speckles interferes with nuclear export of RNA. It is highly possible that increased nuclear RNA levels observed could reflect altered transcription, stability, or stress responses rather than export defects alone. In the case of ATF3, a known stress responsive gene, increased nuclear signal could reflect transcriptional activation, not export defects. To prove that export is defective, the authors should at least measure total RNA levels (qPCR) in nuclear vs cytoplasmic fraction.

              To address this issue, we will quantify the levels of specific RNAs in the nucleus and cytoplasm by smFISH, which will allow us to clarify why there are more RNAs associated with nuclear speckles in the context of nuclear tau aggregates.
      

      Though the authors have shown the proposed role of VCP in generating secondary seeds by using inhibitors, the authors should show genetic validation by using dominant-negative VCP.

              This experiment essentially asks us to examine the role of VCP in nuclear tau aggregation by an additional method. We will add experiments examining how nuclear tau aggregates form when VCP is knocked down by siRNAs. We have chosen not to use dominant-negative VCP mutants, since their phenotype would be complicated by endogenous VCP possibly remaining functional.
      

      **Referees cross-commenting**

      *This session contains comments from different reviewers*

      Reviewer 3

      I agree with the reviewers that additional controls and experiments would strengthen the VCP inhibition studies. However, I would like to clarify that the specific concern raised by Reviewer 1 (Key point number 4) regarding fusion of tau to GFP does not apply to this manuscript. In this study, the authors use tau conjugated to Cy3, a well established approach in the field that adds only approximately 1 kDa to the protein.

      Reviewer 1

      Apologies Reviewer 3, but I respectfully disagree. Please look again at the legends for Figs 1 through 5. All clearly delineate the use of tau biosensor cells using a YFP rather than GFP fusion protein with tau. I do agree we should correct my review to state YFP rather than GFP, but structurally the concern remains the same. Cy3 labelling, I believe, is used to track the relatively short-lived exogenous seeds.

      Reviewer #3 (Significance (Required)):

      The integration of approaches presented here, especially in connecting tau aggregation with nuclear speckle biology and RNA processing, will be of broad interest and offers important new mechanistic insights into tau pathology. I am an expert in Alzheimer's disease and integrated stress response.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      In their manuscript, Decker et al., investigate the relationship between cytoplasmic and nuclear tau aggregation using a HEK293T biosensor system and propose a multistep model in which cytoplasmic aggregates give rise to nuclear tau aggregates, potentially via VCP-dependent secondary seed generation and involvement of nuclear speckle components. The study further explores functional consequences of nuclear tau aggregation on RNA metabolism. Overall, the work is interesting and potentially impactful. The combination of live-cell imaging, seed-labeling strategies, genetic perturbations (SRRM2/PNN), and RNA imaging represents a thoughtful experimental approach. However, I have some minor concerns and feel the authors should address these -

      1. Poly(A)+ FISH intensity is not a direct measure of export efficiency. The authors claim that tau aggregation within nuclear speckles interferes with nuclear export of RNA. It is highly possible that increased nuclear RNA levels observed could reflect altered transcription, stability, or stress responses rather than export defects alone. In the case of ATF3, a known stress responsive gene, increased nuclear signal could reflect transcriptional activation, not export defects. To prove that export is defective, the authors should at least measure total RNA levels (qPCR) in nuclear vs cytoplasmic fraction.
      2. Though the authors have shown the proposed role of VCP in generating secondary seeds by using inhibitors, the authors should show genetic validation by using dominant-negative VCP.

      Referees cross-commenting

      This session contains comments from different reviewers

      Reviewer 3

      I agree with the reviewers that additional controls and experiments would strengthen the VCP inhibition studies. However, I would like to clarify that the specific concern raised by Reviewer 1 (Key point number 4) regarding fusion of tau to GFP does not apply to this manuscript. In this study, the authors use tau conjugated to Cy3, a well established approach in the field that adds only approximately 1 kDa to the protein.

      Reviewer 1

      Apologies Reviewer 3, but I respectfully disagree. Please look again at the legends for Figs 1 through 5. All clearly delineate the use of tau biosensor cells using a YFP rather than GFP fusion protein with tau. I do agree we should correct my review to state YFP rather than GFP, but structurally the concern remains the same. Cy3 labelling, I believe, is used to track the relatively short-lived exogenous seeds.

      Significance

      The integration of approaches presented here, especially in connecting tau aggregation with nuclear speckle biology and RNA processing, will be of broad interest and offers important new mechanistic insights into tau pathology. I am an expert in Alzheimer's disease and integrated stress response.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      The manuscript by Decker et al. examines the formation of nuclear tau aggregates and their functional consequences using a HEK293T tau biosensor system. The authors propose that nuclear tau aggregates arise through VCP dependent secondary seeding from cytoplasmic aggregates and that these nuclear aggregates impair RNA export. The study addresses an important and relatively unexplored aspect of tau biology. However, several conclusions extend beyond what the data directly supports, and several essential controls are missing.

      Major Comments

      • The introduction is generally clear and well organized. However, it would benefit from additional mechanistic context explaining how polyserine domains might promote tau aggregation and why this feature is biologically relevant.
      • The live cell imaging convincingly demonstrates the temporal sequence of cytoplasmic followed by nuclear aggregation. However, the manuscript lacks controls assessing whether nuclear aggregation correlates with higher tau expression levels. Such controls are necessary to exclude expression driven artifacts.
      • The authors conclude that nuclear envelope integrity is preserved, but only import assays were performed. To validate the sensitivity and specificity of the assay, export assays or positive controls for nuclear transport disruption are required.
      • The Cy3/Cy5 seed experiments support the claim that exogenous seeds do not enter the nucleus. However, the conclusion that VCP generates secondary seeds is overstated. For example, the manuscript states: "VCP is responsible for the formation of secondary seeds..." (lines 418-439), yet the data demonstrate correlation rather than direct evidence of seed generation. To substantiate this conclusion, the authors should: directly quantify seed abundance. The current interpretation assumes uniform cytoplasmic uptake of seeds but does not measure it; Include controls addressing VCP inhibitor specificity, as these compounds have pleiotropic effects (e.g., ER stress, proteostasis collapse). No data is provided on whether VCP inhibition alters tau ubiquitination, which could have major implications on tau aggregation.
      • The authors observed an increase in nuclear poly(A)+ RNA and specific transcripts. However, the current data do not distinguish between several possible mechanisms that may account for this increase, including impaired export, increased transcription, enhanced RNA stability, or nuclear retention due to speckle reorganization.
      • The discussion occasionally overinterprets the data. Several statements should be reframed as hypotheses rather than conclusions:

      • "VCP can generate tau seeds capable of additional seeding within a cell." (lines 572-594) This has not been directly demonstrated and should be softened accordingly.

      • Active import via SRRM2 is proposed, but no supporting data are presented. This should be clearly framed as a speculative model.
      • "Tau aggregates in the nucleus alter the function of nuclear speckles..." (lines 616-637). While plausible, this is not directly shown. Alternative explanations such as transcriptional upregulation or stress induced changes should be acknowledged.
      • The statement "It is possible that such nuclear aggregates could alter nuclear RNA export and contribute to pathology." (lines 637-655) is reasonable, but the authors should emphasize that nuclear tau aggregates are not consistently observed across tauopathies and that the HEK293T biosensor system may not fully recapitulate neuronal biology.

      Significance

      The manuscript by Decker et al. examines the formation of nuclear tau aggregates and their functional consequences using a HEK293T tau biosensor system. The authors propose that nuclear tau aggregates arise through VCP dependent secondary seeding from cytoplasmic aggregates and that these nuclear aggregates impair RNA export. The study addresses an important and relatively unexplored aspect of tau biology. However, several conclusions extend beyond what the data directly supports, and several essential controls are missing.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Decker et al. present an interesting study of the order of events in tau seeding in a biosensor HEK293-derived cell line. This is a critical unresolved question in the field about subcellular compartment contributions to tau aggregation. This exploration of nuclear tau aggregate deposition and seeding in HEK293T tau biosensor cells uses a variety of imaging-based methods. They show that nuclear aggregates only form in cells with cytosolic aggregates; nuclear aggregates cannot form in the absence of cytoplasmic tau aggregates. The original tau seeds do not persist. Also, aggregates in the nucleus are dependent on VCP and SRRM2 for formation. The paper is limited in scope by use of only HEK293T cells and seems to overstate the generalizability of the findings to neuronal mechanisms of neurodegeneration. Please see to all tauopathies. In particular, the manuscript does not outline the overstatement of some of the conclusions.

      Key Points to address:

      1. The manuscript does not detail limitations of the study in the discussion. Please address the concern that HEK293T biosensor cells are not neurons. In particular, in the clear animations, the transition from cytoplasmic to nuclear aggregates appears to require cell division and nuclear breakdown.
      2. The introduction sets this up as Alzheimer's disease relevant, but all studies are done with P301S tau, which is a distinct and particularly aggressive form of tauopathy (FTLD-Tau). There is no amyloid beta component to any of these studies.
      3. The study does not address the peculiar structure of P301S aggregates, which while disease relevant are clearly distinct from AD or most forms of familial FTLD. The authors should limit the generalizability of the findings to their particular form of tauopathy unless they plan to use multiple tau fibril conformations in their studies.
      4. The authors do not address the potential impact of fusing a natively unfolded protein like tau to a highly structured beta barrel like GFP. Please present this potential confound.
      5. Inhibition of VCP can cause proteinopathies in the absence of other seeding. For instance, familial mutations in human VCP can cause either tau or TDP-43 proteinopathy depending on the specific human disease causing mutation. Thus, critical controls are missing from figure 3. For instance, the consequence of VCP inhibition on unseeded biosensor cells is a missing control. Second all panels should evaluate TDP-43 aggregation to ascertain whether or not the secondary nuclear seeding involves TDP-43.

      Minor concerns:

      A. Line 635 - In line 380, they discuss that aggregation of tau does not lead to perturbations in nuclear transport. In line 390, they discuss that aggregation of tau does not affect nuclear envelope integrity or nuclear import. However, the discussion states that aggregation alters nuclear RNA export. These statements would benefit from clarifying that protein export is not perturbed but RNA export and import may be.

      B. Line 564: "This observation suggests that tau aggregation in the cytoplasm may lead to increased expression of some RNAs." This could also be that cytoplasmic tau alters RNA export. These experiments don't differentiate between these options.

      C. In Figure 1, the authors show large aggregates overlapping the nucleus. It is unclear whether these aggregates have a portion both within and outside the nucleus or if they are deforming the nucleus and are wholly external to the nuclear compartment. Clarity on this issue is important. If the nucleus is deformed, the observed aggregates seem reminiscent of aggresome formation. Please clarify.

      Significance

      Decker et al. present an interesting study of the order of events in tau seeding in a biosensor HEK293-derived cell line. This is a critical unresolved question in the field about subcellular compartment contributions to tau aggregation. This exploration of nuclear tau aggregate deposition and seeding in HEK293T tau biosensor cells uses a variety of imaging-based methods. They show that nuclear aggregates only form in cells with cytosolic aggregates; nuclear aggregates cannot form in the absence of cytoplasmic tau aggregates.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Response to Reviewers

      We would like to thank the reviewers for their constructive comments on our manuscript. We have addressed all comments made by the reviewers with additional experimental data, data analyses, and text edits. A detailed point-by-point response to the reviewers is documented below.

      Summary of new/amended data panels

      Fig 2C (Rev 2): Cell-by-cell quantification of the GFP fluorescence intensity as a surrogate measure of wild-type (WT) vs mutant Pfn1 rescue construct expression levels in B16F1 KO-rescue studies.

      Figs 1B, 2A, 3C, 4A, 4C (Rev 1, 3): Inclusion of zoomed images of PIP2 staining of select regions of interest.

      Figs 6B, 6D (Rev 2): Quantification of phospho-PKC substrate antibody immunoblots of MDA-231 and B16F1 cells with or without Pfn1 KO.

      Fig 3E (not requested by the reviewers): Time-lapse images of PIP2 biosensor and F-actin in HEK-293 cells.

      Fig 3H (Rev 3): Half-life comparison of LatB-induced PIP2 and F-actin responses.

      Fig S1 (Rev 1): F-actin and PIP2 staining of MDA-231 cells with or without treatments of myosin inhibitor blebbistatin.

      Figs 6G-I (Rev 2, 3): Quantification of various parameters from Ca2+ imaging studies.

      Figs 6J-M (Rev 2): Images and quantification of correlative PIP2 and DAG biosensor studies in HEK-293 cells.

      Fig 7 (not requested by the reviewers): A schematic model of how Pfn1 loss leads to PIP2 reduction in cells.

      Fig S2 (not requested by the reviewers): Effect of Pfn1 knockdown on PI4P in HEK-293 cells.

      Fig S3B (Rev 2): A list of top 100 (50 up, 50 down) differentially expressed genes in response to Pfn1 KO in MDA-231 cells.

      Point-by-Point response

      REVIEWER 1

      1. "The quantifications of the PIP2 levels were apparently done simply by measuring the fluorescence intensities of wild-type and knockout cells stained with monoclonal actin-PIP2 antibody. However, the knockout cells appear more spread compared to the wild-type cells (Fig. 1B), and this can possibly affect the quantifications (e.g. there may be more plasma membrane ruffles/folds in the wild-type cells). Thus, I recommend that in all critical quantifications the authors would also use a general plasma membrane marker to confirm that the PIP2-density (and not just morphology of the plasma membrane) is indeed affected by Pfn1-depletion".

      Response: For PM PIP2 analysis, we specifically quantified the total rather than the average PM PIP2 staining intensity (as also previously done in other studies - Hammond et al. J. Cell Science 2006; Biochem. J 2009) for three reasons. First, PIP2 is non-uniformly distributed across the PM, and therefore the average intensity calculation collapses a lot of biologically meaningful spatial information. Second, the average intensity calculation is impacted by significant cell shape and area differences that exist between cells within a group as well as between groups. Third, the integrated PM intensity is a better metric of how much total PIP2 is available for metabolic turnover on a cell-by-cell basis. These justifications are now detailed in the revised manuscript.

      In our previous study (Ricci et al., J. Biol. Chem 2024, PMID 38141770), we utilized orthogonal techniques (immunostaining, lipid dot blot) in multiple cell lines to demonstrate that total PIP2 as well as PIP2 intensity at the plasma membrane (PM) (based on manual tracing of hundreds of cells in immunostaining experiments) are reduced by silencing Pfn1 expression and, conversely, elevated upon Pfn1 overexpression. We would like to clarify here that in our present study we used an automated pipeline in CellProfiler to detect cell edges and quantify integrated PM intensity of PIP2 in control vs Pfn1 knockout (KO) cells, and our present findings in the Pfn1 KO setting recapitulated our previous findings in the transient knockdown setting. While our CellProfiler pipeline accurately detects the cell edges, we address the reviewer's comment on confirmation of findings with a PM marker by providing new experimental data in HEK-293 cells transfected with fluorescence biosensors of PIP2 and DAG along with a PM marker (iRFP-Lyn11), which also show reduction of PIP2 fluorescence at the Lyn11-positive PM regions in Pfn1 knockdown cells relative to control cells (see new data panels Figs 6J, L).

      "To get a better idea about which cellular actin filament structures are important for regulating the PIP2-levels at the plasma membrane, one could also use a larger repertoire of actin/myosin inhibitors (CK666, cytochalasin-B, blebbistatin). By using these compounds, one may e.g. uncover if the Arp2/3-nucleated branched actin networks and/or contractile actomyosin structures would specifically contribute to regulation of the plasma membrane PIP2 levels".

      Response: We thank the reviewer for this suggestion. We have now evaluated the effect of blebbistatin treatment on PIP2 in MDA-231 cells (now shown in supplementary Fig S1). A previous study showed that the major effects of blebbistatin on the actin cytoskeleton are disintegration of actin stress fibers, softening of cortical actin, and transformation of lamellipodial actin into a loose network of accumulated amorphous actin structures that correspond to membrane ruffles (Shutova et al., 2012). These phenotypes were also recapitulated in our experimental settings. In general, blebbistatin-treated cells exhibited protrusive structures in random directions with PIP2 enrichment in peripheral F-actin-rich regions (consistent with the LatB experimental data) and higher (p=0.09) overall cell edge PIP2 staining vs vehicle-treated cells, further underscoring the impact of actin cytoskeletal perturbation on PM PIP2.

      "The effects of PLCb3 silencing on Pfn1-dependent changes in the PIP2 levels are interesting. To gain better insight into the underlying mechanism, one could also check if the levels of active (phosphorylated) PLCb3 are affected upon Pfn1-depletion".

      Response: We would like to point out that, unlike PLCg, PLCb is not activated by phosphorylation. While the literature has documented certain site-specific phosphorylations of PLCb by PKC (in a feedback manner) and PKA, these phosphorylation events, if anything, have an inhibitory effect on PLCb activity. Since our data support the model that Pfn1 loss leads to an increase in PLC-mediated PIP2 hydrolysis and downstream PKC activation, we feel that probing for such inhibitory feedback phosphorylation events will not provide any mechanistic insights.

      "In the 'Discussion', the authors speculate that Pfn1 H119E mutant may have more frequent interactions with PIP2 as compared to wild-type Pfn1. This does not make much sense, because Pfn1 binding to PIP2 is very weak (e.g. ref. 28), and it is unlikely that introducing a negatively charged glutamate would increase its affinity to negatively charged headgroup of PIP2. Thus, it seems unlikely that Pfn1 would affect the PIP2 content of plasma membrane through direct interactions with PIP2".

      Response: We did not mean to imply that glutamate substitution of the H119 residue would necessarily increase Pfn1's intrinsic affinity to negatively charged PIP2. While PIP2 binding of WT vs H119E-Pfn1 has never been quantified in biochemical assays, we previously (Bae et al. PNAS 2010; PMID 21115820) showed that the H119E substitution does not affect the membrane fraction of ectopically overexpressed Pfn1 in cells. Along this line, Pascal-Goldschmit and colleagues (PMID: 7673143) also showed that the analogous mutant H119D-Pfn1 inhibits PLCg-mediated PIP2 hydrolysis as efficiently as WT-Pfn1, further underscoring the fact that H119D/E-Pfn1 is not defective in membrane phosphoinositide binding. Our data largely support a model in which Pfn1-dependent PIP2 alteration is predominantly related to its actin-regulatory function. However, since Pfn1's binding to actin and PIP2 are mutually exclusive, we cannot absolutely rule out a minor (possibly insignificant) contribution of Pfn1's ability to block PIP2 hydrolysis by direct PM interaction. We therefore offered a hypothetical scenario where the H119E-Pfn1 mutant may have more frequent interaction with PM PIP2 simply because it is not able to interact with actin. We have now better clarified this argument in the "Discussion" section of the revision.

      "The cell images in Fig. 2A are a bit difficult to follow due to the large number of cells in the images. One could perhaps show higher resolution images with few knockout and rescue cells in the same field of view and indicate the rescued cells in these images e.g. with arrows".

      Response: As requested by the reviewer, we have now shown zoomed images in Fig 2A in the revision.

      "Please clearly describe in each figure legend what the error bars represent"

      Response: We have now clearly mentioned in the Statistics section of "Materials and Methods" that all error bars represent standard deviation unless explicitly mentioned otherwise.



      REVIEWER 2

      1. "The data show that actin binding-deficient mutants of Pfn1 do not rescue the knockdown. In these experiments, it is critical to quantitate the relative expression levels of the mutants. The model that Pfn1 regulation of PIP2 requires interactions with actin is not really clear - is it due to Pfn1 targeting by actin binding, or Pfn1 regulation of actin itself? Either possibility seems possible, and the experiments do not distinguish them".

      Response: We thank the reviewer for these comments. First, since GFP and Pfn1 rescue constructs are linked by an IRES, we analyzed GFP fluorescence intensity of cells selected for PIP2 analyses as a surrogate measure for comparing the relative expressions of Pfn1 rescue constructs across the various groups. As per these analyses (based on measurements of hundreds of cells from 3 different experiments), the average GFP expression of cells chosen for PIP2 analyses was found to be comparable between the various Pfn1 KO rescue groups (now shown in Fig 2C). Therefore, we argue that our observed phenotypic differences related to PIP2 are not confounded by the expressions of various Pfn1 rescue constructs.

      Second, it is known that Pfn1 loss leads to pronounced reduction in lamellipodial F-actin content (as shown in Figs 3A-B). Our LatB experimental data (Figs 3E-G) show that actin depolymerization leads to pronounced PM PIP2 reduction within minutes. Based on these findings, taken together with additional evidence for increased basal PLC activity signature readouts in Pfn1-deficient cells (i.e. greater baseline PKC activity and a greater PM DAG/PIP2 ratio from biosensor studies, as recommended by the reviewer; new data shown in Figs 6J-M), we postulate (concurring with Reviewer 3) that disruption of the cortical cytoskeleton (possibly also accompanied by removal of PIP2-binding adaptor proteins) may enhance PIP2's accessibility to hydrolytic enzymes. In fact, two previous studies (Cho et al., PNAS, 2005 and Andrade et al., Scientific Reports 2015) have demonstrated that actin filament disruption increases PM mobility of PIP2. There is also evidence for actin depolymerization-induced uncaging of PLC from the cortical actin network (Huang et al, Planta, 2009). Therefore, in principle, Pfn1 loss may cause more frequent PLC-PIP2 interaction and enhance baseline PIP2 hydrolysis by increasing PM diffusion of PIP2 and/or uncaging of PLC. We have now included a schematic working model (Fig 7) to illustrate this concept and added these points in the discussion. However, a direct demonstration of increased PIP2 accessibility to PLC in Pfn1-deficient cells is beyond the scope of the present study; this is something we will pursue in the future.

      "The knockdown data on PLCbeta is convincing with regard to its role in PIP2 reductions, but the paper does not explain how actin-Pfn1 interactions regulate PLCbeta".

      Response: Please see our detailed response to the previous comment that specifically addresses how we envision Pfn1 negatively regulates PLC-mediated PIP2 hydrolysis via modulating actin cytoskeleton.

      "The transcriptome data must be provided along with the data in Figure 5 - otherwise it is impossible for the reader to evaluate. The fact that the data is being used in another paper is not an adequate reason for its omission".

      Response: The transcriptomic data is now displayed in Supplementary Figure S3, where we have now listed the top 100 (50 up, 50 down) differentially expressed genes in response to Pfn1 KO in MDA-231 cells (see panel B in Fig S3). We are in the process of submitting the FASTA file to the GEO database.

      "The PKC substrate data is not convincing. The blots are messy, and there is no quantitation".

      Response: Since the phospho-PKC substrate antibody is supposed to recognize all proteins phosphorylated by PKC, we expect to see multiple bands. The intensity of each lane in its entirety approximates PKC activity, since the antibody detects proteins at multiple molecular weights phosphorylated at their serine residues. We have replaced the B16-generated data with a better-quality blot and added quantifications with statistical analysis (Figs 6B, D).

      "The calcium data should include statistical analysis of the differences".

      Response: We have now performed statistical analyses of the calcium data. Specifically, we compared the peak amplitude, integrated Ca2+ signal (area under the curve), and the post-stimulation resting value between control and Pfn1 knockdown groups. As per these analyses, we did not see any significant difference in either the peak amplitude or integrated Ca2+ signal between the control and Pfn1 knockdown groups, further underscoring the fact that Pfn1 loss does not necessarily confer cells an increased ability to respond to agonists (i.e. LPA-induced GPCR activation in this specific case). However, we noted that the post-stimulation resting Ca2+ signal was elevated in Pfn1-deficient cells relative to control cells (p2 hydrolysis and/or reduced re-uptake of cytosolic Ca2+ by endoplasmic reticulum and/or reduced efficiency of Ca2+ export. These analyses are now included in Figs 6G-I in the revision.

      "The discussion of DAG and PA levels is problematic. As the authors are aware, whole cell lipidomics can easily miss small changes in specific compartments. If the authors think that lipid sensor analysis of PM DAG and PA would strengthen the analysis, then this should be included. The large change in PC levels does seem to suggest an alternative source of PA. While the authors present arguments against a role for PLD, this could be directly tested. In any case, the finding of a nearly 100-fold greater change in PC than in PA raises questions about what the whole cell PA measurement is really detecting".

      Response: We thank the reviewer for these comments and experimental suggestions. First, we completely agree with the reviewer that whole cell lipidomic analyses fail to detect small changes in specific compartments; we mention this point in the revision. In the revision, we have displayed our lipids of interest as individual line plots connecting control and Pfn1 KO group experiment-by-experiment to show the trend of lipid change in each experiment. As per these analyses, in 4 out of 5 experiments, the total DAG increased in Pfn1 KO cells. However, the large experiment-to-experiment variability in the absolute content as well as Pfn1-dependent changes in DAG precluded us from achieving statistical significance between the two groups. The large variability in the measured DAG content in our experiments is not totally surprising since cellular DAG level is known to fluctuate with growth and/or be impacted by unintended changes in the chemical parameters of culture condition. However, the largest pool of DAG is in ER/Golgi, and since whole cell lipidomic measurements fail to reveal the PM DAG produced by PIP2 hydrolysis, as per reviewer's recommendation, we now include lipid biosensor experimental data (Fig 6J-M) of control vs Pfn1 knockdown HEK-293 cells to demonstrate that PM DAG-to-PIP2 ratio (an indicator of the basal PIP2 hydrolysis efficiency) is increased upon Pfn1 depletion. We believe that these new correlative PIP2/DAG biosensor data further strengthen our conclusion.

      Regarding the reviewer's comment on the orders of change in PC vs PA, we clearly mentioned in the original discussion that it is highly unlikely that PA increase in Pfn1-deficient cells is reflective of increased PLD-mediated conversion of PC for two reasons. First, we saw disproportionate orders of magnitude of changes in the content of PA (~3000 pmol/mg increase) vs PC (>200,000 pmol/mg decrease) in response to Pfn1 KO in MDA-231 cells. Second and more importantly, since monomeric actin directly binds to and inhibits the activity of PLD, the expected increased G-to-F-actin ratio in Pfn1-deficient cells, if at all, would likely result in diminished PLD activity reducing PLD-mediated conversion of PC to PA.

      In our opinion, since DAG is the direct hydrolysis product of PIP2 and we are now able to demonstrate elevated PM DAG-to-PIP2 ratio in Pfn1-deficient cells in biosensor experiments, PA biosensor studies are not necessary.

      REVIEWER #3

      "General: Scale bar labels are too small, please also provide time-stamps for time course measurements"

      Response: These concerns have been addressed in the revision.

      "As with every antibody stain, there is a remaining risk that a change in the cellular context affects an off-target of the antibody (e.g., a protein phosphorylation site). I think that this is not particularly likely, but I'd control for it, which can be done in a straightforward manner: The authors could do a strong-detergent treatment to rule out a potential off-target effect of the antibody (e.g., 0.1% Triton X-100, 1 h). This should remove all (non-amino-) lipids from the sample, including the phosphoinositides. Overall, binding of the antibody should be strongly reduced, fluorescence images should be much dimmer & the effect of the Pfn1 KO should mostly disappear."

      Response: The PIP2 antibody used in the present study is a well-vetted and widely used antibody in literature. Notably, two papers published by Dr. Hammond (one of the co-authors), an expert in phosphoinositide signaling, previously showed selectivity of this antibody by blocking with lipids, neomycin, and PH-domain of PIP2-binding proteins (Hammond et al, J. Cell Sci, 2006; Biochem J. 2009). We cite these papers in the revision.

      "Figure 1: Please show images in a larger zoom, cell details are barely visible (same for Figure 3). I also would not use "PM PIP2 levels" in the legend, as nuclei appear visibly lighter, indicating that some PIP2 is likely present in other membranes. The type of PIP2 staining should be specified in either the Figure itself or in the legend."

      Response: We would like to clarify here that we used an automated pipeline in CellProfiler to detect cell edges and quantify integrated PM intensity of PIP2 in control vs Pfn1 knockout (KO) cells, so signal from the nuclear membrane is not included in the analyses. We have zoomed PIP2 images in Figure 1 as the reviewer suggested. These changes are incorporated in the revision.

      "Figure 3: Same comment as for Figure 1, zoomed images would really help, especially for the PM/Cytosol distribution of the PIP2 biosensor"

      Response: Zoomed images of Fig 3 have been provided in the revision.

      "The lag time in the dissociation of the PIP2 sensor is interesting, as is the fact that the kinetic of PIP2 biosensor release is (visually) slower. I recommend to do a couple of simple fits to quantify these effects. If my impression holds, this would be a strong support of the author's interpretation that actin depolymerization actually leads to a loss of PM PIP2 - a simple binding/unbinding kinetic would be much closer to the actin depolymerization kinetic".

      Response: As suggested by the reviewer, we have done curve fitting of these data to calculate the half-life of F-actin and PIP2 (results shown in Fig 3H). As per these calculations, the mean half-life of PIP2 (~ 1min) is significantly longer than that of F-actin (~2.2 min) which further supports our interpretation that actin depolymerization leads to a loss of PM PIP2.
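
      The response does not state which model or software was used for the curve fitting. As a minimal sketch, assuming a single-exponential decay and hypothetical, noise-free traces (the function name `fit_half_life` and the variables `fast` and `slow` are illustrative, and none of the values are the authors' data), a half-life could be estimated from a log-linear least-squares fit:

```python
import math


def fit_half_life(times, signal):
    """Fit signal = A * exp(-k * t) by least squares on ln(signal).

    Returns the half-life ln(2) / k in the same units as `times`.
    Assumes every signal value is positive and the trace decays (k > 0).
    """
    logs = [math.log(s) for s in signal]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    k = -(sxy / sxx)  # the slope of ln(signal) versus time equals -k
    return math.log(2) / k


# Hypothetical traces (time in minutes, signal in arbitrary units),
# generated with known half-lives of 1.5 min and 3.0 min.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
fast = [math.exp(-math.log(2) / 1.5 * t) for t in times]
slow = [math.exp(-math.log(2) / 3.0 * t) for t in times]

print(round(fit_half_life(times, fast), 2))  # 1.5
print(round(fit_half_life(times, slow), 2))  # 3.0
```

      With real traces, the fit would be run per cell and the resulting half-lives compared between the F-actin and PIP2 readouts with an appropriate statistical test; a lag phase before decay, as the reviewer noted for the PIP2 sensor, would need either a delayed-onset model or a fit restricted to the decaying portion.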

      "Figure 4: Same comment as for Figures 1 and 3, zoomed images would be most helpful."

      Response: Zoomed images have been provided in the revision.

      "Figure 5G: It looks like the two conditions were internally normalized. Given that we're looking at differential levels of PIP2/IP3/DAG, I think it is very possible that baseline Ca levels are also different. I'd either report in au or do a global normalization which would also capture any difference between the two conditions. This should also clarify whether there are differences in post-stimulus steady state Ca levels, as it currently looks like".

      Response: Since we used a transfectable Ca2+ biosensor (GCaMP), to account for cell-to-cell variation in the actual expression of the biosensor, we had to baseline-correct GCaMP fluorescence by normalizing each kinetic datapoint readout to the average pre-stimulation value on a cell-by-cell basis. However, we have now performed additional analyses. Specifically, we calculated the peak amplitude, integrated Ca2+ signal (area under the curve), and the post-stimulation resting value for each of the two groups. As per these analyses, we did not see any significant difference in either the peak amplitude or integrated Ca2+ signal between the control and Pfn1 knockdown groups, further underscoring the fact that Pfn1 loss does not necessarily confer cells an increased ability to respond to agonists (i.e. LPA-induced GPCR activation in this specific case). However, we noted that the post-stimulation resting Ca2+ signal was elevated in Pfn1-deficient cells relative to control cells, possibly reflecting increased PIP2 hydrolysis and/or reduced re-uptake of cytosolic Ca2+ by endoplasmic reticulum and/or reduced efficiency of Ca2+ export. These analyses are now included in Figs 6G-I in the revision.
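
      The three readouts named above can be illustrated with a short sketch. It assumes one common set of definitions (F/F0 normalization to the mean pre-stimulation value, peak amplitude above baseline, trapezoidal area under the curve from the first post-stimulation sample, and the mean of the final samples as the resting value); the response does not give the exact definitions used, and the function names and the trace below are hypothetical:

```python
def normalize_to_baseline(trace, n_baseline):
    """Divide every sample by the mean of the first n_baseline samples (F/F0)."""
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [f / f0 for f in trace]


def summarize_trace(trace, dt, n_baseline, n_tail):
    """Return peak amplitude, area under the curve and resting value for one cell.

    All three are computed on the baseline-normalized trace; peak and AUC are
    measured above the normalized baseline of 1.0.
    """
    norm = normalize_to_baseline(trace, n_baseline)
    post = [v - 1.0 for v in norm[n_baseline:]]  # post-stimulation, above baseline
    peak = max(post)
    # Trapezoidal rule over consecutive post-stimulation samples.
    auc = sum((a + b) / 2.0 * dt for a, b in zip(post, post[1:]))
    resting = sum(norm[-n_tail:]) / n_tail  # mean of the last n_tail samples
    return {"peak": peak, "auc": auc, "resting": resting}


# Hypothetical single-cell GCaMP trace: 4 baseline samples, then stimulation.
trace = [100, 100, 100, 100, 300, 250, 200, 150, 120, 120, 120]
summary = summarize_trace(trace, dt=1.0, n_baseline=4, n_tail=3)
print({name: round(value, 2) for name, value in summary.items()})
# {'peak': 2.0, 'auc': 4.5, 'resting': 1.2}
```

      Because each cell is divided by its own pre-stimulation mean, differences in absolute baseline Ca2+ between conditions are removed by construction, which is the concern the reviewer raised; the post-stimulation resting value remains comparable because it is expressed relative to that baseline.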

      "Please increase the font size in Figure 6C, this is barely readable".

      Response: We have now replaced that panel with one that uses a larger font.


      "Do the authors think that most PIP2 is actually in lipid-protein complexes and actin depolymerization with the corresponding removal of PIP-binding adaptor proteins exposes previously shielded PIP2 molecules to enzymatic hydrolysis?"

      Response: Yes, we certainly think that is the most likely scenario. Please see our detailed response to Reviewer 2's comment #1. We have now clearly included this in the discussion and added a schematic mechanistic model to better illustrate our thinking (Figure 7).

      "The lipidomic changes are extremely interesting. This could indicate a change in overall cellular architecture which goes beyond PIPs. SM/Chol/PC all go down - I'd interpret that this as a relatively lower content of Plasma membrane and ER. It would be interesting to see if the surface to volume ratio of the cell changes - a comparison with total Cardiolipin as a proxy for mitochondrial membrane size could also be informative. It may very well be that the Pfn1 KO effects on structural membrane lipids are the more important finding - but elucidating that mechanism is beyond the scope of the current manuscript. I look forward to learning about it in the next story".

      Response: We thank the reviewer for this insightful comment. However, we consider this to be within the scope of future studies.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Orenberg et al. is a well done, well-written paper that provides an in-depth look at the effects of Pfn-1 depletion on PIP2 levels, actin polymerisation and the broader lipidome. I enjoyed reading it, the main conclusions are sound and well-taken & the finding that PIP2 levels correlate with actin polymerization is intriguing as well as the fact that the global lipid to protein ratio changes. This is indicative of the identification of a major player in lipid flux pathways. I have just a few suggestions for control experiments, formulations & figure layout changes that I think will make the paper even better:

      • General: Scale bar labels are too small, please also provide time-stamps for time course measurements.
      • As with every antibody stain, there is a remaining risk that a change in the cellular context affects an off-target of the antibody (e.g., a protein phosphorylation site). I think that this is not particularly likely, but I'd control for it, which can be done in a straightforward manner: The authors could do a strong-detergent treatment to rule out a potential off-target effect of the antibody (e.g., 0.1% Triton X-100, 1 h). This should remove all (non-amino-) lipids from the sample, including the phosphoinositides. Overall, binding of the antibody should be strongly reduced, fluorescence images should be much dimmer & the effect of the Pfn1 KO should mostly disappear.
      • Figure 1: Please show images in a larger zoom, cell details are barely visible (same for Figure 3). I also would not use "PM PIP2 levels" in the legend, as nuclei appear visibly lighter, indicating that some PIP2 is likely present in other membranes. The type of PIP2 staining should be specified in either the Figure itself or in the legend.
      • Figure 3: Same comment as for Figure 1, zoomed images would really help, especially for the PM/Cytosol distribution of the PIP2 biosensor.
      • The lag time in the dissociation of the PIP2 sensor is interesting, as is the fact that the kinetic of PIP2 biosensor release is (visually) slower. I recommend to do a couple of simple fits to quantify these effects. If my impression holds, this would be a strong support of the author's interpretation that actin depolymerization actually leads to a loss of PM PIP2 - a simple binding/unbinding kinetic would be much closer to the actin depolymerization kinetic.
      • Figure 4: Same comment as for Figures 1 and 3, zoomed images would be most helpful
      • Figure 5G: It looks like the two conditions were internally normalized. Given that we're looking at differential levels of PIP2/IP3/DAG, I think it is very possible that baseline Ca levels are also different. I'd either report in au or do a global normalization which would also capture any difference between the two conditions. This should also clarify whether there are differences in post-stimulus steady state Ca levels, as it currently looks like.
      • Please increase the font size in Figure 6C, this is barely readable

      For the discussion:

      • Do the authors think that most PIP2 is actually in lipid-protein complexes and actin depolymerization with the corresponding removal of PIP-binding adaptor proteins exposes previously shielded PIP2 molecules to enzymatic hydrolysis?
      • The lipidomic changes are extremely interesting. This could indicate a change in overall cellular architecture which goes beyond PIPs. SM/Chol/PC all go down - I'd interpret that this as a relatively lower content of Plasma membrane and ER. It would be interesting to see if the surface to volume ratio of the cell changes - a comparison with total Cardiolipin as a proxy for mitochondrial membrane size could also be informative. It may very well be that the Pfn1 KO effects on structural membrane lipids are the more important finding - but elucidating that mechanism is beyond the scope of the current manuscript. I look forward to learning about it in the next story.

      André Nadler

      Significance

      The manuscript by Orenberg et al. is a well done, well-written paper that provides an in-depth look at the effects of Pfn-1 depletion on PIP2 levels, actin polymerisation and the broader lipidome. I enjoyed reading it, the main conclusions are sound and well-taken & the finding that PIP2 levels correlate with actin polymerization is intriguing as well as the fact that the global lipid to protein ratio changes. This is indicative of the identification of a major player in lipid flux pathways.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      1. The data show that actin binding-deficient mutants of Pfn1 do not rescue the knockdown. In these experiments, it is critical to quantitate the relative expression levels of the mutants. The model that Pfn1 regulation of PIP2 requires interactions with actin is not really clear - is it due to Pfn1 targeting by actin binding, or Pfn1 regulation of actin itself? Either possibility seems possible, and the experiments do not distinguish them.
      2. The knockdown data on PLCbeta is convincing with regard to its role in PIP2 reductions, but the paper does not explain how actin-Pfn1 interactions regulate PLCbeta.
      3. The transcriptome data must be provided along with the data in Figure 5 - otherwise it is impossible for the reader to evaluate. The fact that the data is being used in another paper is not an adequate reason for its omission.
      4. The PKC substrate data is not convincing. The blots are messy, and there is no quantitation.
      5. The calcium data should include statistical analysis of the differences.
      6. The discussion of DAG and PA levels is problematic. As the authors are aware, whole cell lipidomics can easily miss small changes in specific compartments. If the authors think that lipid sensor analysis of PM DAG and PA would strengthen the analysis, then this should be included. The large change in PC levels does seem to suggest an alternative source of PA. While the authors present arguments against a role for PLD, this could be directly tested. In any case, the finding of a nearly 100-fold greater change in PC than in PA raises questions about what the whole cell PA measurement is really detecting.

      Significance

      The manuscript by Orenberg et al. is an extension of previous work showing a link between Pfn1 and PM PIP2. While the new data expand the observations, and the PIP2 biosensor data are clean, the proposed model is not really convincing or fully defined - a number of elements are suggestive but not definitive. Several of the data could have multiple explanations (some of which are acknowledged in the discussion). The overriding hypothesis is that Pfn1-actin coupling regulates PLCbeta, but it is not clear how this would happen. Finally, several of the data are not convincing (PKC substrates) or lack statistical analysis (calcium imaging).

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Earlier studies have shown that the actin-binding protein profilin can inhibit the PLC-dependent hydrolysis of PIP2 in vitro and provided evidence that acute profilin-1 (Pfn1) knockdown results in diminished PIP2 levels at the plasma membrane. However, the underlying mechanism by which profilin regulates PIP2 levels in cells has remained elusive. Here, Orenberg et al. show that Pfn1-dependent changes in the plasma membrane PIP2 levels are not transient. Interestingly, they also provide evidence that Pfn1 controls plasma membrane PIP2 levels through its actin-regulating activity and not through directly interacting with PIP2. Finally, they show that loss of Pfn1 also affects the levels of many other lipids in cells.

      The majority of the data presented in the manuscript appear to be of good technical quality, but I have some suggestions to strengthen the manuscript.

      1. The quantifications of the PIP2 levels were apparently done simply by measuring the fluorescence intensities of wild-type and knockout cells stained with monoclonal anti-PIP2 antibody. However, the knockout cells appear more spread compared to the wild-type cells (Fig. 1B), and this can possibly affect the quantifications (e.g. there may be more plasma membrane ruffles/folds in the wild-type cells). Thus, I recommend that in all critical quantifications the authors also use a general plasma membrane marker to confirm that the PIP2 density (and not just the morphology of the plasma membrane) is indeed affected by Pfn1 depletion.
      2. To get a better idea about which cellular actin filament structures are important for regulating the PIP2-levels at the plasma membrane, one could also use a larger repertoire of actin/myosin inhibitors (CK666, cytochalasin-B, blebbistatin). By using these compounds, one may e.g. uncover if the Arp2/3-nucleated branched actin networks and/or contractile actomyosin structures would specifically contribute to regulation of the plasma membrane PIP2 levels.
      3. The effects of PLCb3 silencing on Pfn1-dependent changes in the PIP2 levels are interesting. To gain better insight into the underlying mechanism, one could also check if the levels of active (phosphorylated) PLCb3 are affected upon Pfn1-depletion.
      4. In the 'Discussion', the authors speculate that Pfn1 H119E mutant may have more frequent interactions with PIP2 as compared to wild-type Pfn1. This does not make much sense, because Pfn1 binding to PIP2 is very weak (e.g. ref. 28), and it is unlikely that introducing a negatively-charged glutamate would increase its affinity to negatively-charged headgroup of PIP2. Thus, it seems unlikely that Pfn1 would affect the PIP2 content of plasma membrane through direct interactions with PIP2.
      5. The cell images in Fig. 2A are a bit difficult to follow due to the large number of cells in the images. One could perhaps show higher resolution images with a few knockout and rescue cells in the same field of view and indicate the rescued cells in these images, e.g. with arrows.
      6. Please clearly describe in each figure legend what the error bars represent.

      Significance

      Although this study does not determine the actual mechanism/pathway by which Pfn1 controls plasma membrane PIP2 levels, it nevertheless provides evidence that perturbation of the actin cytoskeleton by loss of actin-binding profilin or with the actin inhibitor latrunculin-B results in a decrease in the plasma membrane PIP2, and that PLC activity is critical for regulation of PIP2 levels downstream of Pfn1 in cells. Therefore, this study presents a valuable contribution to a specific field, and will be interesting to those studying the actin cytoskeleton - plasma membrane interplay.

      My expertise: Cytoskeleton research.

  4. Apr 2026
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Overall, this study adds a large amount of data for the scyphozoan Aurelia coerulea by producing several single-cell RNA sequencing libraries that cover the transition from polyp to medusa. The study provides a modern view of cell type diversity and cell-specific transcriptome changes during this period of extreme morphological change in this particular cnidarian lineage, which is understudied. Certain unique cell subtypes, including neural cell subtypes and muscle cell subtypes which are specific to different life stages are discussed in detail providing some new insights.

      My overall assessment is that the manuscript has good potential to be impactful, but in its current form it is somewhat clunky and overly complex to read, the figures were too crowded and difficult to comprehend, and the authors did not provide enough context regarding the current state of knowledge and what this study adds to it. In particular, Figure 1 and the section about striated and smooth muscles sharing partial transcriptomic profiles need the most work. The results were presented in the context of the anthozoan Nematostella but this should be broadened further to include other cnidarian single-cell studies, such as those from Hydra and Clytia which are both medusozoans like Aurelia. The writing throughout could be streamlined and simplified to better highlight the major findings as described in the abstract of the paper. Several figures were not well presented or clear and could be improved or decluttered to better communicate and support important results. In addition, some methods were totally missing, and I was unable to access the github repository associated with the paper which should detail all analyses described in the paper. In its current form, reproducibility of analyses would be quite limited. I did greatly appreciate the inclusion of the data on the UCSC Cell Browser, which allows anyone to access the single cell data matrix for visual exploration.

      Answer: We thank the reviewer for the overall positive assessment and have tried to address all of the comments that follow.

      Major comments: The Introduction section was very short - only three paragraphs. I feel that this section could be expanded to give more context about Aurelia as a research organism, and the current resources available. This includes genomic and transcriptomic resources, particularly those focused on the transition between life cycle stages (polyp to medusa). Any other relevant background on cell type diversity, or anything known about the molecular profile of specific cell types found in different life stages, should also be included here. Do marker genes already exist for some of the important cell types discussed in the manuscript? It would be better to present the current state of knowledge, and context for why this study was done, how it builds upon current knowledge, and what it adds to our current understanding so that the study is properly framed from the beginning.

      Answer: The Introduction was expanded and now also explains to what extent medusa-specific cell types have been investigated so far. This additional information is highlighted in blue typeface in the manuscript.

      In the Results section, I find the sentence on p. 4, "Further, ~70% of these gene models do not have readily identifiable orthologs and thus represent putative orphan genes" to be rather confusing. What analysis was performed to determine this percentage, and which set of organisms were compared? Doesn't this percentage seem rather high for a cnidarian? Or is this referring to orthologs outside of cnidaria? Please comment further on how this percentage was determined and possible explanations for it being this high. Right now, it just feels tacked on to this paragraph with no context or further explanation which leads to the confusion.

      Answer: This statement originally referred to the lack of both a best BLAST hit and any protein domain annotation for the sequence. This number has dropped to only 47% with the most recent mapping tool, which is a value also fairly commonly found in other animal genomes. Nonetheless, this statement has been removed from the manuscript.

      Figure 1. There are many issues with this figure that encompass how I felt generally about the figures of the paper. The figure should ideally take up the entire width of the page rather than squishing some text next to the figure.

      Answer: The figures are intended to occupy a full page; they are also embedded in the text to facilitate review of the manuscript, and the full-resolution figures are included for proper review. In the revised version we have kept this comment in mind to ensure the figures are legible.

      Figure 1A: The colors of the different developmental stages from which tissue was sampled (e.g. polyp1, polyp2, polyp.clover) do not seem to match between legend and figure. For example, the "polyp.clover" stage is circled in blue in the schematic, but given a green dot in the legend. The "medusa.manubrium" is circled in orange in the schematic, but given a purple dot in the legend. Suggest making the colors match between legend and schematics.

      Answer: The colors correspond to the grouped stages and colour palette used for the life cycle stage divisions. This has been considered in the revised figure.

      Figure 1E: In Panel E, the labels showing that the top graph is "polyp" and the bottom graph is "medusa" are much too small. Increase the font size of the labels. The font size for the GO terms themselves are also too small.

      Answer: This figure has been removed in the revision; attention has been paid to font sizes in the revised figures.

      Figure 1F: The bulk of this study centers around the single-cell RNA sequencing data and resulting analyses from these data. As such, I would expect the cellular atlas resulting from these data to be similarly highlighted. In Figure 1F, the annotated cell atlas as presented is much too small, making it impossible to even add the labels for the different clusters directly on the UMAP. Suggest increasing the size substantially to at least half of the page width, so that it is possible to do so.

      Answer: This has been removed in the revision; the full distribution of the identified clusters is now Figure 2. We do not include all of the population sub-types on the UMAP in this figure, as this is simply a visualization tool and the distribution of the sub-types on that map is not necessarily informative. Rather, we include the relative proportions of the sub-types/states in the bar plot, and the relationships between these clusters in the tree.

      -There should also be a complementary figure in the supplement that shows all of the individual clusters, each in different colors and clearly annotated with labels, rather than just showing multiple clusters that were combined into the major cell types. There is an example of this in the Clytia single cell paper (see Chari et al. 2021 Figure 2A vs Fig S9).

      Answer: A fully coloured UMAP with all cell states is available in the supplement (Figure S3).

      -The graph on the right of this panel showing the "Distribution of cell types in time and space" is overly complicated with all of the colors, and the meaning is lost because it is difficult to interpret at this very small size. Suggest removing it and possibly showing it as a supplemental figure so that its meaning is easier to assess.

      Answer: This barplot is now larger and includes both the partitions (major cell populations, as seen in the UMAP) and the proportion of individual cell clusters. We feel this is an intuitive way to illustrate the relative distributions of all cell type states across the dataset as a whole, and so we keep this in the main figures of the manuscript.

      -In addition, striated muscles are marked on the overall UMAP; however, it is not noted until later that the smooth muscles are part of the "outer epidermis" cluster. Suggest altering the legend or the text of the figure itself to show where the smooth muscles are thought to be in the overall UMAP, especially since they are specifically discussed in depth later in the manuscript. Exactly which "part" of the outer epidermis cluster includes the smooth muscle cells?

      Answer: We have added the smooth muscle cluster to the UMAP in the main figure.

      Figure 1G: Panel G, for example, is not useful in conveying its point as the text labels are too tiny and the figure is overly complex to be squished into a panel of this figure. Suggest removing and making 1G a supplemental figure by itself or perhaps together with 1C (as they are linked) where it is more legible. The figure legend text for Fig 1G is also confusing as it refers to "scyphozoa" in (C) but there is no "scyphozoa" in 1C, only "medusa".

      Answer: This is now Figure 1D and E and is given increased space in the figure. We feel the message that the medusa-specific gene set is not restricted to medusa-specific cell types is an important one, and so we have kept this in the main figure. We provide a table with all gene annotations in the supplement so that it is accessible to anyone with further interest (DS1.1a and DS1.1b).

      Text, p. 6: The explanation for how the clusters were annotated in Fig 1 and Fig 2 is much too vague. The text states, "We identified 9 broadly defined cell populations, for which we assign identities by assessing up-regulated gene lists (Data S1.3)." What does this mean? How exactly were the up-regulated gene lists assessed? This needs to be clarified further. What genes were used to label these clusters or groups as particular cell types? How does the annotation relate to Supplemental Tables S1.3 and S1.3b? Does the previous literature need to be cited to support these annotations based on specific genes? Suggest doing a better job overall and providing more detail and context explaining how the single cell clusters were annotated.

      Answer: We have expanded our description of how we assigned identities to the nine principal cell type families as follows:

      (pg. 8) The inner epithelia, or gastrodermis, expresses several collagens that are a characteristic of the inner cell layer of anthozoans (39); the outer cell layer houses the ring musculature and is rich in contractile proteins. The striated muscle cluster is also rich in contractile protein and is the only principal cell population absent from the polyp-derived samples (Fig. 2C). The mucin gland expresses mucin-like-proteins, whereas the digestive gland expresses other digestive enzymes, and the neural cluster expresses synapsin and other conserved known neural regulators such as ashA. The cnidocytes express mini-collagens and are enriched in pathways targeting the endoplasmic reticulum (40).

      Text, starting on p14: "Striated and smooth muscles share partial transcriptomic profiles." This section is highly confusing and could do with some simplification in both text and figures. - The genes for which expression is shown in Fig. 5, 6 and 7 are not properly introduced or given nearly enough context in the text. For example, the text states, "To investigate the dynamics of muscle formation, we further compared phalloidin staining of muscle fields with in situ hybridization detection of specific cluster marker expression in polyps (Fig. 5), strobila (Fig. 6), and ephyra (Fig.7)." However, it is not until the legend of Figure 7 and also much later in the text (in the Discussion, p23) that it is noted what types of muscles each of the genes used in ISH actually mark ("While a small set of genes are shared across the two muscle phenotypes (e.g. stmyhc1 and mrlc2), others are more specific to either phenotype (eg. stmyhc5 in striated muscle; myophilin-like-2 in smooth muscle) (Fig.8A), which were verified by in situ hybridization (Figs.5,6,7)". This needs to be rewritten and improved for flow and clarity purposes.

      Answer: Figures 5, 6 and 7 were reassembled in a different structure according to the reviewer's suggestion. Specifically, we now present the muscle anatomy together first, followed by molecular validations from the atlas data. Marker genes used for in situ hybridization (ish) are introduced as suggested. The text was rewritten to match the changes in the figures. In general, figures and text were simplified to improve the clarity of the muscle chapter.

      • Suggest that the authors show an overall UMAP of smooth and striated muscle (perhaps the smooth muscle subtypes are part of the large 'outer epidermis' cluster; see the comment for Figure 5B above), and then include featureplots that show the expression of each of the genes used in ISH in these clusters. This might make it clearer as to what type of muscle the genes should be highlighting within each developmental stage. It might look something similar to what is shown in Figure 7P (although it is unclear how the featureplots shown in this figure relate to the UMAP shown in Figure 5B). In addition, the featureplots in Figure 7P only show 3 out of the 4 genes used in ISH which is not helpful. Featureplots should be clearly shown for all genes discussed. This is essential to linking the pattern in the single-cell data to the expression data and is the minimum required to provide clear understanding.

      Answer: We took this suggestion under consideration when re-compiling the figures. The feature plots and the in situ hybridizations are now found in the same figure (Figure 6).

      • The text reads, "To investigate the dynamics of muscle formation, we further compared phalloidin staining of muscle fields with in situ hybridization detection of specific cluster marker expression in polyps (Fig. 5), strobila (Fig. 6), and ephyra (Fig.7)." However, Figure 6 also contains images of ephyra (Fig6. P-S). Suggest that those panels could be included in Figure 7.

      Answer: This text no longer appears in the manuscript. The relevant section now reads as follows (p15:17):

      “We assessed the anatomic location of the muscle fields by phalloidin staining in Aurelia polyps, strobilae and ephyrae (Fig.5). Polyps have three distinct smooth muscle fields (Fig. 5A,B-G): the radial muscles of the oral disc (Fig. 5D), the longitudinal tentacle muscles (Fig. 5E), and the longitudinal retractor muscles that run along the body column (Fig. 5F,G (35)). During strobilation, fragments of the polyp retractor muscles are retained in the early ephyra (Fig. 5J (35)). Striated muscles appear coronally around the oral disc, oriented radially along the lappets of early detached ephyra (Fig. 5L-N). At the tips of the lappets, the border of the coronal muscle, and at the base of the manubrium, fibres show a mixed organization of smooth and striated myofibrils (Fig. 5O,P). These findings corroborate previous studies that used light- (26) or electron microscopy (24,25).

      We next compared expression patterns expected from our single cell data with the phalloidin-based anatomy of smooth and striated muscles. As expected, several genes were shared between the smooth and striated muscle cluster (Fig.6E), while others were highly specific to either smooth (Fig.6C,D) or striated muscle cluster (Fig.6P; Data S1.11). Different calponin paralogs show distinct expression in the different muscle types (Fig. 7A). For example, calponin1 is specific to the smooth retractor muscle of the polyp and no other subpopulation of the smooth muscle type (Fig. 6A-C). At the strobila stage, expression of calponin1 is still visible in fragmented retractor muscles, consistent with the single cell expression profile (Fig. 6F). By comparison, mrlc2 expression marks the locations of all smooth muscle populations in polyps including tentacle muscles, radial muscles of oral disc and retractor muscles of the body column (Fig. 6D,E).”

      • There are parts of this section text where reference to the Figures is complicated and not easy for the reader to follow. I got particularly confused in trying to follow this part of the manuscript. For example, a sentence on p15 reads, "mrlc2 and stmyhc1 reads are detected in both muscle types (Fig. 7pFig. 5M, Fig 6C,E,G-P, Fig. 7J-L,N-P), and ISH indicates that the expression is localised to the fields of striated muscles in ephyrae (Fig.7J,K,N), as well as the smooth muscle populations in polyps including longitudinal tentacle muscles, radial muscles of oral disc and retractor muscles of the body column (Fig. 5M, Fig.6H,I,L,M), and the muscles of the manubrium in the meta-ephyra (Fig. 7L,O)." It is quite difficult to keep jumping between Figures and panels to look at this. A better organization of the Figures and much clearer text that doesn't jump around could go a long way to making it easier to follow.

      __Answer:__ We thank reviewer 1 for the suggested changes. We feel that recombining the results from previous versions of the figures helped to improve the clarity in this section. Single cell data was updated to include a UMAP of the muscle subset and gene expression plots highlighting the differential expression in smooth, striated, or both muscle types, corresponding to the in situ hybridization (ish) gene expression profile. The figure (Fig. 6) is now arranged in a way that allows the reader to easily follow the results for the spatial validation of both muscle types, since ish for all life stages is shown in one panel together with the muscle subset UMAP and gene expression plots. Additionally, the two muscle clusters are now also labelled in Fig. 2A to give the reader a better understanding of where the muscle clusters are located in the UMAP of the full object.

      The text reads now: (Fig. 6, figure caption): (Q) feature plots of all marker genes on the muscle specific subset (R) reference UMAP of whole dataset (left) subset (right) (S) Distribution plot of muscle types across the different Aurelia life stages (left) and medusa tissues (right).

      Discussion -The authors do try to put their results into context with the two Aurelia genome papers (Gold et al. 2018, and Khalturin et al. 2019) and two additional bulk transcriptome studies (Fuchs et al. 2014, Brekhman et al. 2015), but not until the first part of the Discussion. In principle, this would be fine. However, in practice, their discussion of these studies is somewhat vague and generalized and did not really provide a clear review or analysis of how adding in cell-type specific data is helping our understanding. The argument about how their results fit with previous findings was confusing and unclear. They start by discussing "genome usage" but then switch to talking about cell type diversity across life stages. The connections between "genome usage", "gene representation", and cell types was not easy to follow. Suggest rewriting this section to clearly discuss the findings in this manuscript in the context of previous studies with straightforward and precise language.

      -In the discussion about the neural subtypes, comparisons are only made to Nematostella where there are also two major neural classes. It would be even better to include discussion of single-cell data related to neurons in other cnidarians, such as Hydra, where there is detailed discussion of neuron subtypes in both a published manuscript (Siebert et al. 2019, Science) and a preprint (Primack et al. 2023, biorxiv) and Clytia (Chari et al. 2021, Science Advances). I do see that Clytia and Podocoryna are mentioned in the next section of the Discussion, specifically related to the Otx gene.

      Answer: We thank the reviewer for pointing out this oversight. We have incorporated comparative observations from the published Hydra dataset in this regard.

      Pg 21 “This contrasts with the distribution of n1 and n2 class neurons in the freshwater hydrozoan polyp Hydra vulgaris, of which only three of the fifteen sub-types are of the ins-positive n1 type (“ec2”, “en2”, and “en3”: Fig. S8D; (58)). Similarly in the Clytia medusa only one of the three neuron groups (neuron cells “A” (16)) have INSM reads and thus could be considered type 1 neurons as defined here.”

      -The section about muscle subtypes in the Discussion would need to be rewritten in accordance to changes suggested above for the Results for this section.

      Answer: The Discussion was rewritten in accordance with the changes made to the Results section, as suggested by Reviewer 1.

      Materials and Methods -In the section "Comparison with Nematostella" the authors discuss running OMA to generate the set of identified 1:1 orthologs but never go on to mention how many orthologs were identified. Please report this number so it is clear whether this is a small or large subset of the total analyzed. In a recent study of the Hydra AEP strain (Cazet et al. 2023 Genome Research), a similar analysis was done between Hydra and Clytia and they found 5979 genes with 1:1 orthologs between the two species. There should also be a supplemental datasheet that provides a list of these orthologs (See Supplemental Data S17 provided in Cazet et al. 2023 as an example). I am curious to know how many 1:1 orthologs were found between Aurelia and Nematostella. I would expect there to be a smaller overall number than between Hydra and Clytia due to the larger phylogenetic distance between these two taxa. I also strongly suggest that the Cazet et al. 2023 paper should be referenced, as it was the first time an attempt to compare single-cell datasets between two cnidarian species was done. The current manuscript took an alternative approach to comparing Aurelia to Nematostella, so it would be good to acknowledge this and justify the methods used in this manuscript compared to those used in Cazet et al. 2023.

      Answer: We recognize our oversight in not properly referencing the previous study comparing two cnidarian species and have now integrated this reference. We also include the requested information regarding our OMA analysis as follows:

      In total 4311 1:1 gene orthologs between the two species were identified (Data S2.). A similar comparison using OrthoFinder (90) between Hydra and Clytia, both members of the Hydrozoa clade, found 5979 1:1 orthologs (66). OMA was preferred in this study over other available orthology databases because it outputs a high-confidence predicted 1:1 gene orthology list that can be used directly to combine multi-species data.
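
      To illustrate what "used directly to combine multi-species data" can mean in practice, the following is a minimal sketch and not the authors' pipeline. It assumes the orthology result is available as a plain list of (species A gene, species B gene) pairs; the gene identifiers, function names, and count tables are hypothetical, and no OMA call is made. It keeps only strict 1:1 pairs and renames one species' genes into the other's identifier space so the two tables share row names.

```python
from collections import Counter

def one_to_one_orthologs(pairs):
    """Return {gene_a: gene_b} for pairs in which each gene occurs exactly once."""
    pairs = set(pairs)  # drop exact duplicate rows before counting
    seen_a = Counter(a for a, _ in pairs)
    seen_b = Counter(b for _, b in pairs)
    return {a: b for a, b in pairs if seen_a[a] == 1 and seen_b[b] == 1}

def to_shared_ids(counts, mapping):
    """Rename genes in a {gene: counts} table; genes without a 1:1 ortholog are dropped."""
    return {mapping[gene]: values for gene, values in counts.items() if gene in mapping}

# Hypothetical orthology pairs: aur2 is one-to-many, nv5 is many-to-one.
pairs = [
    ("aur1", "nv1"),
    ("aur2", "nv2"),
    ("aur2", "nv3"),
    ("aur4", "nv5"),
    ("aur6", "nv5"),
]
orthologs = one_to_one_orthologs(pairs)

# Hypothetical per-gene counts for one species, keyed by its own gene IDs.
aurelia_counts = {"aur1": [3, 0, 1], "aur2": [5, 5, 5], "aur9": [1, 1, 1]}
shared = to_shared_ids(aurelia_counts, orthologs)
```

      Restricting to strict 1:1 pairs discards genes with lineage-specific duplications, which is the trade-off of this kind of direct merge.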

      -There are missing descriptions of methods throughout the paper. One example is in the section about Transcription Factor families that are over or underrepresented amongst upregulated genes compared to their distribution in the genome - I could not find any description of the methods used to identify these Transcription Factor families in the dataset of Aurelia upregulated genes. How were these families chosen? How were they identified in this dataset?

      Answer: Transcription factors were identified and classified using the Animal Transcription Factor Database version 4 (https://guolab.wchscu.cn/AnimalTFDB4/#/). This information has been added to the manuscript methods.

      -I noticed in the Data and materials availability statement and a few other places in the manuscript, a github repository was mentioned: https://github.com/technau/AureliaAtlas. I tried to access this repository to review what was included, but unfortunately it is not accessible. I found seven repositories within github.com/technau but the AureliaAtlas was not one of them. This repository should include all scripts to generate all figures and other analyses in the paper and should be made available to reviewers to better understand exactly how all analyses were completed. A good example of how this could be done is found in the repository related to Cazet et al. 2023 (https://github.com/cejuliano/brown_hydra_genomes), which is very comprehensive and easy to follow. -When I looked through a similar repository https://github.com/technau/CellReports2022/ from the Steger et al. 2022 Cell Reports Nematostella single-cell paper from this same group, I find it to be rather disappointing. They apparently included all code to generate all figures in a single R file that is not easy to follow and not well commented. If this is the same strategy used for this manuscript, I feel that a much stronger effort could be made to make the analyses of this Aurelia manuscript transparent by producing a github that is more like that of https://github.com/cejuliano/brown_hydra_genomes from the Cazet et al. 2023 paper which organizes each type of analysis in a different github subfolder and within each subfolder they include very detailed information and comments explaining each step of each analysis. Doing this would go a long way to making the analyses in this manuscript more transparent and easier to follow and would certainly put some of my concerns to rest.

      __Answer:__ We thank the reviewer for pointing this out. We have ensured that the github page is publicly accessible. We have provided all of the necessary R scripts to generate the analysis and figures. The structure is improved over the Steger paper; separate scripts are provided for each step, including importing and processing the raw data for the Seurat workflow, data processing to assess the life cycle and first clustering, analyses of each subset, and finally calling results from the previous scripts to generate all figures contained in the manuscript.

      Minor comments:

      Figures: Figure 2A: In the legend it says "Colour code as in (B) and (C)" but it's really referencing the colors in Figure 1A, correct? It is confusing to have to look back to Figure 1A to understand the colors here.

      __Answer:__ The original figures 1 and 2 have been modified and combined into a single figure in this version.

      Figure 2D: Typo in the word "proteins" in the title of this panel.

      __Answer:__ This word no longer appears in the revised figures.

      Figure 3F: The placement of the tree and the two featureplots for myc3 in Nematostella and Aurelia is confusing. Suggest moving the featureplot for Aurelia myc3 so that it is beside Nematostella (to the right of the tree) or move the featureplot for Nematostella myc3 so that it is beside the Aurelia featureplot (to the left of the tree).

      __Answer:__ We thank the reviewer for this suggestion and have edited this figure accordingly by moving the myc3 expression plots alongside all of the others.

      Figure 4B: The description of this panel reads, "Distribution-histogram across all samples, medusa-specific cell clusters are highlighted with black outline.", however as a reader, the black outline is not very clear. Suggest making it bolder. In addition, this black outline is a little confusing - it should mark the medusa-specific cell clusters; however, the black outline appears in cell clusters in strobila and ephyra?

      __Answer:__ The black outline is now increased in width for clarity. Medusa-specific cell types are defined by their absence from the polyp samples, because medusa-specific tissues are already being generated in the strobila stage and thus these transcriptomic profiles begin to appear. We added a clause in the figure legend to clarify this, as well as within the main text when medusa-specific cell states are first defined.

      Pg.8: “ In total we find 12 cell type states that are not represented (<

      Figure 5B: It is unclear from where this reference UMAP was derived. Does it come from the overall UMAP, showing the 'outer epidermis' cluster only, with the putative smooth muscle cells in red? Or is it the 'outer epidermis' cluster plus the striated muscle cluster? Suggest making this clearer (see below for larger edits to this section of the manuscript).

      Answer: This has been addressed. Figure 6R now includes both the full dataset inset, as well as the muscle-only subset and is consistent with the rest of the manuscript in this regard.

      Figure 5K/L/M: It is unclear which parts of the polyp in K is used for the images shown in L or M. Both come from the large red box, but it is unclear from which part L and M were made. In addition, the subtraction of the background from the image (to make it look white) is distracting and makes the image itself look artificial.

      Answer: New brightfield images were included to give a better understanding of the region of interest. The images in which the background was subtracted were replaced with the original pictures and contrast was enhanced to brighten the background.

      Figure 6C, G-S: - Not sure what the blue boxes around these panels are meant to highlight? - Also not sure what the image in the left of panel C is. Perhaps an oral view of the strobila? The legend or panel itself should mention this. - Again, subtraction of the background from the image (to make it look white) in panels C, D and E is distracting and makes the image itself look artificial.

      Answer: The figure was redone and the boxes are not present anymore.

      Figure 6J, M, N, O: - For someone not accustomed to looking at images of strobilating polyps, it is unclear what part and what orientation these images are taken of. Suggest including some of these details in the figure legend at least. Fig 6O actually looks like an ephyra, but is annotated as an "advanced strobila"?

      Answer: The figure was redone (Fig. 6) with appropriate schematics next to the images.

      Figure 7H: - Not sure what the white lines in this panel are meant to indicate?

      __Answer:__ The white lines were removed.

      Results: p5 - In this sentence, "Because these four pouches look like a cloverleaf from above, we call this stage the "clover-polyp", suggest changing "clover-polyp" to match the Figure 1A (where it is written as polyp.clover), or change the text in the Figure to match the text in the manuscript.

      __Answer:__ We made sure to match this in the revised figure.

      p8 - In this sentence, "the bZIP protein family are over-represented as terminal cell type markers, while the number of zinc-finger proteins of the N2C2 class are under-represented", the "N2C2" class the authors refer to is not clear. Is there a typo here? In the figure to which this sentence refers (Figure 2D), the proteins referenced are "zf-H2C2" or "zf-C2H2".

      __Answer:__ This no longer appears in the current manuscript.

      p9 - Typo - should be "medusozoans" rather than "medusazoans".

      __Answer:__ This has been corrected.

      p11+ - Section titled, "Aurelia neural complement reveals two neural classes with similarities to anthozoan neurons" - I found the classification of N1 and N2 to be confusing, since initially they are described as neural clusters, however N1 in particular is shown to consist of primarily secretory, non-neural cell types. For example, when looking at Figure 4A and B, it is evident that N1 contains only a relatively small number of neural cell-types (in shades of orange), while most of the cells are other secretory, but non-neural cell types (in shades of brown). Not sure if the authors should alter the title to reflect this? For example, instead of 'neural' classes, they could be called 'neuro-secretory' or 'mixed neural and secretory classes'?

      __Answer:__ We appreciate that this was confusing and have adjusted the heading accordingly. However, we choose to maintain the designation as N1 and N2 class to reflect the distinction between insulinoma-positive and pou4-positive major Cnidarian neuroglandular sub-types, as defined in our earlier Nematostella work (Steger et al., 2018). We also include a comment in the discussion regarding the support for this distinction in other published Cnidarian datasets, as follows:

      “This contrasts with the distribution of n1 and n2 class neurons in the freshwater hydrozoan polyp Hydra vulgaris, of which only three of the fifteen sub-types are of the ins-positive n1 type (“ec2”, “en2”, and “en3”: Fig. S8D; (58)).”

      p11 - Text reads, "Class 1 neurons in the medusa are also most prevalent within the gastrodermis and manubrium, and includes one subtype that first appears in the strobila and is found in all medusa tissue samples ("n1.3.medusa"; lower black box Fig. 4F).", however there is no "lower black box" in Figure 4F apparent.

      __Answer:__ Re-evaluation of the detectable cell states after updating the mapping tool, which addresses issues associated with an overabundance of isoforms, results in the dissolution of this putative medusa-specific cell state. This profile is also found within the polyp and so the second half of this sentence has been removed.

      p13 - The text reads, "We find that class 2 neurons all express elevated levels of specific alpha- and beta- tubulins (TBA1-like3 and TBB-like-1; Fig. 4D).". Make the capitalization of your gene names (TBA1-like3, etc) consistent between text and figure throughout (in Fig. 4D the gene names are lower case).

      __Answer:__ We have taken care to be consistent throughout the manuscript.

      p14 - In the first paragraph of this page, Fig. 4C is referenced twice, however both times the referencing sentence does not match this panel (most likely the authors meant to reference 4E, F or G).

      __Answer:__ This has been corrected.

      p14 - The final sentence of this upper paragraph, "Specific tubulin-paralog expression within the class n2 neurons suggest that this is the portion of the nervous system labelled by the β-Tubulin antibody." is confusing. Do you mean that the b-tubulin antibody is most likely labelling the product of the tbb-like-1 gene that is shown in the featureplot in Fig 4D? Suggest rewriting this sentence for clarity.

      __Answer:__ This sentence has been re-written as follows: “Specific tubulin-paralog expression within the class n2 neurons suggests that these two genes are translated into proteins recognised by this commercial β-Tubulin antibody. Furthermore, this antibody labelling suggests that the MNN is composed of N2 class neurons.” pg 14

      p14 - on this page and others in the manuscript, there are instances of the word "Aurelia" not being italicized.

      __Answer:__ This has been corrected.

      p14 - In this sentence, "In the sea anemone Nematostella, anemone-specific gene duplications of members of the PaTH (Paraxis, Twist Hand-related) bHLH family of protein coding genes was driving the diversification of muscle cell types (29)." the "was driving" part of the sentence is grammatically clunky. Suggest rewording slightly. (e.g. "...protein coding genes drive the diversification of muscle cell type").

      __Answer:__ We changed this to ‘drove’.

      -Myophilin-like2 in the text of the manuscript is written as myofilin-like2 in the figure panels (e.g. Fig 5L, Fig. 6D). Make consistent between text and figures.

      Answer: We changed all references to myophilin to calponin, which is the better known name of the vertebrate ortholog.

      p15 - on this page and several instances thereafter, "in situ" is not italicized as it should be.

      __Answer:__ This has been corrected.

      p19 - In the line, "Taken all together these data suggest that the contractile apparatus in the Scyphozoa, using here Aurelia as a proxy, is similar to the bilaterian smooth muscle contractile complex (Fig. 8C)." this should really reference Fig. 8 B-C

      __Answer:__ This has been corrected according to the newest figure.

      Reviewer #1 (Significance (Required)):

      General assessment:

      I believe this manuscript adds a significant amount of useful data and provides some novel insights into scyphozoan cell types across an important life history transition from polyp to medusa in Aurelia. Adding the dataset to the UCSC Cell Browser is a strength. I think there is the potential to make this an impactful paper but in its current form, it is pretty messy, and not clearly presented, and lacks some transparency. The greatest weaknesses lie in not framing the work adequately or putting it into enough context with previous work and also not relating it to other medusozoans; in the Figures which are overly crowded, and confusing rather than being clear and supporting the results; and in the lack of explanation for some methods like how cell clusters were annotated, how transcription factor families were determined; and the lack of access to the github data repository, which raises questions of reproducibility. It will take a good amount of restructuring figures and reframing to make the study clear and impactful and the methods and analyses reproducible.

      Advance: If the weaknesses are addressed adequately, this study does contribute new insights in the area of further understanding changes across an important scyphozoan life cycle transition in terms of diversity of cell types and their cell-type transcriptomes, opening up further questions which can now be addressed.

      Audience: The broader cnidarian community will be interested in this study. People studying cell type evolution and cell type novelty across the tree of life will also be interested. Anyone looking for examples of how to use modern approaches to understanding life cycle changes in animals will be interested.

      My expertise is in cnidarian cellular and molecular biology and evolution including working with model cnidarian research organisms and employing techniques and approaches similar to those used in this study.

      We thank this reviewer for their detailed comments and suggestions, and feel the manuscript is much improved in its current form. We hope that we have satisfied all concerns raised here.

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)):__

      This paper is well-written and serves as a valuable resource not only for the cnidarian community but also for researchers studying more broadly cell type identity and evolution. A key cell type enabling the transition from polyp to free-swimming medusa is the cnidarian striated muscle, which has only been morphologically identified in medusozoan jellyfish. While this study does not include functional analyses, it lays the foundation for the Aurelia research community to leverage single-cell atlas data for future investigations.

      Key experiments supporting the paper's main conclusions are missing:

      •At the beginning of the Results section, the authors mention identifying a previously undescribed developmental stage, which they name "clover-polyp" However, they do not later discuss whether this newly identified stage has a distinct gene expression signature. This point should be addressed in the paper or removed.

      __Answer:__ We do not find any transcriptomic signature specific to this stage. We keep this designation as a morphological indicator of a strobilation-competent polyp, but have re-worded our introduction of this term as follows:

      “The first external sign of strobilation is the expansion of the body column into four pouches that are filled with multiple folds of inner cell layer epithelia (Fig. 1A), and resembles a cloverleaf from above; we call this stage the “clover-polyp”.”

      •A key reference is missing in the following sentences :

      "The anthozoan Nematostella vectensis has two principal neural sub-families that have been described that correspond to those with insulinoma expression (n1) and those with pou4 expression (n2) (13,14)."

      "The class n1 family also includes putatively non-neural secretory cell types ("s"), which are enriched in genes associated with digestion and extracellular matrix production (Data S1.10). These data suggest a close relationship between neurons and gland cells, like what has been suggested in other cnidarians (13,27)."

      "Thus, similar to that described for the anthozoan Nematostella vectensis (13,14), Class 1 neurons and related secretory cells comprise the predominant type of neuroglandular cells in the polyp stage. Further, these are the primary neuroglandular cells within the gastrodermis of the medusa."

      The first functional analysis of NvInsm1+ expressing neurons and secretory cells in Nematostella vectensis was conducted in this study (Tournière, O. et al., 2022), making it essential to cite this work.

      __Answer:__ We thank the reviewer for drawing this oversight to our attention. This has been corrected in the revised manuscript.

      • To validate the neuronal component of this single-cell data, it is essential to confirm the N1 and N2 populations and demonstrate that they do not overlap. I recommend performing in situ hybridization or antibody staining for Insm1+ and Pou4+ cells (or any other suitable markers for these populations) to show that they are expressed in distinct cells/region in Aurelia.

      __Answer:__ We appreciate the reviewer's comment; however, there are unfortunately no specific antibodies available for Insm1 or Pou4, or any other n1/n2 specific neuronal marker protein. Moreover, we find in situ hybridization in this system to be very challenging except for highly expressed structural genes. Neurons are particularly difficult, because they are very small cells embedded between many other cell types. We attempted to validate the distribution of different neuron populations with colorimetric in situ hybridization, FISH, and HCR (hybridization chain reaction). However, we were not successful in labelling individual neuron bodies and visualising their cytoplasmic RNA content to distinguish individual cells and therefore individual neuron types. Regardless, to validate at least the neuronal cell types, we were able to correlate pan-neuronal tbb-like expression with β-Tubulin antibody staining, and RFamide antibody staining with specific neuronal subpopulations.

      •What is labelled in yellow in Figure 5C? The legend should be updated.

      Answer: Figure 5C does not exist in the current version of the manuscript.

      •Figure 5i, j, and k, are not clear, the paper would benefit with bright field pictures.

      __Answer:__ Images were replaced and some bright field photos are incorporated into both new figures.

      •Each figure should connect specific gene expression at a given stage with the corresponding single-cell expression data in a dot plot. For instance, in Figure 6, myofillin-like 2, mhc1, and mhc2 should be accompanied by their respective single-cell expression data at this stage in a dot plot.

      Answer: done!

      • The authors repeatedly refer to the polyp as asexual and the medusa as sexual; however, they do not mention any gonadal cluster nor discuss its absence from their single-cell data.

      __Answer:__ We have added the following sentence to the current manuscript to account for this: “Despite its larger size, this animal was still reproductively immature and so no gonadal tissues were collected.”

      •The authors include EdU experiments in Figure S2 but discuss them only briefly in the text. If these experiments provide new insights, they should be elaborated on; otherwise, they could be removed from the manuscript.

      Answer: We have removed these data from the manuscript.

      • As this paper is primarily a resource for the cnidarian community, ensuring easy access is crucial for enabling species comparisons. I recommend making the data openly available through a single-cell portal, as done in Juliano et al. (2019).

      Answer: We have already released these data on the UCSC cellbrowser platform, as was stated in the manuscript. These data have been updated to reflect the current status of the analyses and are publicly available at www.jellyfish-atlas.cells.ucsc.edu

      Reviewer #2 (Significance (Required)): This well-written paper is a valuable resource for the cnidarian community. A key cell type driving the transition from polyp to free-swimming medusa is the cnidarian striated muscle, which has only been morphologically identified in medusozoan jellyfish. While the study lacks functional analyses, further biological validations, such as in situ hybridizations, are needed to confirm the single-cell data. Nevertheless, it lays a strong foundation for the Aurelia research community to utilize single-cell atlas data in future studies. To maximize its impact, the authors should ensure the data is easily accessible to the broader scientific community.

      We thank this reviewer for their recognition of the importance of this work. We have ensured that the data are available for download through the UCSC cell browser, and all scripts used in the data analysis are available on our GitHub page. We additionally included our new gene models that are associated with the single cell data on the companion UCSC genome browser website, which now hosts the NCBI genome assembly with our gene models.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Link and collaborators presents a well-executed and thorough analysis (statistically significant) of cell types and developmental trajectories in Aurelia coerulea, a cnidarian with a medusa stage. While previous cnidarian cell atlases have focused on embryo-to-polyp development, this study uniquely incorporates adult medusa-stage cells, providing novel insights into cnidarian biology.

      The authors successfully identify a broad range of cell types and precursors in both polyp and medusa stages. By comparing transcriptional profiles, they demonstrate the presence of new cell types, such as neurons, in the medusa. Notably, they provide compelling evidence for the coexistence of both striated and smooth muscle within cnidarians-a topic they have explored in previous work. Their morphological analysis further suggests that striated and smooth muscle forms can exist within single cells, which is particularly intriguing. Overall, the results are convincing.

      A major strength of this study is the extensive number of cells analyzed and the rigorous classification of cell identities based on transcriptional profiles. Unlike many single-cell studies, the authors complement their findings with morphological, immunochemical, and in situ data, strengthening their conclusions. Conducting such an analysis without a fully annotated genome presents a significant challenge, yet the authors navigate this limitation effectively.

      One relative limitation, common to many single-cell studies, is the lack of detailed spatial information on the identified subtypes. While the authors have made efforts in this direction, a higher-resolution atlas that pinpoints these subtypes within the body would enhance the impact of the study. The absence of transgenic tools with cell-type-specific enhancers makes this difficult, but it remains a valuable avenue for future research. Despite this, the study's novelty and quality-particularly its inclusion of medusa-stage data-make it a strong candidate for publication in any journal associated with Review Commons.

      Minor Comments: • The term "terminal cell type markers" may not be the most appropriate for transcription factors that regulate state or specification. A more precise term, such as "state or specification transcriptional regulators," might be preferable.

      Answer: This term does not appear in the revised manuscript.

      • The suggestion that cell-type specification is not governed by a random collection of TFs seems self-evident. If not TFs, what alternative regulatory mechanisms (e.g., post-transcriptional regulation, small RNAs) are being implied?

      Answer: In the revised manuscript we have removed the focus on TFs.

      • The rationale behind the observation that "'early' cells separate along three principal trajectories (cnido.1, cnido.2, and cnido.3m), then converge upon a second mature transcriptomic phenotype" could be more clearly explained.

      Answer: This is a phenomenon that is now well established for cnidarians from the perspective of single cell transcriptomics (Chari et al, 2021: Clytia; Steger et al, 2022, Cole et al 2024, Plessier and Marlow 2026: Nematostella; Cazet et al 2023: Hydra). This phenomenon is also described here in terms of the sequence of transcription factors that are activated in both Aurelia and Nematostella. We have modified the introductory text to better place these observations in context as follows:

      Recently we reported that within the sea anemone Nematostella vectensis, specification of the distinct cnidocyte types is marked by a diverging transcriptomic profile corresponding to the formation of the different capsule types, which then undergo a molecular switch demarcated by up-regulation of GFI1B and converge upon a secondary neural-like expression profile (11). Notably, we find a similar forked trajectory within the cnidocyte population of Aurelia (Fig. 3A). A cluster of SoxC-expressing ‘early’ cells separates along two principal trajectories (cnido.1, cnido.2), which then converge upon a second mature transcriptomic phenotype upon activation of jun/fos (Fig. 3E).

      • The illustrations of the nervous system in the ephyra and rhopalia are intriguing but lack spatial context for different neuronal populations beyond the positioning of class 2 neurons ("alpha- and beta-tubulin cells").

      Answer: We expanded the introduction to give a better understanding of the different neuron populations in the context of findings from related publications. The text now reads:

      This rhopalia nervous system develops during polyp-medusa metamorphosis and is composed of specialized light- (pigment cup) and gravity- sensing (lithocyte/statocyst) cells, segregated into individual compartments with different developmental origins (12). Rhopalia development involves the gene expression of otx1, pit1 and brn3 in the pigment-cup (10),.... p4/5

      Further, we used findings from previous studies to describe our results in more detail and to discuss them, for example:

      The ins-negative populations in both species express pou4 orthologs, also called brn3 (10), that is expressed also within the cnidocyte lineages and thus further supports claims of a close relationship between cnidocytes and insulinoma-negative/pou4-positive n2 neurons (13,14,52). p22

      • Muscle characterization is well-supported by phalloidin staining and gene markers, but is there a specific marker for smooth muscle? Myophilin-like-2 is mentioned, but is it definitive?

      Answer: Yes, there are many, as tabulated in supplemental Data S1.11. For example, myophilin-like-2 [calponin] is a specific marker for smooth muscle cells, and this is demonstrated via in situ hybridization in Fig. 6.

      • The finding that ~40% of genes distinguishing smooth and striated muscle lack homologs in other animals is striking. It may be worth investigating their expression patterns via in situ hybridization, particularly for those that differentiate muscle types. The fact that these genes are of unknown affinity does not mean they are uninformative.

      Answer: A variety of factors can lead to a lack of orthology information among the gene models, including fragmented gene models and the inclusion of unidentified lncRNAs, among others. Because of this ambiguity, and because these causes have not been identified, we have removed this observation from the current manuscript. In fact, with the updated mapping tool and current gene annotations this number has fallen to only ~28% of the identified muscle-specific gene models, from a total of ~38.7% unannotated gene models in the entire transcriptome. This is similar to other cell types in the dataset (between ~20%-35%), and also similar to the proportion of unannotated genes in the sea anemone Nematostella vectensis (36.5% overall).

      • The incompleteness of Aurelia genomes is acknowledged as a limitation. However, since the San Diego strain genome appears to be the most complete, is there a reason it was not used in this study? Was it not possible to recover the same strain?

      Answer: We have a standing culture in the lab that was used for these collections. While we considered generating a genomic assembly for this laboratory strain, we have concluded that this is not an effective use of resources at this time. However, we have now updated the reference used for mapping: it is based on a re-analysis of the available Aurelia coerulea isolate AC-2021 genome (NCBI: GCA_039566865.1) annotated with the Gnomon 9.0 automated annotation pipeline, and supplemented with our in-house transcriptome to recover ~5000 additional gene model coordinates on the genome. These are available now via the UCSC genome browser website.

      We further thank this reviewer for the overall positive assessment of our work, and hope that the revised version further strengthens the data analysis and contribution to the community as a whole.

      Referees cross-commenting

      Referees, I generally agree with their assessments. Below, I outline my main concerns and suggestions for improvement.

      Figures and Data Presentation

      I concur with Referee 1 that the figures are overcrowded, making it difficult to interpret individual panels. The excessive number of panels within a single figure creates unnecessary complexity. Some of these could be moved to the supplementary materials to improve readability. It seems that the authors aim to present every possible data analysis, but this is not necessary within the main text. As Referee 1 also noted, the key findings should be clearly visible, allowing the reader to follow the story without getting lost in excessive detail.

      Answer: We have re-structured most of the figures with this in mind and hope that we have achieved better clarity. Many of the data analyses in the previous versions have been removed if not directly related to the observations highlighted in the current version.

      Additionally, the annotation of clusters remains unclear, a concern also raised by other referees. The manuscript would benefit from a more explicit description of how these clusters were assigned.

      Answer: We have expanded our description of how we assigned identities to the nine principal cell type families as follows:

      (pg. 8) The inner epithelia, or gastrodermis, expresses several collagens, which is characteristic of the inner cell layer of anthozoans (39); the outer cell layer houses the ring musculature and is rich in contractile proteins. The striated muscle cluster is also rich in contractile proteins and is the only principal cell population absent from the polyp-derived samples (Fig. 2C). The mucin gland expresses mucins, the digestive gland expresses other digestive enzymes, and the neural cluster expresses synapsin and other conserved known neural regulators such as ashA. The cnidocytes express mini-collagens and are enriched in pathways targeting the endoplasmic reticulum (40).

      Writing and Discussion

      While I do not have major concerns with the writing, I suggest expanding the discussion, particularly regarding the relationship between muscle cell types and the diversification of paralogs. If the figures are streamlined, the text can also be made more concise, avoiding exhaustive references to every individual data point.

      Clarifications on the Muscle Section

      Several aspects of the muscle analysis require clarification: • The differences between muscle cell types are based on a set of differentially expressed genes, 40% of which (in each set) are of unknown affinities. However, it is surprising that the regulatory genes shared between both muscle profiles are expressed in bilaterian smooth muscles. The manuscript does not address whether bilaterian striated muscles share regulatory genes with the Aurelia striated muscle set. This comparison would be valuable.

      Answer: With the latest mapping tool the percentage of muscle-specific genes of unknown affinities has dropped to ~28% and we no longer highlight this observation in the manuscript. Regarding the regulatory genes shared with smooth muscles of bilaterians, we feel this may be a misunderstanding. In Fig. 7 we clarify that these are structural proteins regulating the contraction of the muscle (e.g. Myosin light chain kinase and calponin). With respect to the developmental regulators, e.g. muscle cell type determining transcription factors, we list several in Data S1.3b, S1.4b. A broader phylogenetic and also functional analysis of these transcription factors in different jellyfish species is the focus of another collaborative study and therefore we do not include an in depth discussion of this topic in the current manuscript.

      • The high proportion of unknown genes is concerning. Is this due to issues with the transcriptome assembly, or is it a consequence of insufficient comparative analyses? The statement that "Mapping to this final transcriptome increased confidently mapped genes to 60%" raises questions-does this mean that 40% of differentially expressed genes remain unmapped? This point should be clarified.

      Answer: With the latest mapping tool, we now recover a confident alignment for ~80% of the sequences (see supplementary data S2.1). With the previous tool this value was only 60%, which means that 40% of the sequence data could not be used at all to generate the expression matrix. This is a different feature of the data analysis than the identity of the gene models. However, the statement mentioned here no longer appears in the current version of the manuscript.

      • Given the large number of differentially expressed genes with unknown function, could the authors perform in situ hybridization assays on a subset of these genes? This could provide insights into their spatial expression patterns and potential functional relevance.

      Answer: This is an intriguing suggestion; however, given that in situ hybridization for medium and low expressed genes is extremely difficult in this organism, we feel that this is beyond the scope of this study.

      • Both muscle types appear to rely on a similar contractile apparatus but exhibit differential usage of paralogs. This finding is intriguing but is not sufficiently discussed. Are other cell types associated with the differential use of paralogs? Expanding this discussion would add depth to the manuscript.

      Answer: We thank the reviewer for this insightful comment. Indeed, there is circumstantial evidence that differential usage of paralogs is also found among other cell types, e.g. neurons. We indeed discuss the example of a few other genes, e.g. ATOH-like transcription factors and myc. However, the diversity of neuronal populations is very large, which makes the picture quite complex. We are currently working on a phylogenetic framework of cell type families and also between species to address this point, but this requires more theoretical and methodological work. In this paper, we therefore restricted the analyses to the structural proteins of the two types of muscles, which facilitates the assignment of paralogs to either muscle. We point out that this is reminiscent of the differential expression of paralogs in the fast and slow contracting muscle cell types in Nematostella, suggesting that such a subfunctionalization may generally drive also the physiological diversification of muscle cell types in cnidarians (and of animals in general). Future work is aiming to address this on a broader scale, as suggested by the reviewer.

      Neuronal Subtypes

      I reiterate my previous comment regarding neuronal types: • The enrichment of neural subtypes in the medusa stage is an interesting, albeit expected, finding. However, the manuscript lacks details regarding their specific spatial distribution within the body. Providing this information would enhance the biological relevance of the findings.

      Answer: In situ hybridization for neurons is a challenge in all cnidarians, because the small neurons with very thin neurites are embedded and intermingled between many other cell types. In Aurelia, this has proven to be particularly difficult. At the very best, one might see small cell bodies stained; however, the staining fails to visualize neurites. We also tried HCR (hybridization chain reaction) in combination with antibody staining (b-Tubulin) to get to single cell resolution. However, the results were not conclusive and we therefore refrain from showing them in the paper. As an alternative we connected the findings of previous studies (Nakanishi et al., 2009, 2010) in terms of certain types of neurons located in different compartments of the rhopalia and corresponding marker genes with our single cell data (introduction/discussion). We acknowledge that more work needs to be done, best by generating specific antibodies against neuronal antigens. However, this is beyond the scope of this paper.

      References

      I also agree with Referee 2 that some statements require further substantiation with appropriate references. Strengthening these points with supporting literature would improve the rigor of the manuscript.

      Answer: We added appropriate references at all places indicated, as detailed above.

      Final Remarks

      Overall, while the study presents interesting findings, the manuscript would benefit from a clearer organization of figures, a more explicit explanation of muscle and neural subtype findings, and a deeper discussion on the significance of unknown genes and paralog usage. Addressing these concerns will enhance the clarity and impact of the paper.

      Reviewer #3 (Significance (Required)):

      Overall, this is a significant and well-supported study that advances our understanding of cnidarian cell diversity and muscle evolution. By examining how cell types change across the polyp and medusa stages, this study provides valuable insights not only into cnidarian development but also into broader evolutionary questions regarding the emergence of new body plans and tissue types. As a developmental biologist specializing in invertebrates, I find the results of this work particularly remarkable. It provides valuable insights into the developmental processes occurring in pre-bilaterian animals, shedding light on how cell types emerge and diversify in early-diverging metazoans.

      Answer: We thank reviewer 3 for this positive evaluation.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Link et al. have studied cell type diversity in the scyphozoan Aurelia coerulea. More specifically, they compared several stages in the animal's life cycle using single-cell RNA-seq. Many members of the cnidarian clade Medusozoa (scyphozoans included) have a metagenetic lifecycle that includes a sessile, clonally reproducing polyp and a free swimming, sexually reproducing medusa (jellyfish). The two phases are fundamentally different in their functional morphology, but the cellular basis of this difference has been unknown. The authors generated single cell RNA-seq libraries from eight life-cycle stages of the animal, including polyps and medusae. Their main finding is that different cell types underlie polyp-medusa transition in this animal. Although expected intuitively, this finding has never been demonstrated experimentally. Moreover, a recent study on a colonial hydrozoan (Salamanca-Diaz et al. 2025) has shown that colony parts, as opposed to different life stages, use largely the same cellular components. Therefore, the current study is of broad interest to developmental and evolutionary biologists. Overall, the experiments and data analyses have been performed to a high standard, the figures are of good quality, and the manuscript is well written. Below are a few minor points to be addressed.

      The Aurelia strain used in the study is somewhat ambiguous (suggested to be A. coerulea). The authors' statements on pp. 24, 25 are somewhat confusing--they first say they got over 90% alignment to the San Diego strain genome assembly but then state (in the 'Transcriptome mapping' section) that they got only 40% of their reads aligned, forcing them to use Trinity de novo transcriptome assembly. Please clarify.

      Answer: Alignment to the genome is different from assignment of the alignment to a gene model. Ambiguous alignment cannot be assigned, and missing gene models would not have an assignment. However, we have switched the mapping tool used for this dataset for one that fits both genome sequence alignment AND gene model assignment better than the previously available choices. We now have ~80% of all sequences unambiguously aligned to the genome.

      p. 7--the authors state that some transcription factor families are over/underrepresented as terminal type markers. How do they know which cells are terminally differentiated?

      Answer: We have removed our focus on transcription factor families in this work and recognize that the definition of a terminally differentiated cell state from single cell transcriptomics has not been clearly defined.

      The homeobox gene Tlx has been reported to be associated with medusa development, being absent in taxa without medusae (Travert et al. 2023). Is it expressed in the Aurelia medusa (I couldn't find it in the data), and if so, where?

      Answer: This is indeed a good point that we were also interested in. However, Tlx is detected ONLY in the ephyra libraries and at very low levels, which is why we chose to avoid discussing it: the low detection prevents accurate reporting of the expression and could instead reflect a mapping problem for this gene (mis-annotated 3’ end). As information for this reviewer, the gene model shows some spurious reads specifically in a few neuron subtypes, and outside the ephyra is lowly detected ONLY in the medusa library for medusa neuron n.7 (n2.7m).

      I do not quite understand the authors' arguments for independent striated muscle evolution in cnidarians and bilaterians. Key striated muscle genes (e.g., titin) are present in hydrozoan and anthozoan genomes; furthermore, the expression pattern of Otx is not indicative because its function in medusozoans is unknown. What are the arguments against an alternative scenario in which striated muscles evolved before the cnidarian-bilaterian split, but were lost in anthozoans?

      Answer: This is indeed a complex question, which requires a more thorough and targeted comparative analysis. We note that a BLAST hit for Titin can be misleading due to the many domain repeats of Titin, which are also found in other proteins. To be more prudent, we removed this part from the manuscript. This will be the subject of a future, thorough study.

      p. 27, the link https://github.com/technau/AureliaAtlas is broken.

      Answer: We appreciate this comment and have ensured that the GitHub archive is publicly available with all relevant scripts associated with all versions of the bioRxiv record.

      p. 24 (limitations of the study section), the authors refer to "cosmopolitan species"; they probably mean "genus".

      Answer: We changed this to “taxon” and dropped “cosmopolitan”.

      p. 24-25 on two occasions in the M&M sections, the authors put the abbreviation first and the initials in brackets (ASW and BSA).

      Answer: This has been corrected.

      "Metagenic" should be "metagenetic"

      Answer: This has been corrected.

      Reviewer #4 (Significance (Required)):

      The study is of broad interest to developmental and evolutionary biologists. It addresses an important question, not dealt with directly in previous studies.

      Answer: We thank reviewer 4 for this positive and encouraging assessment.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Link et al. have studied cell type diversity in the scyphozoan Aurelia coerulea. More specifically, they compared several stages in the animal's life cycle using single-cell RNA-seq. Many members of the cnidarian clade Medusozoa (scyphozoans included) have a metagenetic lifecycle that includes a sessile, clonally reproducing polyp and a free swimming, sexually reproducing medusa (jellyfish). The two phases are fundamentally different in their functional morphology, but the cellular basis of this difference has been unknown. The authors generated single cell RNA-seq libraries from eight life-cycle stages of the animal, including polyps and medusae. Their main finding is that different cell types underlie polyp-medusa transition in this animal. Although expected intuitively, this finding has never been demonstrated experimentally. Moreover, a recent study on a colonial hydrozoan (Salamanca-Diaz et al. 2025) has shown that colony parts, as opposed to different life stages, use largely the same cellular components. Therefore, the current study is of broad interest to developmental and evolutionary biologists. Overall, the experiments and data analyses have been performed to a high standard, the figures are of good quality, and the manuscript is well written. Below are a few minor points to be addressed.

      The Aurelia strain used in the study is somewhat ambiguous (suggested to be A. coerulea). The authors' statements on pp. 24, 25 are somewhat confusing--they first say they got over 90% alignment to the San Diego strain genome assembly but then state (in the 'Transcriptome mapping' section) that they got only 40% of their reads aligned, forcing them to use Trinity de novo transcriptome assembly. Please clarify.

      p. 7--the authors state that some transcription factor families are over/underrepresented as terminal type markers. How do they know which cells are terminally differentiated?

      The homeobox gene Tlx has been reported to be associated with medusa development, being absent in taxa without medusae (Travert et al. 2023). Is it expressed in the Aurelia medusa (I couldn't find it in the data), and if so, where?

      I do not quite understand the authors' arguments for independent striated muscle evolution in cnidarians and bilaterians. Key striated muscle genes (e.g., titin) are present in hydrozoan and anthozoan genomes; furthermore, the expression pattern of Otx is not indicative because its function in medusozoans is unknown. What are the arguments against an alternative scenario in which striated muscles evolved before the cnidarian-bilaterian split, but were lost in anthozoans?

      p. 27, the link https://github.com/technau/AureliaAtlas is broken.

      p. 24 (limitations of the study section), the authors refer to "cosmopolitan species"; they probably mean "genus".

      p. 24-25 on two occasions in the M&M sections, the authors put the abbreviation first and the initials in brackets (ASW and BSA).

      "Metagenic" should be "metagenetic"

      Significance

      The study is of broad interest to developmental and evolutionary biologists. It addresses an important question, not dealt with directly in previous studies.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Link and collaborators presents a well-executed and thorough analysis (statistically significant) of cell types and developmental trajectories in Aurelia coerulea, a cnidarian with a medusa stage. While previous cnidarian cell atlases have focused on embryo-to-polyp development, this study uniquely incorporates adult medusa-stage cells, providing novel insights into cnidarian biology.

      The authors successfully identify a broad range of cell types and precursors in both polyp and medusa stages. By comparing transcriptional profiles, they demonstrate the presence of new cell types, such as neurons, in the medusa. Notably, they provide compelling evidence for the coexistence of both striated and smooth muscle within cnidarians-a topic they have explored in previous work. Their morphological analysis further suggests that striated and smooth muscle forms can exist within single cells, which is particularly intriguing. Overall, the results are convincing.

      A major strength of this study is the extensive number of cells analyzed and the rigorous classification of cell identities based on transcriptional profiles. Unlike many single-cell studies, the authors complement their findings with morphological, immunochemical, and in situ data, strengthening their conclusions. Conducting such an analysis without a fully annotated genome presents a significant challenge, yet the authors navigate this limitation effectively.

      One relative limitation, common to many single-cell studies, is the lack of detailed spatial information on the identified subtypes. While the authors have made efforts in this direction, a higher-resolution atlas that pinpoints these subtypes within the body would enhance the impact of the study. The absence of transgenic tools with cell-type-specific enhancers makes this difficult, but it remains a valuable avenue for future research.

      Despite this, the study's novelty and quality-particularly its inclusion of medusa-stage data-make it a strong candidate for publication in any journal associated with Review Commons.

      Minor Comments:

      • The term "terminal cell type markers" may not be the most appropriate for transcription factors that regulate state or specification. A more precise term, such as "state or specification transcriptional regulators," might be preferable.
      • The suggestion that cell-type specification is not governed by a random collection of TFs seems self-evident. If not TFs, what alternative regulatory mechanisms (e.g., post-transcriptional regulation, small RNAs) are being implied?
      • The rationale behind the observation that "'early' cells separate along three principal trajectories (cnido.1, cnido.2, and cnido.3m), then converge upon a second mature transcriptomic phenotype" could be more clearly explained.
      • The illustrations of the nervous system in the ephyra and rhopalia are intriguing but lack spatial context for different neuronal populations beyond the positioning of class 2 neurons ("alpha- and beta-tubulin cells").
      • Muscle characterization is well-supported by phalloidin staining and gene markers, but is there a specific marker for smooth muscle? Myophilin-like-2 is mentioned, but is it definitive?
      • The finding that ~40% of genes distinguishing smooth and striated muscle lack homologs in other animals is striking. It may be worth investigating their expression patterns via in situ hybridization, particularly for those that differentiate muscle types. The fact that these genes are of unknown affinity does not mean they are uninformative.
      • The incompleteness of Aurelia genomes is acknowledged as a limitation. However, since the San Diego strain genome appears to be the most complete, is there a reason it was not used in this study? Was it not possible to recover the same strain?

      Referees cross-commenting

      I generally agree with the other referees' assessments. Below, I outline my main concerns and suggestions for improvement.

      Figures and Data Presentation

      I concur with Referee 1 that the figures are overcrowded, making it difficult to interpret individual panels. The excessive number of panels within a single figure creates unnecessary complexity. Some of these could be moved to the supplementary materials to improve readability. It seems that the authors aim to present every possible data analysis, but this is not necessary within the main text. As Referee 1 also noted, the key findings should be clearly visible, allowing the reader to follow the story without getting lost in excessive detail.

      Additionally, the annotation of clusters remains unclear, a concern also raised by other referees. The manuscript would benefit from a more explicit description of how these clusters were assigned.

      Writing and Discussion

      While I do not have major concerns with the writing, I suggest expanding the discussion, particularly regarding the relationship between muscle cell types and the diversification of paralogs. If the figures are streamlined, the text can also be made more concise, avoiding exhaustive references to every individual data point.

      Clarifications on the Muscle Section

      Several aspects of the muscle analysis require clarification:

      • The differences between muscle cell types are based on a set of differentially expressed genes, 40% of which (in each set) are of unknown affinities. However, it is surprising that the regulatory genes shared between both muscle profiles are expressed in bilaterian smooth muscles. The manuscript does not address whether bilaterian striated muscles share regulatory genes with the Aurelia striated muscle set. This comparison would be valuable.
      • The high proportion of unknown genes is concerning. Is this due to issues with the transcriptome assembly, or is it a consequence of insufficient comparative analyses? The statement that "Mapping to this final transcriptome increased confidently mapped genes to 60%" raises questions: does this mean that 40% of differentially expressed genes remain unmapped? This point should be clarified.
      • Given the large number of differentially expressed genes with unknown function, could the authors perform in situ hybridization assays on a subset of these genes? This could provide insights into their spatial expression patterns and potential functional relevance.
      • Both muscle types appear to rely on a similar contractile apparatus but exhibit differential usage of paralogs. This finding is intriguing but is not sufficiently discussed. Are other cell types associated with the differential use of paralogs? Expanding this discussion would add depth to the manuscript.

      Neuronal Subtypes

      I reiterate my previous comment regarding neuronal types:

      • The enrichment of neural subtypes in the medusa stage is an interesting, albeit expected, finding. However, the manuscript lacks details regarding their specific spatial distribution within the body. Providing this information would enhance the biological relevance of the findings.

      References

      I also agree with Referee 2 that some statements require further substantiation with appropriate references. Strengthening these points with supporting literature would improve the rigor of the manuscript.

      Final Remarks

      Overall, while the study presents interesting findings, the manuscript would benefit from a clearer organization of figures, a more explicit explanation of muscle and neural subtype findings, and a deeper discussion on the significance of unknown genes and paralog usage. Addressing these concerns will enhance the clarity and impact of the paper.

      Significance

      Overall, this is a significant and well-supported study that advances our understanding of cnidarian cell diversity and muscle evolution.

      By examining how cell types change across the polyp and medusa stages, this study provides valuable insights not only into cnidarian development but also into broader evolutionary questions regarding the emergence of new body plans and tissue types.

      As a developmental biologist specializing in invertebrates, I find the results of this work particularly remarkable. It provides valuable insights into the developmental processes occurring in pre-bilaterian animals, shedding light on how cell types emerge and diversify in early-diverging metazoans.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This paper is well-written and serves as a valuable resource not only for the cnidarian community but also for researchers studying cell type identity and evolution more broadly. A key cell type enabling the transition from polyp to free-swimming medusa is the cnidarian striated muscle, which has only been morphologically identified in medusozoan jellyfish. While this study does not include functional analyses, it lays the foundation for the Aurelia research community to leverage single-cell atlas data for future investigations.

      Key experiments supporting the paper's main conclusions are missing:

      • At the beginning of the Results section, the authors mention identifying a previously undescribed developmental stage, which they name "clover-polyp". However, they do not later discuss whether this newly identified stage has a distinct gene expression signature. This point should be addressed in the paper or removed.
      • A key reference is missing in the following sentences:

      "The anthozoan Nematostella vectensis has two principal neural sub-families that have been described that correspond to those with insulinoma expression (n1) and those with pou4 expression (n2) (13,14)."

      "The class n1 family also includes putatively non-neural secretory cell types ("s"), which are enriched in genes associated with digestion and extracellular matrix production (Data S1.10). These data suggest a close relationship between neurons and gland cells, like what has been suggested in other cnidarians (13,27)."

      "Thus, similar to that described for the anthozoan Nematostella vectensis (13,14), Class 1 neurons and related secretory cells comprise the predominant type of neuroglandular cells in the polyp stage. Further, these are the primary neuroglandular cells within the gastrodermis of the medusa."

      The first functional analysis of NvInsm1+ expressing neurons and secretory cells in Nematostella vectensis was conducted by Tournière, O. et al. (2022), making it essential to cite this work.

      • To validate the neuronal component of this single-cell data, it is essential to confirm the N1 and N2 populations and demonstrate that they do not overlap. I recommend performing in situ hybridization or antibody staining for Insm1+ and Pou4+ cells (or any other suitable markers for these populations) to show that they are expressed in distinct cells/regions in Aurelia.
      • What is labelled in yellow in Figure 5C? The legend should be updated.
      • Figure 5i, j, and k are not clear; the paper would benefit from bright-field pictures.
      • Each figure should connect specific gene expression at a given stage with the corresponding single-cell expression data in a dot plot. For instance, in Figure 6, myofillin-like 2, mhc1, and mhc2 should be accompanied by their respective single-cell expression data at this stage in a dot plot.
      • The authors repeatedly refer to the polyp as asexual and the medusa as sexual; however, they do not mention any gonadal cluster nor discuss its absence from their single-cell data.
      • The authors include EdU experiments in Figure S2 but discuss them only briefly in the text. If these experiments provide new insights, they should be elaborated on; otherwise, they could be removed from the manuscript.
      • As this paper is primarily a resource for the cnidarian community, ensuring easy access is crucial for enabling species comparisons. I recommend making the data openly available through a single-cell portal, as done in Juliano et al. (2019).

      Significance

      This well-written paper is a valuable resource for the cnidarian community. A key cell type driving the transition from polyp to free-swimming medusa is the cnidarian striated muscle, which has only been morphologically identified in medusozoan jellyfish. While the study lacks functional analyses, further biological validations, such as in situ hybridizations, are needed to confirm the single-cell data. Nevertheless, it lays a strong foundation for the Aurelia research community to utilize single-cell atlas data in future studies. To maximize its impact, the authors should ensure the data is easily accessible to the broader scientific community.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: Overall, this study adds a large amount of data for the scyphozoan Aurelia coerulea by producing several single-cell RNA sequencing libraries that cover the transition from polyp to medusa. The study provides a modern view of cell type diversity and cell-specific transcriptome changes during this period of extreme morphological change in this particular cnidarian lineage, which is understudied. Certain unique cell subtypes, including neural and muscle cell subtypes that are specific to different life stages, are discussed in detail, providing some new insights.

      My overall assessment is that the manuscript has good potential to be impactful, but in its current form it is somewhat clunky and overly complex to read, the figures were too crowded and difficult to comprehend, and the authors did not provide enough context regarding the current state of knowledge and what this study adds to it. In particular, Figure 1 and the section about striated and smooth muscles sharing partial transcriptomic profiles need the most work. The results were presented in the context of the anthozoan Nematostella but this should be broadened further to include other cnidarian single-cell studies, such as those from Hydra and Clytia which are both medusozoans like Aurelia. The writing throughout could be streamlined and simplified to better highlight the major findings as described in the abstract of the paper. Several figures were not well presented or clear and could be improved or decluttered to better communicate and support important results. In addition, some methods were totally missing, and I was unable to access the github repository associated with the paper which should detail all analyses described in the paper. In its current form, reproducibility of analyses would be quite limited. I did greatly appreciate the inclusion of the data on the UCSC Cell Browser, which allows anyone to access the single cell data matrix for visual exploration.

      Major comments:

      The Introduction section was very short - only three paragraphs. I feel that this section could be expanded to give more context about Aurelia as a research organism, and the current resources available. This includes genomic and transcriptomic resources particularly those focused on the transition between life cycle stages (polyp to medusa). Any other relevant background on cell type diversity or if there is anything known about the molecular profile of specific cell types found in different life stages should also be included here. Do marker genes already exist for some of the important cell types discussed in the manuscript? It would be better to present the current state of knowledge, and context for why this study was done, how it builds upon current knowledge, and what it adds to our current understanding so that the study is properly framed from the beginning.

      In the Results section, I find the sentence on p. 4, "Further, ~70% of these gene models do not have readily identifiable orthologs and thus represent putative orphan genes" to be rather confusing. What analysis was performed to determine this percentage, and which set of organisms were compared? Doesn't this percentage seem rather high for a cnidarian? Or is this referring to orthologs outside of cnidaria? Please comment further on how this percentage was determined and possible explanations for it being this high. Right now, it just feels tacked on to this paragraph with no context or further explanation which leads to the confusion.

      Figure 1. There are many issues with this figure that encompass how I felt generally about the figures of the paper. The figure should ideally take up the entire width of the page rather than squishing some text next to the figure.

      Figure 1A: The colors of the different developmental stages from which tissue was sampled (e.g. polyp1, polyp2, polyp.clover) do not seem to match between legend and figure. For example, the "polyp.clover" stage is circled in blue in the schematic, but given a green dot in the legend. The "medusa.manubrium" is circled in orange in the schematic, but given a purple dot in the legend. Suggest making the colors match between legend and schematics.

      Figure 1E: In Panel E, the labels showing that the top graph is "polyp" and the bottom graph is "medusa" are much too small. Increase the font size of the labels. The font size for the GO terms themselves is also too small.

      Figure 1F: The bulk of this study centers around the single-cell RNA sequencing data and resulting analyses from these data. As such, I would expect the cellular atlas resulting from these data to be similarly highlighted. In Figure 1F, the annotated cell atlas as presented is much too small, making it impossible to even add the labels for the different clusters directly on the UMAP. Suggest increasing the size substantially to at least half of the page width, so that it is possible to do so.

      • There should also be a complementary figure in the supplement that shows all of the individual clusters, each in different colors and clearly annotated with labels, rather than just showing multiple clusters that were combined into the major cell types. There is an example of this in the Clytia single cell paper (see Chari et al. 2021 Figure 2A vs Fig S9).
      • The graph on the right of this panel showing the "Distribution of cell types in time and space" is overly complicated with all of the colors, and the meaning is lost as it is quite difficult to interpret at this very small size. Suggest removing and possibly showing as a supplemental figure so that its meaning is easier to assess.
      • In addition, striated muscles are marked on the overall UMAP; however, it is not noted until later that the smooth muscles are part of the "outer epidermis" cluster. Suggest altering the legend or the text of the figure itself to show where the smooth muscles are thought to be in the overall UMAP, especially since they are specifically discussed in depth later in the manuscript. Exactly which "part" of the outer epidermis cluster includes the smooth muscle cells?

      Figure 1G: Panel G, for example, is not useful in conveying its point as the text labels are too tiny and the figure is overly complex to be squished into a panel of this figure. Suggest removing and making 1G a supplemental figure by itself or perhaps together with 1C (as they are linked) where it is more legible. The figure legend text for Fig 1G is also confusing as it refers to "scyphozoa" in (C) but there is no "scyphozoa" in 1C, only "medusa".

      Text, p. 6: The explanation for how the clusters were annotated in Fig 1 and Fig 2 is much too vague. The text states, "We identified 9 broadly defined cell populations, for which we assign identities by assessing up-regulated gene lists (Data S1.3)." What does this mean? How exactly were the up-regulated gene lists assessed? This needs to be clarified further. What genes were used to label these clusters or groups as particular cell types? How does the annotation relate to Supplemental Tables S1.3 and S1.3b? Does the previous literature need to be cited to support these annotations based on specific genes? Suggest providing more detail and context explaining how the single cell clusters were annotated.

      Text, starting on p14: "Striated and smooth muscles share partial transcriptomic profiles." This section is highly confusing and could do with some simplification in both text and figures.

      • The genes for which expression is shown in Fig. 5, 6 and 7 are not properly introduced or given nearly enough context in the text. For example, the text states, "To investigate the dynamics of muscle formation, we further compared phalloidin staining of muscle fields with in situ hybridization detection of specific cluster marker expression in polyps (Fig. 5), strobila (Fig. 6), and ephyra (Fig.7)." However, it is not until the legend of Figure 7 and also much later in the text (in the Discussion, p23) that it is noted what types of muscles each of the genes used in ISH actually mark ("While a small set of genes are shared across the two muscle phenotypes (e.g. stmyhc1 and mrlc2), others are more specific to either phenotype (eg. stmyhc5 in striated muscle; myophilin-like-2 in smooth muscle) (Fig.8A), which were verified by in situ hybridization (Figs.5,6,7)"). This needs to be rewritten and improved for flow and clarity purposes.
      • Suggest that the authors show an overall UMAP of smooth and striated muscle (perhaps the smooth muscle subtypes are part of the large 'outer epidermis' cluster; see the comment for Figure 5B below), and then include featureplots that show the expression of each of the genes used in ISH in these clusters. This might make it clearer as to what type of muscle the genes should be highlighting within each developmental stage. It might look something similar to what is shown in Figure 7P (although it is unclear how the featureplots shown in this figure relate to the UMAP shown in Figure 5B). In addition, the featureplots in Figure 7P only show 3 out of the 4 genes used in ISH which is not helpful. Featureplots should be clearly shown for all genes discussed. This is essential to linking the pattern in the single-cell data to the expression data and is the minimum required to provide clear understanding.
      • The text reads, "To investigate the dynamics of muscle formation, we further compared phalloidin staining of muscle fields with in situ hybridization detection of specific cluster marker expression in polyps (Fig. 5), strobila (Fig. 6), and ephyra (Fig.7)." However, Figure 6 also contains images of ephyra (Fig6. P-S). Suggest that those panels could be included in Figure 7.
      • There are parts of this section text where reference to the Figures is complicated and not easy for the reader to follow. I got particularly confused in trying to follow this part of the manuscript. For example, a sentence on p15 reads, "mrlc2 and stmyhc1 reads are detected in both muscle types (Fig. 7pFig. 5M, Fig 6C,E,G-P, Fig. 7J-L,N-P), and ISH indicates that the expression is localised to the fields of striated muscles in ephyrae (Fig.7J,K,N), as well as the smooth muscle populations in polyps including longitudinal tentacle muscles, radial muscles of oral disc and retractor muscles of the body column (Fig. 5M, Fig.6H,I,L,M), and the muscles of the manubrium in the meta-ephyra (Fig. 7L,O)." It is quite difficult to keep jumping between Figures and panels to look at this. A better organization of the Figures and much clearer text that doesn't jump around could go a long way to making it easier to follow.

      Discussion

      • The authors do try to put their results into context with the two Aurelia genome papers (Gold et al. 2018, and Khalturin et al. 2019) and two additional bulk transcriptome studies (Fuchs et al. 2014, Brekhman et al. 2015), but not until the first part of the Discussion. In principle, this would be fine. However, in practice, their discussion of these studies is somewhat vague and generalized and did not really provide a clear review or analysis of how adding in cell-type specific data is helping our understanding. The argument about how their results fit with previous findings was confusing and unclear. They start by discussing "genome usage" but then switch to talking about cell type diversity across life stages. The connections between "genome usage", "gene representation", and cell types were not easy to follow. Suggest rewriting this section to clearly discuss the findings in this manuscript in the context of previous studies with straightforward and precise language.
      • In the discussion about the neural subtypes, comparisons are only made to Nematostella where there are also two major neural classes. It would be even better to include discussion of single-cell data related to neurons in other cnidarians, such as Hydra, where there is detailed discussion of neuron subtypes in both a published manuscript (Siebert et al. 2019, Science) and a preprint (Primack et al. 2023, bioRxiv) and Clytia (Chari et al. 2021, Science Advances). I do see that Clytia and Podocoryna are mentioned in the next section of the Discussion, specifically related to the Otx gene.
      • The section about muscle subtypes in the Discussion would need to be rewritten in accordance with the changes suggested above for the Results for this section.

      Materials and Methods

      • In the section "Comparison with Nematostella" the authors discuss running OMA to generate the set of identified 1:1 orthologs but never go on to mention how many orthologs were identified. Please report this number so it is clear whether this is a small or large subset of the total analyzed. In a recent study of the Hydra AEP strain (Cazet et al. 2023 Genome Research), a similar analysis was done between Hydra and Clytia and they found 5979 genes with 1:1 orthologs between the two species. There should also be a supplemental datasheet that provides a list of these orthologs (See Supplemental Data S17 provided in Cazet et al. 2023 as an example). I am curious to know how many 1:1 orthologs were found between Aurelia and Nematostella. I would expect there to be a smaller overall number than between Hydra and Clytia due to the larger phylogenetic distance between these two taxa. I also strongly suggest that the Cazet et al. 2023 paper should be referenced, as it was the first time an attempt to compare single-cell datasets between two cnidarian species was done. The current manuscript took an alternative approach to comparing Aurelia to Nematostella, so it would be good to acknowledge this and justify the methods used in this manuscript compared to those used in Cazet et al. 2023.
      • There are missing descriptions of methods throughout the paper. One example is in the section about Transcription Factor families that are over or underrepresented amongst upregulated genes compared to their distribution in the genome - I could not find any description of the methods used to identify these Transcription Factor families in the dataset of Aurelia upregulated genes. How were these families chosen? How were they identified in this dataset?
      • I noticed in the Data and materials availability statement and a few other places in the manuscript, a github repository was mentioned: https://github.com/technau/AureliaAtlas. I tried to access this repository to review what was included, but unfortunately it is not accessible. I found seven repositories within github.com/technau but the AureliaAtlas was not one of them. This repository should include all scripts to generate all figures and other analyses in the paper and should be made available to reviewers to better understand exactly how all analyses were completed. A good example of how this could be done is found in the repository related to Cazet et al. 2023 (https://github.com/cejuliano/brown_hydra_genomes), which is very comprehensive and easy to follow.
      • When I looked through a similar repository https://github.com/technau/CellReports2022/ from the Steger et al. 2022 Cell Reports Nematostella single-cell paper from this same group, I found it to be rather disappointing. They apparently included all code to generate all figures in a single R file that is not easy to follow and not well commented. If this is the same strategy used for this manuscript, I feel that a much stronger effort could be made to make the analyses of this Aurelia manuscript transparent by producing a github that is more like that of https://github.com/cejuliano/brown_hydra_genomes from the Cazet et al. 2023 paper which organizes each type of analysis in a different github subfolder and within each subfolder they include very detailed information and comments explaining each step of each analysis. Doing this would go a long way to making the analyses in this manuscript more transparent and easier to follow and would certainly put some of my concerns to rest.

      Minor comments:

      Figures:

      Figure 2A: In the legend it says "Colour code as in (B) and (C)" but it's really referencing the colors in Figure 1A, correct? It is confusing to have to look back to Figure 1A to understand the colors here.

      Figure 2D: Typo in the word "proteins" in the title of this panel.

      Figure 3F: The placement of the tree and the two featureplots for myc3 in Nematostella and Aurelia is confusing. Suggest moving the featureplot for Aurelia myc3 so that it is beside Nematostella (to the right of the tree) or move the featureplot for Nematostella myc3 so that it is beside the Aurelia featureplot (to the left of the tree).

      Figure 4B: The description of this panel reads, "Distribution-histogram across all samples, medusa-specific cell clusters are highlighted with black outline." However, the black outline is not very clear to the reader. Suggest making it bolder. In addition, this black outline is a little confusing: it should mark the medusa-specific cell clusters, yet it also appears on cell clusters in strobila and ephyra?

      Figure 5B: It is unclear from where this reference UMAP was derived. Does it come from the overall UMAP, showing the 'outer epidermis' cluster only, with the putative smooth muscle cells in red? Or is it the 'outer epidermis' cluster plus the striated muscle cluster? Suggest making this clearer (see above for larger edits to this section of the manuscript).

      Figure 5K/L/M: It is unclear which parts of the polyp in K are used for the images shown in L or M. Both come from the large red box, but it is unclear from which part L and M were made. In addition, the subtraction of the background from the image (to make it look white) is distracting and makes the image itself look artificial.

      Figure 6C, G-S:

      • Not sure what the blue boxes around these panels are meant to highlight?
      • Also not sure what the image in the left of panel C is. Perhaps an oral view of the strobila? The legend or panel itself should mention this.
      • Again, subtraction of the background from the image (to make it look white) in panels C, D and E is distracting and makes the image itself look artificial.

      Figure 6J, M, N, O:

      • For someone not accustomed to looking at images of strobilating polyps, it is unclear which part of the animal these images show and in what orientation. Suggest including some of these details in the figure legend at least. Fig 6O actually looks like an ephyra, but is annotated as an "advanced strobila"?

      Figure 7H:

      • Not sure what the white lines in this panel are meant to indicate?

      Results:

      p5 - In this sentence, "Because these four pouches look like a cloverleaf from above, we call this stage the 'clover-polyp'", suggest changing "clover-polyp" to match Figure 1A (where it is written as polyp.clover), or change the text in the Figure to match the text in the manuscript.

      p8 - In this sentence, "the bZIP protein family are over-represented as terminal cell type markers, while the number of zinc-finger proteins of the N2C2 class are under-represented", the "N2C2" class the authors refer to is not clear. Is there a typo here? In the figure to which this sentence refers (Figure 2D), the proteins referenced are "zf-H2C2" or "zf-C2H2".

      p9 - Typo - should be "medusozoans" rather than "medusazoans".

      p11+ - Section titled, "Aurelia neural complement reveals two neural classes with similarities to anthozoan neurons"

      • I found the classification of N1 and N2 to be confusing, since initially they are described as neural clusters, however N1 in particular is shown to consist of primarily secretory, non-neural cell types. For example, when looking at Figure 4A and B, it is evident that N1 contains only a relatively small number of neural cell-types (in shades of orange), while most of the cells are other secretory, but non-neural cell types (in shades of brown). Not sure if the authors should alter the title to reflect this? For example, instead of 'neural' classes, they could be called 'neuro-secretory' or 'mixed neural and secretory classes'?

      p11 - Text reads, "Class 1 neurons in the medusa are also most prevalent within the gastrodermis and manubrium, and includes one subtype that first appears in the strobila and is found in all medusa tissue samples ("n1.3.medusa"; lower black box Fig. 4F)." However, no "lower black box" is apparent in Figure 4F.

      p13 - The text reads, "We find that class 2 neurons all express elevated levels of specific alpha- and beta- tubulins (TBA1-like3 and TBB-like-1; Fig. 4D)." Make the capitalization of your gene names (TBA1-like3, etc) consistent between text and figure throughout (in Fig. 4D the gene names are lower case).

      p14 - In the first paragraph of this page, Fig. 4C is referenced twice, however both times the referencing sentence does not match this panel (most likely the authors meant to reference 4E, F or G).

      p14 - The final sentence of this upper paragraph, "Specific tubulin-paralog expression within the class n2 neurons suggest that this is the portion of the nervous system labelled by the β-Tubulin antibody." is confusing. Do you mean that the β-tubulin antibody is most likely labelling the product of the tbb-like-1 gene that is shown in the featureplot in Fig 4D? Suggest rewriting this sentence for clarity.

      p14 - on this page and others in the manuscript, there are instances of the word "Aurelia" not being italicized.

      p14 - In this sentence, "In the sea anemone Nematostella, anemone-specific gene duplications of members of the PaTH (Paraxis, Twist Hand-related) bHLH family of protein coding genes was driving the diversification of muscle cell types (29)." the "was driving" part of the sentence is grammatically clunky. Suggest rewording slightly (e.g. "...protein coding genes drive the diversification of muscle cell types").

      p14 - Myophilin-like2 in the text of the manuscript is written as myofilin-like2 in the figure panels (e.g. Fig 5L, Fig. 6D). Make consistent between text and figures.

      p15 - on this page and several instances thereafter, "in situ" is not italicized as it should be.

      p19 - In the line, "Taken all together these data suggest that the contractile apparatus in the Scyphozoa, using here Aurelia as a proxy, is similar to the bilaterian smooth muscle contractile complex (Fig. 8C)." this should really reference Fig. 8B-C.

      Significance

      General assessment:

      I believe this manuscript adds a significant amount of useful data and provides some novel insights into scyphozoan cell types across an important life history transition from polyp to medusa in Aurelia. Adding the dataset to the UCSC Cell Browser is a strength. I think there is the potential to make this an impactful paper, but in its current form it is pretty messy, not clearly presented, and lacks some transparency. The greatest weaknesses lie in: not framing the work adequately, not putting it into enough context with previous work, and not relating it to other medusozoans; the Figures, which are overly crowded and confusing rather than clear and supportive of the results; the lack of explanation for some methods, such as how cell clusters were annotated and how transcription factor families were determined; and the lack of access to the GitHub data repository, which raises questions of reproducibility. It will take a good amount of figure restructuring and reframing to make the study clear and impactful and the methods and analyses reproducible.

      Advance: If the weaknesses are addressed adequately, this study does contribute new insights in the area of further understanding changes across an important scyphozoan life cycle transition in terms of diversity of cell types and their cell-type transcriptomes, opening up further questions which can now be addressed.

      Audience: The broader cnidarian community will be interested in this study. People studying cell type evolution and cell type novelty across the tree of life will also be interested. Anyone looking for examples of how to use modern approaches to understanding life cycle changes in animals will be interested.

      My expertise is in cnidarian cellular and molecular biology and evolution including working with model cnidarian research organisms and employing techniques and approaches similar to those used in this study.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to the Reviewers

      We thank three anonymous Reviewers for their careful examination of our manuscript. Below, we provide a point-by-point response.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      1. EVIDENCE, REPRODUCIBILITY AND CLARITY Summary

      Hubbert and colleagues describe ExTaSy, a CRISPR-Cas9-based platform for the endogenous tagging of proteins in Drosophila melanogaster. The system combines several established molecular tools into a single-vector framework: homology-directed repair (HDR) for the insertion of a 3XHA tag at the endogenous locus, piggyBac transposase-mediated near-scarless removal of a transgenic selection marker, and φC31 integrase-driven recombinase-mediated cassette exchange (RMCE) for subsequent tag swapping. The authors demonstrate the system across a set of 65 genomic loci and provide a bioinformatic pipeline to automate guide RNA and homology arm design.

      Major Comments

      1. Validation of knock-in lines is inadequate and does not reflect current standards in the field. The authors state that correct insertions were confirmed using "two PCRs per inserted fragment done with primers binding to the 5' and 3' ends of the inserted DNA and corresponding gene-specific validation primers." This strategy is well known to produce false positives, as it cannot distinguish correctly targeted single-copy integrants from concatemeric insertions at the target locus (e.g. Skryabin et al., 2020). The current standard for validating CRISPR-mediated knock-ins requires PCR amplification using primers that anneal outside the homology arms and span the entire inserted cassette. These reactions must be performed under conditions that minimise the formation of PCR chimeras, specifically low cycle numbers and use of a high-processivity polymerase. The authors should either provide data from such experiments for their characterised lines, or clearly acknowledge this limitation and qualify their efficiency estimates accordingly (see related point 2 below).

      __Response: __We originally opted for using primers that span a fragment from the inserted DNA into the genomic locus for ease of amplification, which is currently standard in the field (e.g., Kanca et al. 2022). We usually run these PCRs in a heterozygous background (before homozygous stocks are established or because tagged lines remain balanced), and the unmodified locus preferentially amplifies in a whole-fragment PCR. However, we have recently started running whole-fragment PCRs and plan to repeat them for all loci and will report the results in a revised version of the manuscript. We are also revising the manuscript to reflect the necessity (or at least preference) to perform insert-spanning PCRs.

      2. Reported efficiency metrics do not adequately distinguish correctly targeted integrants from marker-positive flies.

      A related concern is that many of the efficiency parameters reported in the manuscript appear to be based solely on the detection of the marker cassette. The 63.1% overall success rate, for example, seemingly reflects the recovery of DsRed-positive flies rather than of sequence validated, single-copy, on-target integrants. These are fundamentally different quantities, with only the latter being of practical value for the users of the described technique. The authors should either provide data that properly accounts for correct integration, or more carefully define what each reported metric represents and explicitly acknowledge the limitations of using marker presence as a proxy for successful knock-in.

      __Response: __The reviewer is correct that the numbers we report are DsRed-positive flies. However, most have been confirmed with end-of-fragment/locus spanning PCRs, so are on-target (although not necessarily single-copy; see comment #1). While we cannot categorically exclude off-target insertions, we have not observed any cases where the DsRed segregates independently of the targeted chromosome, which at least makes off-target insertions on other chromosomes highly unlikely. We will clarify in the text that the 63.1 % success rate relates to DsRed marker expression and insertion site-spanning PCR and acknowledge the limitations as suggested by the reviewer.

      3. The characterisation of tag exchange requires expansion or more careful framing of its scope.

      The possibility of exchanging tags through fly crosses rather than repeated microinjections is, in the view of this reviewer, the most practically useful feature of ExTaSy and the aspect most likely to drive community adoption. It is therefore important that this feature is characterised with sufficient rigour to allow prospective users to assess its reliability. In the current manuscript, tag exchange has been demonstrated at only five loci using a single replacement tag (sfGFP). The dataset includes one outright failure (the Met C-terminus) and one instance of an unexpected 9 bp insertion at the recombination site, leaving the success rates and failure modes across a broader range of loci and tags uncharacterised. The authors should either expand the tag exchange experiments to cover a more representative set of conditions, or frame the current data explicitly as a proof of concept and limit their conclusions about the practical utility of tag exchange accordingly. In either case, the value of this work to the community would be substantially increased if a collection of donor lines carrying the most commonly used tags for different applications, as the authors themselves enumerate in the Discussion, were generated and deposited at a public stock centre such as the VDRC concurrent with publication. On this note, it is also worth flagging that at present the plasmids described in this study have not yet been deposited at Addgene or the European Plasmid Repository, and that fly lines are available only on request. For a methods paper aimed at community adoption, deposition of reagents in publicly accessible repositories at the time of publication is the expected standard.

      __Response: __We are in the process of increasing the number of fly stocks for which tags have been exchanged and will be able to provide a more rigorous characterization with an updated version of the manuscript. We are also working on additional swap lines (for example T2A-GAL4). Regarding submission of the materials to relevant databases, we are in the process of depositing the plasmids on Addgene. We plan to deposit the swap lines and other toolkit stocks (new hs-Flp, vas-int lines as well as pBac transposase lines) at the VDRC or BDSC. To make the tagged fly lines viable for distribution via the VDRC, we are working to increase their numbers, and we plan to publish them separately as a resource, where we also plan to characterize the expression of more transcription factors and their isoforms in greater detail.

      4. The Introduction should better reflect the current state of the field, including explicit comparison with MiMIC and CRIMIC.

      The introduction would benefit from a clearer distinction between transgene-based approaches that introduce additional gene copies and true CRISPR-mediated knock-ins at the endogenous locus. As it stands, the discussion of prior methods does not sufficiently acknowledge that CRISPR-based knock-in is already the standard approach in Drosophila, and that the individual techniques employed in ExTaSy are well established. Notably, the MiMIC and CRIMIC systems (Nagarkar-Jaiswal et al., 2015; Li-Kroeger et al., 2018), which also support RMCE-based tag exchange at endogenous loci and for which large collections of lines are already publicly available, are not adequately discussed. These are arguably the closest comparators to ExTaSy, and the authors should explicitly address how their approach differs from and offers advantages over this existing framework, particularly given that MiMIC/CRIMIC insertions can also tag internal sites and thus avoid some of the terminus-specific complications described here.

      __Response: __We will expand the introduction and the discussion to give more reference to other resources for endogenously and exogenously tagged genes in Drosophila and compare ExTaSy in greater detail with other methods, highlighting advantages and disadvantages of each and making clear that RMCE-based tag exchange and marker removal are not novel inventions.

      Minor Comments

      5. The labelling of sgRNA target sites in Figure 1 is inaccurate and should be corrected.

      In Figure 1, the sgRNA target sites are annotated with triangles labelled "PAM synth." The presence of a PAM is necessary but not sufficient to define a target site; the label should therefore be changed to "target site" or an equivalent term. Additionally, the Methods section incorrectly expands PAM as "primary adjacent motif"; the correct expansion is "protospacer adjacent motif."

      __Response: __The labelling in Figure 1 will be changed and the PAM abbreviation corrected.

      6. Could the fly crossing scheme in Figure S3 be simplified?

      In the scheme in Fig. S3 the second step seems to be intended to introduce the hs-Flp and vas-int transgenes. Would it not be possible to incorporate the integrase into the swap fly line when it is made and the hs-Flp into the ExTaSy line, thereby saving one generation?

      __Response: __This would in principle be possible; however, we prefer to keep the lines “clean” in case a tag exchange is not desired, and so this would require an initial crossing step. We therefore prefer the crossing scheme as it is.

      7. Figure 1F has no call out in the main text.

      __Response: __This will be corrected.

      8. Line 155: What was the reason for the low survival rate? Is this likely to be indicative of a problem during marker removal, or a stochastic event as not all fly crosses are always productive (bad food, early death of flies, etc.)?

      __Response: __This was a stochastic event. The fly line we used for expression of piggyBac transposase (BDSC_8285) is generally not growing well, and we could only use one eighth of all offspring to ensure correct segregation. We will make this clear in the text.

      9. Line 160: What is the N number of "all cases"?

      __Response: __This will be changed to “We performed Sanger sequencing for one established line for each of the 17 loci and confirmed clean excision of the piggyBac sites in all cases.”

      10. Scale bars are missing in Fig. 3g,h.

      __Response: __These will be included.

      11. Line 219: The labeling of the panels got mixed up. Panel F does not show an immunostaining.

      __Response: __The labeling will be corrected.

      12. Line 226 and Fig. 3h: It is unclear what area is shown in the inlay. The overview image highlights three POIs, but none seem to fit the inlay.

      __Response: __The images were indeed misleading as the inlay did not show a magnification of the same focal plane. We will show the inlay together with the overview of the corresponding focal plane as part of Supplementary Figure 5 and will amend the text accordingly.

      13. Line 233: Why was the transgenic marker not removed? The authors want to highlight the ease and advantage of marker removal, so leaving in the marker is an odd choice.

      __Response: __In this case, we observed that flies become homozygous even with the marker, so we assumed that a marker removal would not be necessary. We are currently performing additional experiments to remove the marker and repeat the staining, which we will submit with a revised version of the manuscript.

      14. Line 250: Why was only one isoform of hth tagged? Without a rationale this seems to be an odd choice, in particular since the authors seem to suggest in the introduction (Line 38) that a disadvantage of previous technologies is the tagging of only selected isoforms.

      __Response: __While expanding the introduction (see comment #4), we will also rephrase it to highlight that current CRISPR-based methods (MiMIC and CRIMIC) are designed to tag all isoforms simultaneously or select isoforms, whereas overexpression constructs are limited to one isoform. In contrast, ExTaSy allows tagging of all isoforms that share a terminus. We will emphasize advantages and disadvantages in the discussion. In the case of hth, three different C-termini are annotated, and we are currently performing experiments to also tag the other termini and co-stain them with Ubx. We will submit the results in a revised version of the manuscript.


      Reviewer #1 (Significance (Required)):

      SIGNIFICANCE

      ExTaSy assembles a set of well-established tools, namely CRISPR-mediated HDR, piggyBac-based marker excision, and φC31-mediated RMCE, into a unified, single-vector framework for endogenous protein tagging in Drosophila. The individual components have all been described and are in routine use in the field; the conceptual advance is therefore limited. Nevertheless, the integration of these features into a streamlined platform with accompanying automated design software represents a practical contribution that is likely to be of genuine utility to the Drosophila community, particularly for laboratories without specialist transgenesis infrastructure.

      The possibility of tag exchange by fly crossing is the most distinctive feature of the system. However, as discussed above, this is currently demonstrated at only five loci with a single replacement tag, which limits the conclusions that can be drawn about its generality. More broadly, ExTaSy employs well-proven strategies throughout, which is a source of reliability but also means that the study does not incorporate more recent developments in the field. For example, approaches based on single-strand annealing, such as the recently described Seed/Harvest system (Aguilar et al., 2024), can achieve entirely scarless marker removal and thus circumvent the TTAA scar left by piggyBac excision, a limitation the authors themselves acknowledge may reduce expression at modified N-terminal loci. Similarly, the current system is restricted to N- and C-terminal tagging. Given that the goal of endogenous tagging is to minimally perturb protein function, and given the now widespread availability of high-quality protein structure predictions for the Drosophila proteome, a modern tagging platform might be expected to use structural modelling to identify optimal insertion sites irrespective of their location. These are not oversights that diminish the practical value of the current work, but highlight that this study does not always operate at the cutting edge of method development in this area. A brief discussion of these more recent developments in the context of ExTaSy's design choices would usefully situate the work within the broader landscape and help readers understand both what the system offers today and where improvements are likely to come from.

      __Responses: __

      • As stated above, we are currently performing experiments to further validate the tag exchange.
      • Regarding the SEED/Harvest system, we have considered this; however, this would leave both flanking attP/attB sites at the genomic locus rather than only the site between the tag and the CDS. Both sites would have to be incorporated into the CDS or they would leave an even bigger scar. Additionally, since SEED/Harvest relies on micro-homology between two tag halves, it would require removal of the transgenesis marker before tagged lines become usable. Our system is advantageous in that C-terminally tagged lines can usually be used immediately. However, we will refer to the paper by Aguilar et al. and discuss how a similar system could be incorporated into ExTaSy.
      • Regarding structure-function predictions, these could be incorporated into the bioinformatic pipeline. It would then be possible to modify ExTaSy to introduce tags internally together with a SEED/Harvest-like modification. We will include this in the discussion.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      Hubbert et al. describe ExTaSy (Exchangeable Tagging System), a method for endogenous protein tagging in fruit flies. The technique attempts to address some limitations of current tagging strategies, such as non-physiological expression from transgenes, disruption of the target gene, and limited usefulness of a single tag type. The basic approach is not novel; rather, it effectively incorporates ideas from several previously published methods:

      • CRISPR-based release of the HDR donor from the backbone in vivo (Kanca et al., 2019 and 2021).
      • PBac scarless tagging (flycrisprdesign)
      • In vivo RMCE to swap out tags (Nagarkar-Jaiswal et al., 2015)

      Although not novel, the authors show the completeness and effectiveness of the approach. They were able to tag genes across multiple chromosomes, with knock-in rates comparable to other approaches, and demonstrate tag swapping through RMCE. Overall, this work introduces a versatile and modular platform that combines several previous innovations into a single effective package.

      Major comments

      1. The manuscript would benefit from a more upfront discussion of how ExTaSy relates to existing methods. As currently written, the manuscript implies a higher degree of novelty than is warranted, since ExTaSy combines several previously established approaches, as already noted. While this is valuable, the authors should more clearly acknowledge in the abstract and introduction that the primary advance is the unification and streamlining of these existing technologies into a single platform, rather than the introduction of fundamentally new components.

      __Response: __While we did cite most of the publications mentioned by the reviewer, we will make clearer that our system combines several previously established Drosophila systems and is not per se a novel invention. We will expand the introduction and discussion to reflect this and cite additional publications.

      2. Comparison to prior systems. The manuscript should include a direct comparison to existing tagging pipelines. For example: What practical steps are eliminated relative to prior approaches? Does ExTaSy reduce the number of injections or constructs required? How does the workflow differ in terms of time, cost, or technical expertise? This is vaguely addressed in the discussion, but more specific and clear comparisons would improve things for the reader who is trying to decide which method to use. For example, how does this strategy directly compare with the protein trap alleles described in Kanca et al., 2022? This could be done as a supplemental table.

      __Response: __A similar concern has been raised by reviewer #1 (comment #4). We will expand the introduction and the discussion to compare ExTaSy in more detail with other methods, highlighting advantages and disadvantages of each.

      3. Only 4 successful RMCE swaps are presented. This is too few to make a confident conclusion about the efficiency. The authors should do at least 4 more and include negative data.

      __Response: __A similar point has been made by reviewer #1 (comment #3). We are in the process of expanding the number of fly stocks for which tags have been exchanged and will be able to provide a more rigorous characterization with an updated version of the manuscript.

      4. Some discussion of the potential limitations of the linker from the residual att sites is needed.

      __Response: __We will include this in the discussion.

      Minor comments

      1. It would be helpful to include a workflow overview figure summarizing the full pipeline.

      __Response: __We will include such a figure in the supplement.

      2. Line 124: "Most genes we tagged at the C-terminus were homozygous viable, indicating limited detrimental effects." The numbers should be included here: what is "most genes"?

      __Response: __We will include these numbers in the text.

      3. Briefly explain how the tested genes were selected (e.g., random, representative, biased toward certain classes), as this could affect interpretation of generalizability. If most of the genes are essential for viability, this makes the viability of tagged lines more impressive.

      __Response: __This is an excellent suggestion, and we thank the reviewer for pointing this out. We have mainly tagged genes that are relevant for work in our labs and for collaborators, focusing almost entirely on transcription factor-encoding genes that are largely essential for normal development. We will include a brief discussion of this.

      Reviewer #2 (Significance (Required)):

      Significance

      1. General assessment: This study presents ExTaSy, a practical and well-executed platform for endogenous protein tagging in Drosophila. Its main strength is the integration of multiple existing technologies into a streamlined workflow that enables tagging, marker removal, and tag swapping. The system is clearly functional and broadly applicable. However, the conceptual novelty is limited, and the manuscript should more explicitly frame the work as an engineering advance. Tagging and RMCE efficiencies are moderate.

      2. Advance: ExTaSy represents a technical advance that combines CRISPR HDR tagging, piggyBac scarless editing, and RMCE into a single platform. The biggest improvement is the ability to tag once and flexibly swap tags via crosses, reducing the need for repeated genome engineering. This extends existing methods by improving experimental flexibility.

      3. Audience: This work will primarily interest a specialized audience in Drosophila genetics, CRISPR technologies, and functional genomics, with broader relevance to researchers developing tagging systems in other model organisms.

      4. Field of expertise: CRISPR screening, Drosophila genetics, functional genomics. No limitations on my ability to evaluate.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This methods paper is targeting the long-standing ambition of how to most efficiently tag proteins at the endogenous gene locus in Drosophila. Since the invention of CRISPR-Cas9 many genes have been successfully modified in Drosophila, but the community is still lacking a large collection of tagged proteins under endogenous control made with the same method.

      This manuscript uses a small tag, 3xHA, which supposedly is easier to integrate, and the design then allows the tag to be swapped for larger fluorescent tags solely by fly crossing. Then, the dsRed or white markers, which allow identification, can be removed with a piggyBac transposase, leaving only a small scar. However, attP/B/R scars do remain. Design and cloning appear straightforward. Overall, this is an interesting strategy.

      However, the manuscript falls short in really describing the resource, apart from the cloning design. A more rigorous analysis of a number of lines should be presented to better judge if the strategy practically works. It is quite disappointing to see that only 2 or 3 genes/proteins were analysed here in a bit more detail. This does not sound like a very straightforward resource that aims to go large scale.

      Major comments:

      1. The important novelty here is not only the design that allows high-throughput cloning but, more importantly, that the tagged lines are actually correct and functional. To present this better, I suggest rearranging Figure 1 to show the flow: 65 constructs cloned, 41 "successfully" inserted. For how many was the dsRed marker removed, and for how many was expression or function tested? The reader would then know the current state of the resource. These numbers would be informative to have in the abstract, too.

      __Response: __We will include these numbers in the abstract. Reviewer 2 asked for an overview figure of the workflow, which we will include as a supplementary figure, where we can also include numbers as suggested by this reviewer.

      2. The 41 tagged gene insertions need at least some basic characterisation to verify that they are at the correct place or make a functional protein. Which genes were chosen? I do not see 41 genes tagged in the table provided. I suppose the N-terminal tags should initially be loss of function. Are the N-term lines lethal when inserted in an essential gene? Again, this could be shown in an overview instead of by a non-quantitative statement in the text.

      __Response: __We have verified the insertion site of the lines with genotyping PCR. We will include a table to show in more detail which genes were tagged at which terminus, and which protein isoforms are captured by the respective tag.

      3. How many of the 41 tagged proteins are functional? The authors only provide information on Ubx-3xHA (functional) and Mef2-3xHA (non-functional), which I find weak.

      __Response: __We will include this information in the table mentioned in the above comment.

      4. Stainings are only shown for 2 proteins, Ubx-GFP and Exd-3xHA. How about the others?

      __Response: __We are currently in the process of using ExTaSy to establish a library of tagged fly lines, which we intend to characterize in more detail and publish separately. For the current manuscript, we prefer to focus on the methodology of the tagging system itself.

      5. I am not sure how to calculate the transgenesis rates, but strictly speaking the ones that did not result in an insertion should also be counted for the statistics, I guess.

      __Response: __There is indeed no commonly agreed upon way to calculate these rates, and it is done differently in different publications. We felt that metrics that discriminate between the overall success rate (i.e., all those injections that lead to transgenics) and the success rate within successful injections would be most useful. We will try to make clear in the text where we refer to all attempts and where we exclusively refer to the successful ones.

      Minor comments:

      1. The introduction states that ExTaSy would tag all isoforms of genes. However, I find this an overstatement, as for complex genes tagging at one place cannot always label all isoforms; see the Hth line generated here (Iso E).

      __Response: __This was indeed badly phrased and we will correct the wording also in response to reviewer #1 comment #14 to reflect that overexpression constructs are limited to a specific isoform, whereas ExTaSy enables simultaneous tagging of all isoforms that share a terminus.

      2. Why does it matter on which chromosome the target gene is? This can be moved to supplement. I would rather like to know what the genes are.

      __Response: __We presume that the reviewer refers to Figure 1, where we show the success rates for individual chromosomes. We felt that the lower success rate for injections targeting genes on chr3 (which is, as we describe, due to lower survival of the injection line) warranted this separation by chromosome. As stated above, we will include a list of tagged genes as a table.

      **Referees cross-commenting**

      I agree with the two other reviewers' points, in particular that the knock-in lines need better verification. This was also my major point.

      __Response: __As also stated for reviewer #1 comment #1, we have now begun to run whole-fragment PCRs for all loci to investigate this further and will report the results in a revised version of the manuscript.

      Reviewer #3 (Significance (Required)):

      The methodology presented here is per se not really new. The 3xP3-dsRed eye marker is standard, its removal by piggyBac transposase has been done before, and RMCE to change the tagging cassettes with attP/B has been done for many years. The latter has the disadvantage of not being seamless: one attR site remains and is translated, while the other attR site remains in the 5'- or 3'-UTR, which can have an effect. U6-driven sgRNA expression is also standard.

      __Response: __We will make clearer that our system combines several previously established Drosophila systems and is not per se a novel invention. We will expand the introduction and discussion to reflect this and cite additional publications.

      The design includes the sgRNA and the HDR template cassette in a single vector, which is smart and makes cloning straightforward. Again, the paper would be stronger if all cloned constructs were listed (are the 65 all that were cloned or all that were injected?).

      __Response: __We will include this as a table.

      As the authors do not rigorously test the function of the tagged genes, it is hard to judge how valuable the pipeline is. This can be easily solved by providing more data that support the easy, high-throughput exchange tagging pipeline that produces tagged Drosophila lines that are useful to the community.

      __Response: __As stated above, we plan to publish a more detailed analysis of tagged lines as a separate resource paper. We will state in the manuscript which lines were homozygous viable before and after marker removal, which gives at least an indication of whether the tagged protein is functional.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This methods paper is targeting the long-standing ambition of how to most efficiently tag proteins at the endogenous gene locus in Drosophila. Since the invention of CRISPR-Cas9 many genes have been successfully modified in Drosophila, but the community is still lacking a large collection of tagged proteins under endogenous control made with the same method. This manuscript uses a small tag, 3xHA, which supposedly is easier to integrate, and the design then allows the tag to be swapped for larger fluorescent tags solely by fly crossing. Then, the dsRed or white markers, which allow identification, can be removed with a piggyBac transposase, leaving only a small scar. However, attP/B/R scars do remain. Design and cloning appear straightforward. Overall, this is an interesting strategy. However, the manuscript falls short in really describing the resource, apart from the cloning design. A more rigorous analysis of a number of lines should be presented to better judge if the strategy practically works. It is quite disappointing to see that only 2 or 3 genes/proteins were analysed here in a bit more detail. This does not sound like a very straightforward resource that aims to go large scale.

      Major comments:

      1. The important novelty here is not only the design that allows high-throughput cloning but more importantly that the tagged lines are actually correct and functional. To present this better, I suggest to rearrange Figure 1 to show the flow: 65 constructs cloned, 41 "successfully" inserted. Of how many the dsRed marker was removed, of how many expression or function was tested? Hence the reader knows about the current state of the resource. These numbers would be informative to have in the abstract, too.
      2. The 41 tagged gene insertions need at least some basic characterisation to verify that they are at the correct place or make a functional protein. Which genes were chosen? I do not see 41 genes tagged in the table provided. I suppose the N-terminal tags should initially be loss of function. Are the N-term lines lethal when inserted in an essential gene? Again, this could be shown in an overview, instead of by a non-quantitative statement in the text.
      3. How many of the 41 tagged proteins are functional? The authors only provide information on Ubx-3xHA (functional) and Mef2-3xHA (non-functional), which I find weak.
      4. Stainings are only shown for 2 proteins, Ubx-GFP and Exd-3xHA. How about the others?
      5. I am not sure about how to calculate the transgenesis rates, but strictly speaking the ones that did not result in an insertion should also be counted for the statistics, I guess.
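
One way to make the denominators raised in comment 5 explicit is sketched below in Python. It uses the counts quoted in comment 1 (65 constructs cloned, 41 inserted) and assumes that all 65 cloned constructs were actually injected, which the text does not confirm. The `validated` count is a hypothetical placeholder for illustration, not data from the manuscript.

```python
def rate(successes: int, attempts: int) -> float:
    """Return successes / attempts as a percentage."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * successes / attempts


constructs_cloned = 65  # from comment 1; ASSUMED to equal the number injected
lines_inserted = 41     # from comment 1 (insertions scored as successful)
validated = 30          # HYPOTHETICAL: lines confirmed as correct, single-copy integrants

# Rate counting every attempted construct, including those without an insertion.
overall_rate = rate(lines_inserted, constructs_cloned)

# Rate a user would care about if only validated lines are counted as successes.
validated_rate = rate(validated, constructs_cloned)

print(f"inserted / attempted:  {overall_rate:.1f}%")
print(f"validated / attempted: {validated_rate:.1f}% (hypothetical numerator)")
```

Reporting both figures side by side would show how much of the headline rate depends on the definition of a successful insertion.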

      Minor comments:

      1. The introduction states that ExTaSy would tag all isoforms of genes. However, I find this an overstatement, as for complex genes tagging at one place cannot always label all isoforms; see the Hth line generated here (Iso E).
      2. Why does it matter on which chromosome the target gene is? This can be moved to supplement. I would rather like to know what the genes are.

      Referees cross-commenting

      I agree with the 2 other reviewers' points, in particular that the knock-in lines need better verification. This was also my major point.

      Significance

      The methodology presented here is per se not really new. The 3xP3-dsRed eye marker is standard, its removal by piggyBac transposase has been done before, and RMCE to change the tagging cassettes with attP/B has been done for many years. The latter has the disadvantage of not being seamless, as one attR site remains, which is translated; the other attR site remains in the 5'- or 3'-UTR, which can have an effect. U6-driven sgRNA expression is also standard. The design includes the sgRNA and the HDR template cassette in a single vector, which is smart and makes cloning straightforward. Again, the paper would be stronger if all cloned constructs were listed (are the 65 all that were cloned or all that were injected?).

      As the authors do not rigorously test the function of the tagged genes, it is hard to judge how valuable the pipeline is. This can be easily solved by providing more data that support the easy, high-throughput exchange tagging pipeline that produces tagged Drosophila lines that are useful to the community.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Hubbert et al. describe ExTaSy (Exchangeable Tagging System), a method for endogenous protein tagging in fruit flies. The technique attempts to address some limitations of current tagging strategies, such as non-physiological expression from transgenes, disruption of the target gene, and limited usefulness of a single tag type. The basic approach is not novel; rather, it effectively incorporates ideas from several previously published methods:

      • CRISPR-based release of the HDR donor from the backbone in vivo (Kanca et al., 2019 and 2021).
      • PBac scarless tagging (flycrisprdesign)
      • In vivo RMCE to swap out tags (Nagarkar-Jaiswal et al., 2015)

      Although the approach is not novel, the authors show its completeness and effectiveness. They were able to tag genes across multiple chromosomes, with knock-in rates comparable to other approaches, and demonstrate tag swapping through RMCE. Overall, this work introduces a versatile and modular platform that combines several previous innovations into a single effective package.

      Major comments

      1. The manuscript would benefit from a more upfront discussion of how ExTaSy relates to existing methods. As currently written, the text implies a higher degree of novelty than is warranted, since ExTaSy combines several previously established approaches, as already noted. While this is valuable, the authors should more clearly acknowledge in the abstract and introduction that the primary advance is the unification and streamlining of these existing technologies into a single platform, rather than the introduction of fundamentally new components.
      2. Comparison to prior systems. The manuscript should include a direct comparison to existing tagging pipelines. For example: What practical steps are eliminated relative to prior approaches? Does ExTaSy reduce the number of injections or constructs required? How does the workflow differ in terms of time, cost, or technical expertise? This is vaguely addressed in the discussion, but more specific and clear comparisons would improve things for the reader who is trying to decide which method to use. For example, how does this strategy directly compare with the protein trap alleles described in Kanca et al., 2022? This could be done as a supplemental table.
      3. Only 4 successful RMCE swaps are presented. This is too few to make a confident conclusion about the efficiency. The authors should do at least 4 more and include negative data.
      4. Some discussion of the potential limitations of the linker from the residual att sites is needed.

      Minor comments

      1. It would be helpful to include a workflow overview figure summarizing the full pipeline.
      2. Line 124: Most genes we tagged at the C-terminus were homozygous viable, indicating limited detrimental effects. Need to include the numbers? What is "most genes"?
      3. Briefly explain how the tested genes were selected (e.g., random, representative, biased toward certain classes), as this could affect interpretation of generalizability. If most of the genes are essential for viability, this makes the viability of tagged lines more impressive.

      Significance

      1. General assessment: This study presents ExTaSy, a practical and well-executed platform for endogenous protein tagging in Drosophila. Its main strength is the integration of multiple existing technologies into a streamlined workflow that enables tagging, marker removal, and tag swapping. The system is clearly functional and broadly applicable. However, the conceptual novelty is limited, and the manuscript should more explicitly frame the work as an engineering advance. Tagging and RMCE efficiencies are moderate.
      2. Advance: ExTaSy represents a technical advance that combines CRISPR HDR tagging, piggyBac scarless editing, and RMCE into a single platform. The biggest improvement is the ability to tag once and flexibly swap tags via crosses, reducing the need for repeated genome engineering. This extends existing methods by improving experimental flexibility.
      3. Audience: This work will primarily interest a specialized audience in Drosophila genetics, CRISPR technologies, and functional genomics, with broader relevance to researchers developing tagging systems in other model organisms.
      4. Field of expertise: CRISPR screening, Drosophila genetics, functional genomics. No limitations on my ability to evaluate.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Hubbert and colleagues describe ExTaSy, a CRISPR-Cas9-based platform for the endogenous tagging of proteins in Drosophila melanogaster. The system combines several established molecular tools into a single-vector framework: homology-directed repair (HDR) for the insertion of a 3XHA tag at the endogenous locus, piggyBac transposase-mediated near-scarless removal of a transgenic selection marker, and φC31 integrase-mediated recombination-mediated cassette exchange (RMCE) for subsequent tag swapping. The authors demonstrate the system across a set of 65 genomic loci and provide a bioinformatic pipeline to automate guide RNA and homology arm design.

      Major Comments

      1. Validation of knock-in lines is inadequate and does not reflect current standards in the field.

      The authors state that correct insertions were confirmed using "two PCRs per inserted fragment done with primers binding to the 5' and 3' ends of the inserted DNA and corresponding gene-specific validation primers." This strategy is well known to produce false positives, as it cannot distinguish correctly targeted single-copy integrants from concatemeric insertions at the target locus (e.g. Skryabin et al., 2020). The current standard for validating CRISPR-mediated knock-ins requires PCR amplification using primers that anneal outside the homology arms and span the entire inserted cassette. These reactions must be performed under conditions that minimise the formation of PCR chimeras, specifically low cycle numbers and use of a high-processivity polymerase. The authors should either provide data from such experiments for their characterised lines, or clearly acknowledge this limitation and qualify their efficiency estimates accordingly (see related point 2 below).

      2. Reported efficiency metrics do not adequately distinguish correctly targeted integrants from marker-positive flies.

      A related concern is that many of the efficiency parameters reported in the manuscript appear to be based solely on the detection of the marker cassette. The 63.1% overall success rate, for example, seemingly reflects the recovery of DsRed-positive flies rather than of sequence-validated, single-copy, on-target integrants. These are fundamentally different quantities, with only the latter being of practical value for the users of the described technique. The authors should either provide data that properly accounts for correct integration, or more carefully define what each reported metric represents and explicitly acknowledge the limitations of using marker presence as a proxy for successful knock-in.

      3. The characterisation of tag exchange requires expansion or more careful framing of its scope.

      The possibility of exchanging tags through fly crosses rather than repeated microinjections is, in the view of this reviewer, the most practically useful feature of ExTaSy and the aspect most likely to drive community adoption. It is therefore important that this feature is characterised with sufficient rigour to allow prospective users to assess its reliability. In the current manuscript, tag exchange has been demonstrated at only five loci using a single replacement tag (sfGFP). The dataset includes one outright failure (the Met C-terminus) and one instance of an unexpected 9 bp insertion at the recombination site, leaving the success rates and failure modes across a broader range of loci and tags uncharacterised. The authors should either expand the tag exchange experiments to cover a more representative set of conditions, or frame the current data explicitly as a proof of concept and limit their conclusions about the practical utility of tag exchange accordingly. In either case, the value of this work to the community would be substantially increased if a collection of donor lines carrying the most commonly used tags for different applications, as the authors themselves enumerate in the Discussion, were generated and deposited at a public stock centre such as the VDRC concurrent with publication. On this note, it is also worth flagging that at present the plasmids described in this study have not yet been deposited at Addgene or the European Plasmid Repository, and that fly lines are available only on request. For a methods paper aimed at community adoption, deposition of reagents in publicly accessible repositories at the time of publication is the expected standard.

      4. The Introduction should better reflect the current state of the field, including explicit comparison with MiMIC and CRIMIC.

      The introduction would benefit from a clearer distinction between transgene-based approaches that introduce additional gene copies and true CRISPR-mediated knock-ins at the endogenous locus. As it stands, the discussion of prior methods does not sufficiently acknowledge that CRISPR-based knock-in is already the standard approach in Drosophila, and that the individual techniques employed in ExTaSy are well established. Notably, the MiMIC and CRIMIC systems (Nagarkar-Jaiswal et al., 2015; Li-Kroeger et al., 2018), which also support RMCE-based tag exchange at endogenous loci and for which large collections of lines are already publicly available, are not adequately discussed. These are arguably the closest comparators to ExTaSy, and the authors should explicitly address how their approach differs from and offers advantages over this existing framework, particularly given that MiMIC/CRIMIC insertions can also tag internal sites and thus avoid some of the terminus-specific complications described here.

      Minor Comment

      1. The labelling of sgRNA target sites in Figure 1 is inaccurate and should be corrected.

      In Figure 1, the sgRNA target sites are annotated with triangles labelled "PAM synth." The presence of a PAM is necessary but not sufficient to define a target site; the label should therefore be changed to "target site" or an equivalent term. Additionally, the Methods section incorrectly expands PAM as "primary adjacent motif"; the correct expansion is "protospacer adjacent motif."

      6. Could the fly crossing scheme in Figure S3 be simplified?

      In the scheme in Fig. S3 the second step seems to be intended to introduce the hs-Flp and vasa-Int transgenes. Would it not be possible to already incorporate the Integrase into the swap fly line when it is made and the hs-Flp into the ExTaSy line, thereby saving one generation?

      7. Figure 1F has no call out in the main text.
      8. Line 155: What was the reason for the low survival rate? Is this likely to be indicative of a problem during marker removal, or a stochastic event as not all fly crosses are always productive (bad food, early death of flies, etc.)?
      9. Line 160: What is the N number of "all cases"?
      10. Scale bars are missing in Fig. 3g,h.
      11. Line 219: The labeling of the panels got mixed up. Panel F does not show an immunostaining.
      12. Line 226 and Fig. 3h: It is unclear what area is shown in the inlay. The overview image highlights three POIs, but none seem to fit the inlay.
      13. Line 233: Why was the transgenic marker not removed? The authors want to highlight the ease and advantage of marker removal, so leaving in the marker is an odd choice.
      14. Line 250: Why was only one isoform of hth tagged? Without a rationale this seems to be an odd choice, in particular since the authors seem to suggest in the introduction (Line 38) that a disadvantage of previous technologies is the tagging of only selected isoforms.


      Significance

      ExTaSy assembles a set of well-established tools, namely CRISPR-mediated HDR, piggyBac-based marker excision, and φC31-mediated RMCE, into a unified, single-vector framework for endogenous protein tagging in Drosophila. The individual components have all been described and are in routine use in the field; the conceptual advance is therefore limited. Nevertheless, the integration of these features into a streamlined platform with accompanying automated design software represents a practical contribution that is likely to be of genuine utility to the Drosophila community, particularly for laboratories without specialist transgenesis infrastructure.

      The possibility of tag exchange by fly crossing is the most distinctive feature of the system. However, as discussed above, this is currently demonstrated at only five loci with a single replacement tag, which limits the conclusions that can be drawn about its generality. More broadly, ExTaSy employs well-proven strategies throughout, which is a source of reliability but also means that the study does not incorporate more recent developments in the field. For example, approaches based on single-strand annealing, such as the recently described Seed/Harvest system (Aguilar et al., 2024), can achieve entirely scarless marker removal and thus circumvent the TTAA scar left by piggyBac excision, a limitation the authors themselves acknowledge may reduce expression at modified N-terminal loci. Similarly, the current system is restricted to N- and C-terminal tagging. Given that the goal of endogenous tagging is to minimally perturb protein function, and given the now widespread availability of high-quality protein structure predictions for the Drosophila proteome, a modern tagging platform might be expected to use structural modelling to identify optimal insertion sites irrespective of their location. These are not oversights that diminish the practical value of the current work, but highlight that this study does not always operate at the cutting edge of method development in this area. A brief discussion of these more recent developments in the context of ExTaSy's design choices would usefully situate the work within the broader landscape and help readers understand both what the system offers today and where improvements are likely to come from.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This paper describes the localisation of DNA repair proteins, which carry out their DNA repair function in the nucleus, to the cytoplasmic Golgi apparatus. Using the Human Protein Atlas to identify candidates, the authors use antibody localisation to show that a significant number of DNA repair proteins also localise at the Golgi. It appears that proteins involved in common DNA repair pathways localise to common regions of the Golgi. The Golgi-nucleus distribution of the DNA repair proteins changes upon DNA damage, indicating a dynamic relationship. The authors focus on the DNA repair protein RAD51C and show that its loss from the Golgi and translocation to the nucleus upon DNA damage is mediated by the ATM kinase. Anchoring at the Golgi is shown to be mediated by the golgin giantin. A functional role for giantin in DNA repair is shown in knockdown studies, supporting a mechanism whereby Golgi anchoring of RAD51C, and possibly other DNA repair proteins, by giantin, is required to maintain proper control of DNA repair. The data are clear and support the authors' conclusions. The data are carefully quantified throughout. I found the text easy to read.

      *Major points:*

      *1.) To validate the Golgi localisation, KD using siRNA was used. It was deemed that a signal reduction of 25% was enough to indicate specific antibody labelling. This seems like a low number, and not very stringent. For some of the hits, expressing tagged versions of the proteins would greatly strengthen the Golgi assignment. This may not be possible for all, but for RAD51C would seem an important experiment.*

      Response: We thank the reviewer for raising the important issue of antibody validation stringency. We agree that for a single-candidate study, a larger reduction after knockdown would generally be preferable. In our case, the 25% cutoff was used only in the primary high-content screening step as part of an intentionally inclusive two-stage workflow, for the following reasons:

      First, because this dataset is generated in a screening format across hundreds of targets, knockdown efficiency, protein turnover, and the relative size of the Golgi-associated pool are unknown and highly variable between genes. For many proteins the Golgi pool represents a small fraction of total cellular signal, and a modest change in total abundance can translate into a smaller absolute change in the Golgi ROI after segmentation, background subtraction, and imaging noise. We therefore selected a permissive cutoff to reduce false negatives and ensure we did not systematically miss candidates with slower turnover, partial knockdown, or small Golgi pools. This strategy is consistent with large-scale subcellular mapping efforts, including the Human Protein Atlas, where genetic depletion by siRNA is used as a key validation pillar for immunofluorescence localization and is combined with additional validation strategies when deeper confidence is required (Stadler et al, 2012). Furthermore, it is important to note that this validation was performed in a high-content screening format in which fixation, permeabilisation, antibody concentration, and blocking conditions were kept uniform across all candidates rather than optimised for each individual antibody. In standard single-target immunofluorescence experiments, these parameters would be titrated to maximise signal-to-noise for the specific antibody and antigen in question. Under non-optimised screening conditions, the absolute magnitude of signal change upon knockdown is inherently attenuated compared to what would be expected from a purpose-optimised assay. We therefore consider a 25% reduction threshold under these uniform, non-optimised screening conditions to be a meaningful and appropriately calibrated criterion.

      Second, we wish to clarify that the primary intent of our screen was not to validate the Golgi-nuclear localisation of any single protein in isolation, but rather to identify whether entire functional pathways are represented at the two organelles. This is precisely why the bioinformatic network analysis was performed as an integral part of the workflow, and not as an afterthought. The finding that the validated hit list is significantly enriched for coherent functional clusters, most notably a network spanning multiple core DNA repair pathways (HR, MMR, BER, MMEJ) serves as an in silico validation of the dataset as a whole. The emergence of pathway-level organisation, with proteins from the same repair pathways co-associating, localising to the same Golgi sub-compartments, and redistributing in the same direction upon genotoxic stimuli, provides biological coherence that goes beyond what individual antibody validation can offer, and substantially reduces the likelihood that the Golgi signal represents a collection of unrelated false positives.

      Third, our mechanistic conclusions do not rely on the 25% screening threshold. For RAD51C, we used multiple orthogonal validation approaches, including independent antibodies recognizing distinct RAD51C epitopes and genetic depletion, supported by biochemical evidence.

      In response to this comment, we have provided the full screening validation dataset as source data (Supplementary Table S1), including intensity changes for the candidates, so that readers can inspect the distributions and apply their own thresholds. We have also clarified in the Results section the rationale behind our screening strategy (lines 128-139) and the role of the bioinformatic network analysis as an integral validation step (lines 141-156).
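
As an illustration of how a reader might apply an alternative threshold to such source data, a minimal Python sketch follows. The candidate names and intensity values are hypothetical placeholders, not entries from Supplementary Table S1, and the function names are ours for illustration only.

```python
def percent_reduction(control: float, knockdown: float) -> float:
    """Percentage drop in Golgi ROI intensity after knockdown, relative to control."""
    if control <= 0:
        raise ValueError("control intensity must be positive")
    return 100.0 * (control - knockdown) / control


def passing(candidates: dict, threshold: float = 25.0) -> list:
    """Names of candidates whose signal reduction meets or exceeds the threshold."""
    return [
        name
        for name, (control, knockdown) in candidates.items()
        if percent_reduction(control, knockdown) >= threshold
    ]


# HYPOTHETICAL example data: name -> (control intensity, knockdown intensity)
example = {
    "CANDIDATE_A": (1000.0, 700.0),  # 30% reduction
    "CANDIDATE_B": (1000.0, 800.0),  # 20% reduction
    "CANDIDATE_C": (500.0, 250.0),   # 50% reduction
}

hits_permissive = passing(example)       # default 25% screening cutoff
hits_stringent = passing(example, 40.0)  # a stricter, reader-chosen cutoff

print(hits_permissive)
print(hits_stringent)
```

Raising the cutoff trades false positives for false negatives, which is the balance described in the rationale above.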

      Turning to the specific suggestion of tagged RAD51C, we fully agree that tagged proteins can provide valuable orthogonal validation. We attempted endogenous tagging using CRISPR-mediated homologous recombination but were unable to obtain viable colonies following editing, consistent with the essential role of RAD51C in homologous recombination. We also attempted ectopic expression of tagged RAD51C but were unable to obtain constructs that preserved physiological expression levels, maintained robust cell viability or produced interpretable localization. This difficulty is not unique to our laboratory: colleagues working on RAD51 paralog complexes have reported that tagging or overexpression of RAD51C perturbs both its localisation and its ability to form functional paralog complexes, and Greenhough et al, 2023, Rawal et al, 2023, Somyajit et al, 2015 and Berti et al, 2020 all use purified complexes or untagged proteins for functional assays. We discussed these challenges extensively with experts in the DNA damage repair field at several international meetings (EMBO Sounio, Keystone Symposia, German DNA Repair Society). For these reasons, we relied on orthogonal approaches that do not require tagging (genetic depletion plus independent antibodies, and biochemical fractionation) to support the Golgi localization claim. We agree with the reviewer that this represents a limitation of this study, and we addressed these concerns in the discussion of our revised manuscript (lines 630-641).

      *2.) The total signal should be quantified for each DNA repair protein upon genotoxic stress, in addition to the Golgi to nucleus ratio. For many of the proteins it looks like the total signal goes down, which could influence interpretation. *

      __Response: __We thank the reviewer for this important point. We wish to clarify that our imaging pipeline uses marker-based segmentation throughout: the Golgi compartment is segmented using GM130 and the nucleus using Hoechst, as unsegmented whole-cell masks without organelle markers yield unreliable intensity measurements in this experimental setup. True total cellular signal is therefore not directly accessible in this dataset. In the revised manuscript we provide the absolute fluorescence intensities for both the Golgi and nuclear compartments separately. In addition, we now include total (Golgi + nuclear) intensity measurements for each protein (Supplementary Figures 3D, 4D, and 5E) as the most reliable proxy for overall protein distribution. These data are presented alongside the redistribution ratio to enable comprehensive interpretation.

      As the reviewer correctly notes, a subset of proteins shows a reduction in total signal after treatment, particularly with doxorubicin. This is consistent with known effects of doxorubicin-induced DNA damage on cellular proteostasis, including widespread ubiquitination and suppression of protein translation (Halim et al, 2018). Several DDR regulators are subject to ubiquitin-dependent turnover following genotoxic stress, such as CHK1 (Zhang et al, 2005). More broadly, ubiquitin and proteasome mediated regulation is an integral component of the DNA damage response and can affect the abundance and detectability of DDR factors (Brinkmann et al, 2015). Changes in abundance are therefore an expected biological feature of the response. For this reason, we used the Golgi-to-nucleus ratio as the primary redistribution readout, as it captures relative compartmental partitioning independently of changes in total protein levels.
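
The arithmetic behind this choice can be sketched in a few lines of Python. The intensity values are hypothetical and chosen only to show that a uniform loss of signal in both compartments changes the combined (Golgi + nuclear) intensity but leaves the ratio unchanged; they are not measurements from this study.

```python
def golgi_to_nucleus_ratio(golgi: float, nucleus: float) -> float:
    """Relative partitioning between the two segmented compartments."""
    if nucleus <= 0:
        raise ValueError("nuclear intensity must be positive")
    return golgi / nucleus


def combined_intensity(golgi: float, nucleus: float) -> float:
    """Golgi + nuclear signal, used as a proxy for total protein signal."""
    return golgi + nucleus


# HYPOTHETICAL intensities (arbitrary units): (Golgi, nucleus)
untreated = (200.0, 400.0)
treated_uniform_loss = (100.0, 200.0)   # both compartments halved, no redistribution
treated_redistributed = (100.0, 500.0)  # signal shifts from Golgi to nucleus

for label, (golgi, nucleus) in [
    ("untreated", untreated),
    ("uniform loss", treated_uniform_loss),
    ("redistributed", treated_redistributed),
]:
    ratio = golgi_to_nucleus_ratio(golgi, nucleus)
    total = combined_intensity(golgi, nucleus)
    print(f"{label}: ratio={ratio:.2f}, combined={total:.0f}")
```

In this toy example only the redistributed case alters the ratio, whereas the uniform loss is visible only in the combined intensity, which is why both readouts are reported together.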

      *3.) The study would benefit from live imaging of the Golgi to nucleus translocation of RAD51C. This would give a better indication of dynamics. *

      __Response: __We agree that live imaging would directly visualize the dynamics of RAD51C redistribution between the Golgi and the nucleus. This was indeed one of our initial goals following the identification of the Golgi-associated RAD51C pool. However, as described above in our response to Major Comment 1, live imaging requires a fluorescently tagged RAD51C construct, and all tagging strategies we attempted, both endogenous CRISPR-mediated tagging and ectopic expression, failed to yield cell lines with robust signal while preserving physiological behaviour. This appears to be a broader challenge for highly conserved and functionally constrained DNA repair proteins, and is not unique to our laboratory.

      Given these constraints, we focused on tag-independent approaches: multiple independent RAD51C antibodies combined with genetic depletion controls, quantitative fixed-cell time courses, and biochemical fractionation. These orthogonal datasets together support compartment-specific changes over time in a manner consistent with redistribution. We have clarified this limitation explicitly in the manuscript and avoided any wording that could be interpreted as implying direct single-molecule tracking in live cells. We present this as an important avenue for future work, contingent on the development of viable RAD51C-expressing cell lines (lines 630-641).

      *4.) The double depletion experiments suggest a functional relationship between giantin and RAD51C. But they do not formally show it. Experiments to more directly address the functional role of the interaction between these two proteins would strengthen the study. *

      Response: We agree with the reviewer that double depletion alone cannot formally prove that the physical Giantin-RAD51C interaction is the sole determinant of the observed DDR phenotypes. However, we would like to highlight the breadth of evidence we have assembled in support of this functional relationship:

      • Physical interaction between endogenous Giantin and RAD51C demonstrated by colocalisation (Figure 4F-G) and co-immunoprecipitation (Figure 4H-I).
      • Damage-induced dissociation of the Giantin-RAD51C complex that is prevented by ATM inhibition or Importazole treatment, directly linking the interaction to the DDR signalling axis (Figure 3K-P)
      • Premature nuclear accumulation of RAD51C upon Giantin depletion, producing aberrant nuclear foci lacking canonical HR markers and impaired ATM signalling (Figure 4B-E & J-M)
      • DR-GFP reporter assay confirming that Giantin depletion reduces HR efficiency to approximately 60% of control, consistent with the reduction previously reported in the genome-wide HR screen (Adamson et al. 2012) and validating the functional significance of Giantin in HR (Figure 5L).
      • Partial rescue of ATM phosphorylation, genomic instability and proliferation phenotypes by RAD51C co-depletion, arguing for RAD51C as a functionally relevant conduit of the Giantin-dependent phenotype (Figures 5M-5P).

      These observations are further supported by the established literature on RAD51C function, its roles in CHK2 phosphorylation, replication fork stabilisation, and RAD51 filament formation (Badie et al, 2009; Somyajit et al, 2015; Prakash et al, 2022), providing a mechanistically coherent framework in which mislocalisation of RAD51C, whether directly or indirectly through Giantin, leads to dysregulation of DDR signalling and repair capacity, as we directly demonstrate with the HR efficiency assay.

      Nonetheless, we fully agree that the most direct proof of the functional relevance of the physical Giantin-RAD51C interaction would come from separation-of-function experiments, ideally using an interaction-deficient Giantin mutant or an RAD51C variant unable to bind Giantin. We wish to be transparent that both approaches face substantial technical barriers in this system. RAD51C tagging consistently compromised cell viability and protein function, precluding the generation of interaction-deficient variants at physiological expression levels. Engineering an interaction-deficient Giantin mutant presents an independent challenge: Giantin is one of the largest Golgi matrix proteins (~376 kDa), composed almost entirely of extended coiled-coil domains that are resistant to structural prediction, and identifying a discrete RAD51C interaction interface without disrupting broader scaffolding function would require a dedicated structural and biochemical programme. We have framed these explicitly as the most important future priorities in the Discussion (lines 555-564), rather than over-interpreting the current data.

      *5.) The Kaplan-Meier plots in Fig S9 seems to be quite selective in that only breast cancer is shown. Does giantin reduction correlate with poor prognosis in other cancers? *

      Response: We thank the reviewer for this suggestion. We initially focused on breast cancer because RAD51C is a clinically established hereditary breast and ovarian cancer susceptibility gene (Meindl et al, 2010; Ghannoum et al, 2023), providing direct clinical context for a study centred on RAD51C dynamics and genome stability. We agree, however, that restricting the survival analysis to a single cancer type can appear selective.

      To address this directly, we expanded the in silico survival analysis of Giantin (GOLGB1) using GEPIA2 (Tang et al, 2019) across all available TCGA cohorts (overall survival, median cutoff, FDR correction). In the pooled pan-cancer analysis, higher GOLGB1 expression is significantly associated with improved overall survival (hazard ratio for the high-expression group, HR(high) = 0.75, p = 6.6 × 10⁻¹⁵). When stratified by tumour type, the majority of individual associations do not reach statistical significance. The two most robust statistically significant associations are kidney renal clear cell carcinoma (KIRC; HR(high) = 0.57, p = 3.4 × 10⁻⁴), where high GOLGB1 expression is associated with improved survival, and lower-grade glioma (LGG; HR(high) = 1.5, p = 0.036), where the association is in the opposite direction. A significant association is also observed in thymoma (THYM; HR(high) = 7.3, p = 0.031), though this should be interpreted with caution given the small cohort size (n = 59). Notably, the breast cancer association observed in the KM Plotter analysis (HR = 0.71, p = 1.8 × 10⁻¹¹; n = 4,929) does not reach significance in the TCGA BRCA cohort (HR = 1.1, p = 0.68; n = 1,070), most likely reflecting the substantially smaller sample size of the TCGA cohort, which is approximately 4.6-fold smaller and therefore underpowered to detect a modest effect. These context-dependent associations are consistent with the tumour-type-specific roles of Golgi scaffolding proteins and are discussed accordingly in the revised manuscript.
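      The two analysis settings named above (median cutoff, FDR correction) can be sketched in a few lines. This is a minimal illustration, not the GEPIA2 implementation: it assumes the Benjamini-Hochberg procedure as the FDR method (the response does not say which one was used), the function names `median_split` and `benjamini_hochberg` are hypothetical, and only the four per-cohort p-values quoted above are used, whereas a real analysis would adjust across every TCGA cohort. The adjusted values it produces are therefore not results from the manuscript.

```python
from statistics import median


def median_split(expression):
    """Label each sample 'high' or 'low' relative to the cohort median."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    ASSUMPTION: BH is used here only as a common FDR procedure; the
    response does not state which correction was applied.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Per-cohort p-values quoted in the response (illustrative subset only).
cohort_p = {"KIRC": 3.4e-4, "LGG": 0.036, "THYM": 0.031, "BRCA": 0.68}
adjusted = dict(zip(cohort_p, benjamini_hochberg(list(cohort_p.values()))))

# Hypothetical expression values, split at the median as in the text.
groups = median_split([1.0, 2.0, 3.0, 4.0])
```

      Because the correction scales each p-value by the number of tests, borderline associations such as LGG and THYM are the ones most sensitive to how many cohorts enter the analysis.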

      In the revised manuscript we have retained the original breast cancer Kaplan-Meier plots and supplemented them with a pan-cancer survival map across all TCGA cohorts (lines 611-625; Figure S9G) and a summary table (Supplementary Table 3) reporting hazard ratios, sample sizes, and p-values for each tumour type, allowing readers to assess the clinical relevance of GOLGB1 expression.

      *Minor points: There are a few grammatical errors here and there. The figures do not appear in the correct order in the text, which makes the early parts of the paper a bit difficult to follow. Some of the figures don't seem to clearly match the text. For example, it is mentioned that RAD51C labelling was done with 3 different antibodies. I could not find this data. *

      Response: We thank the reviewer for these helpful observations. In the revised manuscript we have (i) carefully proofread the text and corrected grammatical errors throughout; (ii) revised the Results section to ensure that figures and supplementary figures are cited in sequential order and that each panel is explicitly introduced before being discussed, improving readability in the early sections; and (iii) corrected figure callouts to ensure they match the text. In particular, the statement that RAD51C labeling was performed with three different antibodies has been linked to the corresponding figure panels in the Results section. Antibody identifiers, sources, and dilutions are clearly reported in the Methods and in Supplementary Table S1.

      Reviewer #1 (Significance (Required)):

      *This paper is novel and should be of significant interest to the field. It has important implications for how we think about the Golgi apparatus, and for how DNA repair pathways may be controlled. The pattern is clearly complex, with many DNA repair proteins localising to the Golgi, and some showing opposite dynamics. However, by focussing on RAD51C and giantin, the paper nicely demonstrates a novel mechanism for controlling DNA repair by these proteins. *

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Background - Eukaryotic cells rely on tightly regulated DNA repair pathways to preserve genome stability under the constant threat of both endogenous and exogenous genotoxic stress. While the nucleus, and to a lesser extent the mitochondria, is the primary site where DNA damage is detected and repaired, accumulating evidence indicates that extranuclear organelles, particularly the Golgi apparatus, play a surprisingly important role in modulating stress signaling, proteostasis, and the trafficking/activation of key DNA repair factors.

      • Emerging evidence has shown that genotoxic stress can result in a major remodeling of the Golgi apparatus; however, the crosstalk between the Golgi and the nucleus, and its contribution to the DNA damage response, remains poorly defined. The present study offers timely insight by examining the spatiotemporal behavior of DNA repair proteins that shuttle between the Golgi and the nucleus, and how this trafficking contributes to the maintenance of genomic stability.

      Main findings - The authors employed the Human Protein Atlas (HPA) project to shortlist proteins that might link Golgi-nuclear function and validated each candidate using an siRNA-mediated antibody-validation pipeline, thereby identifying 163 proteins that localize to both the Golgi and the nucleus. Bioinformatic analysis of these candidates revealed a significant enrichment for DNA damage response (DDR) regulators, including multiple factors from core DNA repair pathways, suggesting that a portion of the DDR machinery may reside in the Golgi at steady state. Interestingly, the authors observed that dual-localizing DDR proteins undergo lesion-specific redistribution between the Golgi and the nucleus in response to specific types of DNA injuries. For instance, BER and MMEJ proteins shifted from nucleus to Golgi in response to doxorubicin, whereas MMR and HR proteins redistributed from Golgi to nucleus. This trend was reversed with H2O2 or KBrO3 treatments.

      • To gain further insight into the link between the DDR and Golgi-nuclear communication, the authors focused on the HR factor RAD51C, which also plays a key role during the replicative stress response. The authors noticed that RAD51 is significantly associated with the Golgi, in addition to its known nuclear pool. Interestingly, they demonstrated that doxorubicin triggers the ATM-dependent release of this Golgi-tethered RAD51C pool and its Importin-β-mediated import into the nucleus, where it forms repair-associated foci. They further identified Giantin as the Golgi scaffold that anchors RAD51C at steady state in this subcellular compartment and showed that its depletion leads to premature nuclear accumulation of RAD51C, formation of aberrant RAD51C foci lacking canonical HR markers, reduced ATM activation, elevated genomic instability, and increased cell proliferation.

      Together, this study revealed an underappreciated and functionally meaningful spatiotemporal level of regulation within the DDR, suggesting that the Golgi, rather than functioning solely as a trafficking organelle, acts as a platform that anchors, releases, and temporally controls the availability of key DNA repair factors in response to genotoxic stress. In particular, the authors demonstrated that the timely and regulated release of RAD51C from the Golgi is essential for maintaining genome stability and is dependent on canonical DDR signaling pathways, including ATM activation and Importin-β-mediated nuclear import.

      • Overall Critique - This manuscript offers a novel and compelling perspective on the regulation of the DDR by positioning the Golgi as an active participant in the spatiotemporal control of DNA repair factors. By integrating multiple experimental layers, including a systematic localization screening, a sub-Golgi mapping, several dynamic redistribution assays, and functional perturbation read-outs, the authors built a strong and coherent case for a biologically meaningful Golgi-nucleus communication axis during the DDR. Therefore, the study is timely and highly relevant for the DNA repair field, with broader implications for our understanding of how subcellular organelles coordinate genome maintenance and cellular homeostasis.

      While the manuscript is clearly written and the figures are coherent and supportive of the main findings of the study, several issues should be addressed to ensure full interpretability and reproducibility.

      Major Comments

      *1. Limited use of agents causing genotoxic stress - The authors report intriguing lesion-specific shifts in Golgi-nuclear redistribution, yet much of the mechanistic work relies heavily on doxorubicin, a pleiotropic drug that induces diverse forms of DNA damage beyond DSBs. Expanding the core analysis of the study to include a broader panel of mechanistically defined genotoxins (e.g., etoposide, camptothecin, neocarzinostatin, or ionizing radiation) would substantially strengthen the conclusion that the trafficking patterns reflect damage-type specificity rather than drug-specific off-target effects. Such broader analysis would also clarify whether Golgi-nucleus communication responds differentially to replication-associated breaks, Topo II-dependent lesions, oxidative stress, or crosslinks. *

      Response: We thank the reviewer for this important point. We would first note that while doxorubicin is indeed pleiotropic, its primary and best-established mechanism of action is the poisoning of Topoisomerase II, leading to DNA double-strand breaks, a mechanism it shares with etoposide (van der Zanden et al, 2021; Thorn et al, 2011). The additional effects of doxorubicin, including reactive oxygen species generation and chromatin remodelling, are well-documented but secondary to this DSB-inducing activity, as we note in the revised manuscript. Nonetheless, the goal of this study was not to comprehensively map lesion-specific trafficking for every DDR protein, but rather to establish the existence of a dynamic Golgi-nucleus redistribution axis and then focus mechanistically on the validated targets, in this case RAD51C. The lesion-dependent redistribution patterns are therefore presented as an initial, hypothesis-generating observation emerging from our screening and characterisation framework. A systematic, lesion-by-lesion dissection of redistribution kinetics across the broader DDR network would represent a substantial additional study and is beyond the scope of the present work.

      Importantly, our key mechanistic observations for RAD51C are not restricted to doxorubicin. We tested a panel of genotoxic agents covering mechanistically distinct lesion classes: camptothecin (CPT; Topoisomerase I-associated replication breaks), etoposide (ETO; Topoisomerase II-dependent DSBs), and mitomycin C (MMC; interstrand crosslinks) (Figures S8A-S8I). Across all DSB-inducing agents, RAD51C consistently redistributed from the Golgi to the nucleus, demonstrating that this response is not a doxorubicin-specific off-target effect. Notably, RAD51C did not redistribute in response to oxidative lesions induced by hydrogen peroxide or potassium bromate, consistent with its established role in homologous recombination and DSB repair rather than oxidative damage pathways, as discussed in the manuscript. This lesion-type selectivity provides additional evidence that the Golgi-nuclear redistribution we observe is a biologically specific response rather than a non-selective stress effect.

      *2. Functional implications of RAD51C redistribution for HR efficiency - Although the study convincingly demonstrates a release of RAD51C from the Golgi and its subsequent nuclear foci formation, it remains unclear how this redistribution influences HR efficiency. Incorporating a functional HR assay (e.g., DR-GFP reporter, RAD51 filament assembly, or fork protection assays) would help determine whether Golgi-anchored RAD51C release is directly required for HR or instead primarily modulates upstream DDR signaling. *

      Response: We thank the reviewer for this important suggestion. We have performed DR-GFP reporter assays to directly assess HR efficiency following Giantin and RAD51C depletion. Depletion of Giantin reduced HR efficiency to approximately 60% of control levels, and RAD51C depletion to approximately 40%, consistent with the HR reduction previously reported in the genome-wide HR screen (Adamson et al, 2012). Co-depletion of Giantin and RAD51C reduced HR to levels comparable to RAD51C depletion alone, suggesting that the effect of Giantin on HR is mediated primarily through RAD51C, consistent with RAD51C being the key effector of the Giantin-dependent spatial regulatory mechanism we describe. These data are included in the revised manuscript (lines 455-465; Figure 5L).
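      The epistasis reasoning in this response can be made explicit with a short worked calculation. The sketch below is illustrative only: the multiplicative expectation for two independent depletions is one common null model and is an assumption here, not a method stated in the response; the co-depletion value of 0.40 is likewise assumed from the phrase "comparable to RAD51C depletion alone"; and the function names and the 0.05 tolerance are hypothetical.

```python
def expected_if_independent(single_a, single_b):
    """Multiplicative expectation if two depletions acted independently.

    Inputs and output are HR efficiencies as fractions of control.
    ASSUMPTION: the multiplicative null model is chosen for illustration.
    """
    return single_a * single_b


def looks_epistatic(double, stronger_single, expected, tolerance=0.05):
    """True if the double depletion matches the stronger single depletion
    and sits clearly above the independent (multiplicative) expectation."""
    matches_single = abs(double - stronger_single) <= tolerance
    above_expected = (double - expected) > tolerance
    return matches_single and above_expected


giantin_kd = 0.60  # from the response: ~60% of control
rad51c_kd = 0.40   # from the response: ~40% of control
double_kd = 0.40   # ASSUMPTION: "comparable to RAD51C depletion alone"

expected = expected_if_independent(giantin_kd, rad51c_kd)
epistatic = looks_epistatic(double_kd, rad51c_kd, expected)
```

      Under these assumptions, independent action would predict roughly 24% of control, so a co-depletion value near 40% is what one would expect if Giantin and RAD51C act in the same pathway.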

      *In addition, the manuscript does not fully reconcile how Golgi-tethering of RAD51C fits with its well-established nuclear roles during replication stress, where timely availability of RAD51C is essential for fork stabilization and restart. *

      Response: We agree that the nuclear function of RAD51C during replication stress is well established and important to reconcile with our findings. Our imaging data consistently show a detectable nuclear RAD51C population at steady state across all cell lines examined, and we do not propose that RAD51C is exclusively Golgi-localised. We suggest that the two pools serve distinct functional purposes: the constitutive nuclear pool supports ongoing replication fork stabilisation and restart, processes that require RAD51C availability independently of acute DNA damage, while the Golgi-tethered fraction represents a damage-responsive reserve that is released acutely upon DSB induction in an ATM-dependent manner. We wish to be transparent that this two-pool model is speculative at present: formally distinguishing the contributions of each pool would require direct labelling of the Golgi-anchored fraction, which was not technically feasible in this system as discussed above. Nonetheless, this model is consistent with established principles of signal-responsive protein sequestration in cell biology, and is directly supported by our Giantin depletion data: premature release of the Golgi pool leads to aberrant nuclear RAD51C foci lacking canonical HR markers and impaired ATM signalling, demonstrating that unscheduled nuclear accumulation is actively detrimental rather than simply redundant. We have added a paragraph to the revised Discussion explicitly framing the two-pool distinction as a working model and identifying direct pool-identity tracking as an important future direction (lines 566-587).

      *3. Specificity of Giantin-related phenotypes - The phenotypes observed upon Giantin depletion (e.g., increased micronuclei, comet tail moments, impaired ATM signaling, and elevated proliferation) could partially reflect a global dysfunction of the Golgi rather than RAD51C-specific tethering defects. Although co-depletion of RAD51C provides partial rescue, additional controls examining Golgi integrity, trafficking competence, or rescue with siRNA-resistant Giantin would help confirm specificity and distinguish direct from indirect effects. *

      Response: We thank the reviewer for raising this important concern, which was a central consideration throughout our investigation. We address it through three complementary lines of evidence.

      First, regarding Golgi structural integrity and trafficking competence: as previously reported, Giantin depletion has not been associated with strong Golgi fragmentation or major morphological alterations (Koreishi et al, 2013; Bergen et al, 2017; Stevenson et al, 2021), and we observed no significant Golgi fragmentation upon Giantin knockdown in our system. Consistent with the literature, Giantin has been implicated in specific cargo trafficking, most notably collagen secretion, rather than general secretory pathway function (Stevenson et al, 2021). To directly confirm that general Golgi trafficking competence was preserved in our experimental system, we performed the VSV-G-YFP trafficking assay (Presley et al, 1997), a well-established functional readout of general secretory trafficking. Giantin depletion did not result in a significant change in trafficking efficiency compared to control siRNA (Rebuttal Figure 1), consistent with the literature and arguing against a general collapse of Golgi function as the basis for the phenotypes observed.

      Rebuttal Figure 1. VSV-G-YFP trafficking assay.

      (A) Representative images of cells treated with control siRNA or giantin siRNA. Nuclei are stained with Hoechst. Total VSV-G-YFP (YFP-tsO45G) signal is shown together with antibody staining against VSV-G in non-permeabilized cells to assess cell surface levels. Scale bars, 10 μm.

      (B) Quantification of VSV-G trafficking from two independent biological replicates.

      Second, the phenotypes are RAD51C-dependent and not a generic Golgi dysfunction: the genomic instability and DDR signalling defects we observe upon Giantin depletion are not phenocopied by GMAP210 depletion, another Golgin family member, indicating that the phenotypes are not a generic consequence of Golgin loss. Critically, we now directly demonstrate using the DR-GFP reporter assay that Giantin depletion reduces HR efficiency to approximately 60% of control, and that co-depletion of RAD51C produces no further reduction beyond RAD51C depletion alone, consistent with RAD51C epistasis over Giantin for HR capacity (Figure 5L). This functional epistasis, together with the physical interaction between Giantin and RAD51C by co-immunoprecipitation, their co-localisation within the same Golgi sub-compartment, and the partial rescue of ATM phosphorylation, micronuclei formation and proliferation phenotypes upon RAD51C co-depletion, provides a coherent mechanistic chain linking Giantin specifically to RAD51C-dependent DDR outcomes. While we cannot formally exclude indirect contributions from other Giantin-associated factors, none of our observations are consistent with the phenotype arising from non-specific Golgi perturbation.

      Third, Giantin may play a broader role in connecting DDR signalling to cytoplasmic and Golgi-resident processes, beyond RAD51C tethering alone: we consider this a feature of the biology rather than a confound. Golgins are well established as multi-cargo scaffolding platforms, and Giantin in particular occupies a strategic position where several processes converge: the tethering of DDR factors, the regulation of damage-induced signalling cascades, and the directional trafficking of repair factors between compartments. This would explain why Giantin depletion produces a phenotype that extends beyond what RAD51C co-depletion alone can fully rescue, and is consistent with the pathway-level coherence we observe across our screen. Understanding the full complement of Giantin-associated DDR interactions represents one of the most compelling directions emerging from this work.

      In response to this comment, we have expanded the Discussion (lines 545-565) to explicitly propose that Giantin functions as a broader organisational node coordinating multiple DDR factors, while our data specifically and consistently implicate RAD51C as a primary conduit.

      *4. Positioning of ATM in the Golgi-nuclear signaling - While ATM inhibition prevents RAD51C release, its spatial and mechanistic basis of this regulation remains obscure. It is not clear whether ATM acts locally at the Golgi, through cytoplasmic pools, or indirectly via nuclear feedback signaling. Clarifying or discussing this point in more depth would improve the mechanistic coherence of the proposed model. *

      Response: We thank the reviewer for raising this important mechanistic question. The spatial basis of ATM action at the Golgi is indeed an emerging and exciting area of cell biology. A growing body of evidence demonstrates that ATM associates with the Golgi membrane through binding to phosphatidylinositol-4-phosphate (PI4P), and that this Golgi-resident pool modulates the magnitude and kinetics of the nuclear DDR (Ovejero et al, 2023). Importantly, the most recent work in this area demonstrates that Golgi-associated ATM is not merely a passive reservoir but is enzymatically active and capable of phosphorylating Golgi-resident substrates (Soulet et al, 2026), providing a compelling mechanistic basis for how damage-induced ATM signalling could reach the Golgi to license RAD51C release.

      To directly examine whether ATM localises to the Golgi in our system and whether its activation state changes upon DNA damage, we performed a biochemical Golgi enrichment assay using the Minute™ Golgi Apparatus Enrichment Kit (Cat #: GO-037) to examine ATM distribution across cis- and trans-Golgi fractions. Fraction purity was validated using GM130 (cis-Golgi), TGN46 (trans-Golgi), and HSP60 (membrane fraction) (Rebuttal Figure 2A). This analysis revealed that ATM is detectable in the total membrane fraction and enriched in the cis-Golgi fraction under basal conditions (Rebuttal Figure 2A). Under normal physiological conditions, activated ATM (pATM) was absent from Golgi-enriched fractions (Rebuttal Figure 2B), but was detectable in the cis-Golgi fraction following doxorubicin-induced genotoxic stress (Rebuttal Figure 2C). While these observations are preliminary and require further validation, they are consistent with the emerging literature and raise the intriguing possibility that ATM is recruited to and activated at the Golgi in a damage-dependent manner, where it could act locally to license RAD51C release.

      Rebuttal Figure 2. Biochemical Golgi fractionation confirms ATM enrichment in cis-Golgi compartments.

      *Western blot of HeLa-K fractions enriched for cis- and trans-Golgi membranes, probing for (A) ATM under basal conditions, and for pATM (B) under basal conditions and (C) after treatment with DOX (40 μM). Markers: GM130 for cis-Golgi, TGN46 for trans-Golgi, HSP60 for membrane fraction (MEM).*

      We consider the precise spatial and mechanistic dissection of ATM signalling at the Golgi, and its relationship to nuclear feedback, to be one of the most exciting directions to emerge from this work, and one that we hope our study has helped to open. We have expanded the Discussion (lines 525-543) accordingly to place our findings in the context of the emerging Golgi-ATM literature and to frame this as an important unresolved question for future investigation.

      *5. RAD51C is examined in silo, without consideration for the BCDX2 complex - RAD51C is exclusively analyzed in isolation, despite its well-established function as part of the BCDX2 paralog complex (RAD51B-RAD51C-RAD51D-XRCC2). Because RAD51C does not normally operate as a standalone factor, it is unclear why only RAD51C, among all paralogs, would be subjected to Golgi tethering, ATM-dependent release, and Importin-β-driven nuclear import. This raises important mechanistic questions: Are other BCDX2 members also Golgi-associated? Do they undergo similar trafficking dynamics? Does Golgi tethering selectively regulate RAD51C, or does the complex translocate together? Addressing these points would greatly strengthen the biological plausibility and mechanistic coherence of the proposed model. *

      Response: We thank the reviewer for raising this important point. We fully agree that RAD51C functions as a core component of the BCDX2 (RAD51B-RAD51C-RAD51D-XRCC2) and CX3 (RAD51C-XRCC3) paralog complexes, and that its canonical roles in HR and replication fork protection occur within these assemblies. Our decision to focus on RAD51C was driven by the screening data: of the DDR proteins identified, RAD51C displayed the most robust Golgi-associated pool, the clearest damage-induced redistribution dynamics, and a tractable anchoring interaction with Giantin that could be interrogated biochemically.

      We would also note that extending this analysis to other RAD51 paralogs is not straightforward with current tools. The available commercial antibodies against RAD51B, RAD51D and XRCC2 perform poorly in immunofluorescence applications, and most localisation studies for these proteins have relied on overexpression of tagged constructs, a strategy that, as discussed above, risks perturbing both localisation and complex assembly. The lack of reliable antibodies for endogenous paralog detection at the resolution required for Golgi localisation analysis represents a genuine technical barrier that we encountered directly during this study.

      Whether Golgi association and ATM-dependent release involve RAD51C alone or extend to other BCDX2 or CX3 members is therefore a genuinely open and important question. We note that our co-immunoprecipitation experiments were performed on total cell lysate and cannot distinguish whether the Golgi-associated RAD51C is complexed with other paralogs or represents a monomeric subpopulation. Golgins are well established as multi-cargo scaffolding platforms, and it is entirely plausible that Giantin organises a broader paralog module rather than tethering RAD51C as an isolated subunit. A systematic analysis of RAD51 paralogs for Golgi localisation and lesion-dependent trafficking, enabled by improved tools such as proximity labelling or endogenous tagging approaches compatible with essential proteins, would determine whether the BCDX2 complex translocates as a unit or whether individual subunits are differentially regulated, with potentially distinct consequences for HR fidelity. We have revised the manuscript accordingly and identify this as an explicit priority for future work in the revised Discussion (lines 583-602).

      Minor Comments

      1. Pathway-specific sub-Golgi localization patterns - The finding that DDR proteins map to distinct cis/trans Golgi subdomains is an interesting and potentially important observation. However, the dataset is limited to 15 proteins, making the proposed pathway-level trends (e.g., HR factors enriched in cis-Golgi; BER/MMEJ factors enriched in trans-Golgi) preliminary. Strengthening this conclusion by increasing the number of DDR proteins analyzed would help determine whether sub-Golgi compartmentalization contributes meaningfully to DNA repair pathway regulation.

      Response: We thank the reviewer for this constructive suggestion. We agree that extending sub-Golgi mapping to a larger number of DDR proteins would be valuable, and we present the current dataset explicitly as a first, hypothesis-generating map rather than a definitive pathway atlas.

      We would like to highlight, however, that the value of this observation lies not simply in the number of proteins mapped, but in the biological coherence of the patterns that emerge. Proteins from the same repair pathway tend to occupy the same Golgi sub-compartment (BER and MMEJ factors enriching in the trans-Golgi, HR factors in the medial/cis-Golgi), and this sub-compartmental positioning correlates with the direction of their redistribution upon genotoxic stress. Such a pattern would be unlikely to arise by chance across 15 independently validated proteins. This internal consistency argues that the sub-Golgi organisation reflects genuine pathway-level biology rather than noise, even if the dataset is not yet exhaustive. Together with the bioinformatic network analysis, which independently supports pathway-level clustering across the broader validated hit list, these observations reinforce each other as complementary layers of evidence.

      2. Is the Golgi-released RAD51C indeed the pool that enters the nucleus? The major assumption of the study is that the RAD51C population released from the Golgi upon DNA damage is the same pool that subsequently accumulates in the nucleus to form repair foci. While the imaging and fractionation data are consistent with this model, the study does not directly track or distinguish Golgi-derived RAD51C from cytoplasmic or pre-existing nuclear pools. Without a method to specifically label, pulse-chase, or track the Golgi-anchored fraction, it remains formally possible that nuclear RAD51C originates from other subcellular reservoirs.

      Response: We thank the reviewer for highlighting this important mechanistic point, which we agree cannot be fully resolved with the current dataset. Several independent lines of evidence are nonetheless consistent with a model in which the Golgi-associated pool contributes directly to damage-induced nuclear accumulation.

      • Our time-resolved imaging demonstrates a reciprocal decrease at the Golgi and a concurrent increase in the nucleus following genotoxic stress, consistent with redistribution rather than independent compartment-specific changes (Figures 3E-3I).
      • Biochemical fractionation provides an orthogonal readout of the same reciprocal shift under identical conditions (Figures 3J and S6D).
      • ATM inhibition simultaneously prevents Golgi loss and blunts nuclear accumulation, while Importin-β perturbation blocks nuclear entry, together supporting an active and regulated translocation route (Figures 3K-3P).
      • Giantin depletion, which releases the Golgi-tethered RAD51C pool prematurely, leads to aberrant nuclear RAD51C foci lacking canonical HR markers and impaired ATM signalling, strongly supporting that the Golgi-tethered fraction has functional consequences in the nucleus consistent with it being the relevant pool (Figures 4B-4E and 4J-4M).
      • In the revised manuscript we have included cytoplasmic RAD51C signal quantification across the doxorubicin time course (Figure 3H). The cytoplasmic signal shows only a moderate and gradual reduction that is kinetically distinct from the sharp Golgi decrease and does not precede the nuclear increase. This pattern is inconsistent with a large pre-existing cytoplasmic reservoir driving the nuclear accumulation; if the cytoplasmic pool were the primary source, one would expect a rapid and prominent cytoplasmic decrease coinciding with or preceding nuclear accumulation, which we do not observe. Instead, the data are more consistent with rapid transit of Golgi-released RAD51C through the cytoplasm rather than stable cytoplasmic accumulation prior to nuclear entry. We acknowledge that definitive pool-identity tracking would require spatially restricted labelling approaches such as Giantin-proximal TurboID or photoactivatable tagging strategies, which are precluded by the technical constraints on RAD51C tagging described above. We have revised the manuscript to avoid overstatement on this point and identify these approaches as important future directions (lines 297-305 & lines 715-719).

      Reviewer #2 (Significance (Required)):

      General assessment - This study presents a novel and conceptually compelling view of the DNA damage response (DDR) by positioning the Golgi apparatus as an active regulator of the spatiotemporal availability of DNA repair factors. The strongest aspects of the work include its integration of a systematic immune-localization screening, a sub-Golgi compartment mapping, dynamic redistribution assays, and functional perturbations to build a coherent model of Golgi-nucleus communication during genotoxic stress. The mechanistic focus on RAD51C provides a clear case study linking organelle-level regulation to genome stability.

      • Advance - To my knowledge, this is the first comprehensive demonstration that the Golgi can serve as a spatiotemporal coordination node for DDR proteins, including those involved in HR. The identification of a substantial pool of RAD51C, and reportedly other DDR factors, anchored within specific Golgi subdomains represents a significant conceptual advance. The demonstration that Golgi-tethered RAD51C is released in an ATM-dependent manner and subsequently participates in nuclear foci formation suggests a previously unrecognized organelle-level regulatory checkpoint in genome maintenance. This work therefore extends current models of the DDR by revealing a layer of intracellular coordination that bridges classical nuclear pathways with cytoplasmic organelle function.

      • Audience - This study will be of strong interest to a specialized audience in the fields of DNA repair, genome stability, and cell biology, particularly those studying the spatial organization of repair pathways and intracellular stress signaling. It will also appeal to researchers investigating organelle biology, intracellular trafficking, and the broader coordination of cytoplasmic and nuclear responses to stress. Beyond these communities, the work may be relevant to cancer, as it suggests new mechanisms by which organelle perturbations or Golgi-associated scaffolding proteins could influence therapeutic responses or genomic instability.

      Reviewer expertise - Field of expertise: DNA repair, genome stability, organelle biology, cancer cell biology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *This study investigates the communication between the Golgi complex and the nucleus of the cell, which remains a largely unexplored field. The authors used publicly available siRNA and antibody data from the Human Protein Atlas as a basis for finding overlap between the proteomes of the two cellular compartments. In validating the data from the HPA, the study finds a novel cluster of DNA repair proteins present in the Golgi, which they validate and resolve to sub-compartmental localization. To do so they use immunofluorescence (IF) localization on cis- and trans-Golgi cisternae marked by GM130 and TGN46, respectively. The authors find that many of the fully validated proteins present in both the nucleus and Golgi redistribute between the Golgi and the nucleus dependent on the protein and the type of DNA lesion. They focused on RAD51C, a recombination factor. They show that RAD51C resides in both the cis- and trans- subsections prior to damage and responds to DNA damage in an ATM-dependent manner via release of a Golgi-based pool bound to Giantin, which is then imported into the nucleus via Importin-β. Knockdown experiments showed that Giantin regulates RAD51C spatially and temporally. The work reveals a dynamic interchange of proteins between the Golgi and nucleus that controls cell functions beyond the classic secretory, membrane trafficking, and PTM roles of the Golgi. The authors build on prior work on Golgi impacts on DDR, offering an alternative cellular compartment for storage of DDR factors prior to damage. Overall, the data is timely and relevant, as it finds new roles for the Golgi in DNA damage response (DDR) regulation. The data is largely convincing and well controlled. The IF data is presented in black and white single channels and merged in color, which allows good comparison of the different protein stains. 
The scope of the initial screen of HPA antibodies and Golgi/Nuclear dual proteomes is impressive, and the overlap of DDR proteins is characterized for fifteen different proteins at a sub-compartmental level. The focus on RAD51C as a member of the HR pathway was a strong choice, and the study presents interesting information on its regulation by Golgi complex members, as well as a feedback loop with pATM. The possibility of the Golgi storing specific DDR factors in specific compartments is well-supported and intriguing. There are a few major and minor points that should strengthen the paper and improve clarity prior to publication.*

      Major Comments:

      *1. Much of the strength of the IF data is lost in the choice of scale for presentation of the data. In almost all cases, enlarged sections should be shown of the areas currently indicated by arrows, in all channels. This is done well in Figure 3A, where an area of the Golgi is enlarged and the overlap of RAD51C in the GM130-marked Golgi is clearly visible in the merged channel, even when printed out. I would highly recommend including the white box and enlarged inset in all images and channels, while keeping the representative fields as is (e.g. if the image is 40mm, draw a 7mm box around representative cells/Golgi, and enlarge to 15mm in the bottom left). This change should be made to F1E, F2F, F3E, F3J, and F3M, as well as having enlarged figures in the corners in all supplementary data IF figures. Where possible, a fully enlarged image of the bounding box could also be included. Some of the IF data would be strengthened by using the nuclei stain to draw a masking outline to include in the black and white channels, to clearly delineate what is Golgi-localized and what is nuclear.*

      Response: We thank the reviewer for this helpful suggestion and fully agree that enlarged insets substantially improve the visibility of Golgi-localised signal, particularly when figures are printed. We share the reviewer's view that alternative display formats with larger insets would be preferable, and we have implemented enlarged boxed regions wherever space constraints permitted.

      Specifically, we have added boxed regions with enlarged insets to Figure 1E and to all panels of Figure 3. For Figure 2, the number of conditions and proteins displayed simultaneously within the constraints of standard journal figure dimensions made it impractical to include enlarged insets for all panels without reducing the overall field size to the point of losing contextual information. We have nonetheless improved the visibility of the Golgi signal in Figure 2 as much as possible within these constraints, and note that the final figure layout will be further optimised in line with the journal's specific formatting guidelines. In addition, all figures have been provided as high-resolution image files to allow electronic magnification, enabling readers to inspect the Golgi-localised signal in detail beyond what is visible in the printed version.

      Regarding the use of nuclear outline masks in single-channel images, we tested this approach but found that given the number of structures present within each field, including Golgi stacks, nuclear foci, and cytoplasmic signal, overlaying nuclear outlines on individual channels added visual complexity that made the images harder rather than easier to interpret. As an alternative, we have included a full-colour merged panel, when possible, which we consider a cleaner way to delineate nuclear versus Golgi-localised signal and allows the reader to directly compare compartment-specific distributions across channels.

        2. *There is a lack of consistency in the representative images shown by IF. For example, Figure 1 gives the impression of very little RAD51C in the nucleus but this is rightly shown to not be the case in Supp. Fig 2A. The same is true of the various images of LIG1. The authors should use representative data that better reflects the distribution of the proteins being studied and maintain consistency across images. If there is a lot of variation in staining patterns, the authors should show images and percentages corresponding to the variations, especially for the key gene studied, RAD51C.*

      Response: We agree and have replaced the representative IF panels for RAD51C and LIG1 with images that better reflect the quantified distributions across biological replicates. The revised panels were selected to match the quantified compartment intensities shown in the accompanying graphs rather than representing outlier cells. We would also note that the apparent discrepancy between Figure 1E and Supplementary Figure S2A partly reflects a difference in imaging conditions: Supplementary Figure S2A and Figure 2F were acquired directly from the high-content screening pipeline under uniform, non-optimised antibody and fixation conditions at widefield resolution, whereas Figure 1E shows representative single optical section confocal images acquired after candidate identification with antibody conditions optimised for each individual protein. The improved signal-to-noise in the optimised confocal images more faithfully captures the dual Golgi and nuclear localisation of RAD51C, and the apparent difference between the two image sets is therefore expected rather than inconsistent. We have updated the figure legends to clarify the imaging modality and conditions for each panel. Furthermore, the quantified distribution of RAD51C across Golgi, nuclear and cytoplasmic compartments across multiple cell lines is shown in Figure 3B and 3D, providing a population-level representation of the dual localisation that complements the representative images shown in Figure 1E.

        3. *The initial siRNA-mediated knockdown screening pipeline, which validated and confirmed dual Golgi and nuclear localization of 163 of the 329 dual-localization HPA proteins, does not have any data included. This seems like a very large amount of data to gloss over and not include even as supplementary data. This should be included as source data, and discussion of the in-text information should be strengthened. The data included with the networking of these validated proteins is strong, but the process of elimination and validation has not been shown. In addition, the antibody information in the supplementary data does not include dilution factors or blocking conditions, which would be beneficial to include for future studies.*

      Response: We agree and have addressed this in full. We note that the HPA antibody validation data, including immunofluorescence images and siRNA knockdown results, are publicly available for inspection on the Human Protein Atlas website (www.proteinatlas.org) for the majority of candidates, providing an independent layer of verification. In the revised submission, we additionally provide the complete siRNA-mediated validation dataset generated in our laboratory as source data (Table S1; lines 1025-1041), including for each candidate the HPA antibody identifier, gene symbol, Ensembl ID, antibody staining pattern, siRNA identifier, cell number per replicate, and normalised Golgi and nuclear signal ratios for both experimental replicates. This allows readers to inspect the validation metrics directly and apply alternative thresholds if desired. We have also expanded the antibody information to include diluent conditions (4% FBS in 0.1% Triton-X100 for all HPA antibodies used at 2 μg/ml in the screening pipeline), enabling reproducibility and reuse of the dataset by the community.

        4. *The authors should expand upon the paragraph in lines 155-162 to include more discussion of Figures S2A and S2B. The expanse of this data is some of the strongest in the paper, and it should be further discussed in-text. Also, the rationale behind the choice of the specific proteins that are included in these analyses/figures is not always clear in-text, and more attention should be spent on the narrowing down of the analysis to the final proteins. This is especially important as many of the DDR proteins chosen are not the most common DDR proteins. Also note in text that the Golgi marker GM130 (presumably) was used for the screening, which means that some proteins which only localize to the TGN46 trans Golgi might have been lost in the validation step (or, explain why this is not the case).*

      Response: We expanded the Results text (lines 141-163) to discuss Figures S2A and S2B in more depth and clarified the rationale for selecting the final set of DDR proteins taken forward, including considerations of pathway representation, bioinformatic annotations, and literature-described roles in DNA repair. We would also note that the identity of the DDR proteins identified in this screen was determined by the HPA dataset and the unbiased validation pipeline rather than by prior assumptions about which repair factors would be present at the Golgi. The presence of less commonly studied DDR factors is therefore a direct reflection of the screen output, and we consider this one of the strengths of the approach.

      We would also like to address the reviewer's concern about potential GM130-based bias directly: at the widefield or confocal resolution used in the high-content screening pipeline, the Golgi apparatus appears as a single perinuclear structure and cis- and trans-Golgi subdomains cannot be resolved. GM130 was therefore used purely as a segmentation marker to define the Golgi compartment as a whole rather than to selectively label the cis-Golgi cisternae. The resulting Golgi mask captures signals from the entire Golgi ribbon, including trans-Golgi regions, meaning that proteins with exclusively trans-Golgi localisation would not have been systematically excluded at the screening stage. Sub-compartmental resolution of cis versus trans localisation was only possible in subsequent analyses using nocodazole-dispersed mini-stacks imaged by confocal microscopy with co-staining for both GM130 and TGN46.

      *5. The relationship between Giantin loss, increased cell proliferation, and elevated endogenous DNA damage as it relates to RAD51C remains insufficiently resolved and requires further clarification. Several of the proliferation assays used are not optimal for addressing changes in cell growth. For example, Figure 5O appears to quantify cell numbers by counting fields from IF images, which is an unconventional approach. This should be done by growth curves, luminescent viability or colony formation assays. In addition, this point will be greatly strengthened by performing rescue experiments for Giantin directly (instead of co-depletion as a means of rescue) and/or using a mutant of RAD51C that does not bind to Giantin. If these additional experiments are beyond the current scope, the conclusions should be softened in the discussion. *

      Response: We thank the reviewer for raising these important points, which we address in turn:

      Giantin-RAD51C relationship and mechanistic interpretation. We acknowledge that establishing the full causal chain between Giantin loss, RAD51C mislocalisation, elevated endogenous DNA damage and increased cell proliferation is challenging within the scope of a single study, and we discuss this openly in the Discussion (lines 555-564). Our evidence collectively includes: physical interaction between endogenous Giantin and RAD51C by co-immunoprecipitation (Figures 4H and 4I), premature nuclear accumulation of RAD51C upon Giantin depletion (Figures 4B-4E and 4J-4M), a new experiment showing direct reduction of HR efficiency in the DR-GFP assay (Figure 5L), impaired ATM signalling (Figures 5J and 5M), elevated genomic instability (Figures 5A-5E), and epistatic rescue by RAD51C co-depletion (Figures 5M-5P). These observations are further contextualised by the established literature on RAD51C function: RAD51C is known to regulate CHK2 phosphorylation and cell cycle checkpoint signalling (Badie et al, 2009), stabilise replication forks (Somyajit et al, 2015), and promote RAD51 filament formation required for DSB repair (Prakash et al, 2015). Dysregulation of these functions through Giantin-dependent mislocalisation provides a mechanistically coherent explanation for the elevated genomic instability and altered proliferation we observe, and is entirely consistent with our model. Together, the experimental evidence and the published biology of RAD51C support a model in which Giantin spatially regulates RAD51C to maintain proper DDR signalling and HR capacity.

      We agree that separation-of-function tools would further strengthen this model and identify these as important future priorities. We wish to note however that both approaches face substantial technical barriers in this system. As described in our response to Reviewer 1 Major Comment 1, RAD51C tagging, whether by CRISPR-mediated endogenous editing or ectopic expression, consistently compromised cell viability and protein function, precluding the generation of interaction-deficient variants at physiological expression levels. Engineering an interaction-deficient Giantin mutant presents an independent and considerable challenge: Giantin is one of the largest Golgi matrix proteins (~376 kDa), composed almost entirely of extended coiled-coil domains that are intrinsically difficult to model structurally, and identifying a discrete interaction interface with RAD51C without disrupting the broader scaffolding function of the protein would require a dedicated structural and biochemical programme. We therefore consider these important but substantial future directions rather than straightforward experimental additions to the current study.

      Proliferation assays. Colony formation assays provide a rigorous readout of long-term proliferative capacity, and these data are presented for single knockdown conditions in Figures 5F-5I. The cell number quantification in Figure 5P was specifically included to assess the double knockdown of Giantin and RAD51C simultaneously, a condition not covered by the colony formation assay. We respectfully note that automated fluorescence microscopy-based nuclear counting is a well-established approach for measuring cell proliferation in siRNA screening contexts. Nuclear counting from high-content imaging has been used as a direct readout of cell growth and proliferation in RNAi screens (Boutros et al, 2004; Martin et al, 2014; Garvey et al, 2016; Mikheeva et al, 2024), and has been shown to produce results comparable to or superior to conventional viability assays including MTT and flow cytometry-based methods (Mikheeva et al, 2024). We have nonetheless clarified in the revised figure legend that Figure 5P reports relative cell number quantified by automated nuclear counting from high-content imaging fields as a secondary concordant measure alongside the colony formation data, rather than a standalone proliferation assay.

      *6. It is unclear from the discussion and from presented data whether proteins are directly transported between the Golgi and the nucleus, or whether they go into the cytoplasm for a transient period, presumably when they could interact with Importin β. There is also some data where cytoplasm signal could be quantified to address this (Figure 3E-I). *

      Response: We thank the reviewer for this mechanistic point. In the revised manuscript we have included cytoplasmic RAD51C signal quantification alongside Golgi and nuclear measurements for the doxorubicin time course (lines 297-305; Figure 3H). The cytoplasmic signal shows a moderate and gradual reduction distinct in both magnitude and kinetics from the sharp Golgi decrease, consistent with a transient cytoplasmic intermediate rather than a stable pool. Regarding the identity of the translocating pool, two observations directly support a Golgi origin. First, Importazole treatment prevents RAD51C release from the Golgi following genotoxic stress and simultaneously reduces nuclear RAD51C foci formation, demonstrating that Importin-β-mediated import is required both for Golgi clearance and for productive nuclear accumulation. Second, Giantin depletion, which prematurely releases the Golgi-tethered pool, leads to aberrant nuclear RAD51C foci, directly linking the Golgi-anchored fraction to nuclear accumulation. Together these data support a model in which Golgi-resident RAD51C transits through the cytoplasm for Importin-β-mediated nuclear import. We acknowledge that without direct labelling of the Golgi-anchored fraction, the precise contribution of each subcellular pool to the nuclear accumulation cannot be fully resolved with the current dataset. We discuss the development of appropriate tagging strategies as an important future direction to dissect the dynamics of this process in further detail.

      *7. Statistical analysis of experiments with more than two samples needs to be performed with ANOVA and a follow-up post-hoc test, not with a two-tailed unpaired Student's t-test, which only compares the control and each individual sample. This type of analysis inflates the Type 1 error rates (false positives) in your datasets. For example, the two-tailed unpaired Student's t-test is appropriate in Figure 2F-H, but not in Figure 3 when the samples are timepoints. In this case, use a one-way ANOVA with Tukey's post-hoc test (if you want to show all comparisons), or Bonferroni/Sidak (if you only need to compare several samples).*

      Response: We agree with the reviewer and thank them for highlighting this important statistical issue. We have revised the statistical analysis for all experiments involving more than two groups to avoid inflation of Type I error rates caused by multiple pairwise Student's t-tests. Specifically, for Figures 3F-I, 4C-E, and Figure 5, the data were reanalysed using one-way ANOVA followed by the appropriate multiple comparisons post hoc test. The Methods section and corresponding figure legends have been updated to clearly state the statistical tests used for each dataset.
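
      The Type I error inflation discussed in this comment can be illustrated with a short calculation. The following is a minimal sketch and not the analysis code used in the manuscript: it assumes the pairwise tests are independent (comparisons against a shared control are not strictly independent, so the numbers are illustrative only), and the function names `familywise_error_rate` and `bonferroni_alpha` are hypothetical.

```python
# Illustrative only: family-wise error rate (FWER) for m uncorrected tests,
# under the simplifying ASSUMPTION that the tests are independent.

def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the FWER at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    for m in (1, 3, 5):
        uncorrected = familywise_error_rate(alpha, m)
        corrected = familywise_error_rate(bonferroni_alpha(alpha, m), m)
        print(f"{m} tests: uncorrected FWER = {uncorrected:.3f}, "
              f"Bonferroni-corrected FWER = {corrected:.3f}")
```

      Under this assumption, five uncorrected comparisons at a threshold of 0.05 give roughly a 23% chance of at least one false positive, which is the motivation for an omnibus test followed by a corrected post hoc comparison.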

      Minor Comments:

      General

      1. Throughout the text, the reference to many figures and supplementary figures in the same sentence, with little discussion of the data therein, makes it hard to follow. In-text referencing is particularly confusing in the section "Dual-localising DDR proteins dynamically redistribute between the Golgi and nucleus in response to specific types of DNA injuries," where the reader is switching between multiple figures and supplementary figures.

      Response: We thank the reviewer for this helpful comment. In the revised manuscript, we have improved the readability of the text and revised the figure references to make them clearer. We hope these revisions make the manuscript easier to follow and allow readers to better inspect the figures.

      1. In figures that display technical replicates as individual data points, consider distinguishing each replicate by using different marker shapes (e.g., repeat 1 = upright triangle; repeat 2 = inverted triangle; repeat 3 = diamond). This would provide additional clarity regarding the consistency and repeatability of each technical repeat.

      Response: We thank the reviewer for this suggestion. We have updated the data presentation to distinguish biological replicates using different marker shapes in datasets where replicate tracking is of particular relevance to the interpretation. For datasets where individual replicate values are already clearly separable, we have maintained the existing presentation to avoid unnecessary visual complexity.

      1. Make sure all western blot data includes the marker size (F3C and F5L have none, F4H/I have size of proteins not size of markers).

      Response: We have added the missing marker sizes to the western blot data in the revised manuscript.

      1. Be consistent with use of capitalization in figure legends and graph/figure labels.

      Response: We have made capitalisation consistent across figure legends and graph and figure labels in the revised manuscript.

      Figure 2

      In Figure 2A, please include in the figure itself that GM130 is the cis Golgi, and TGN46 is the trans Golgi (Figures should not be dependent on the text for full understanding).

      Response: We revised Figure 2A and 2C to label GM130 as cis-Golgi and TGN46 as trans-Golgi within the figure, making it self-explanatory.

      1. Why are LRIG2 and LRRIQ3 not included in the 2E cis vs trans Golgi data, when all other proteins from F1D are included? Include, or comment on in-text.

      Response: Both LRIG2 and LRRIQ3 are included in Figure 2E in both the original and revised manuscript.

      1. Be sure to include scale bar data in each figure legend (F2A-E is currently missing it), and include updated scale bars in the enlarged insets.

      Response: Scale bar data is now included in each figure legend in the revised manuscript.

      1. In Figure 2F, make sure that the merged green channel is presented at the same intensity as it is in the single black and white channel, as the green looks very overexposed in several of the merged (CCAR1 DMSO merged is the most noticeable).

      Response: We agree and thank the reviewer for pointing this out. We have corrected the issue by updating all image panels in the figure.

      1. In Figure 2G, include the grey label in the figure legend.

      Response: We thank the reviewer for this comment. The grey label has now been included in the figure legend in the revised manuscript.

      1. In Figure 2G-H, the method of data presentation in the graphs coupled with the statistical analysis is confusing and should be expanded upon in the legend.

      Response: We agree that the amount of data presented may appear overwhelming. In the revised figure, we have adjusted the placement of the statistical annotations to improve clarity. We have also improved the figure legend to make the figure easier to read and interpret.

      Figure 3

      Figure E/F/G: Is there cytoplasmic quantification as well? Your rationale is that the Golgi RAD51C goes into the nucleus, but via the cytoplasm (due to Importin β import); do you see the cytoplasmic levels increase? Or is it too dilute to notice a difference? At least, this omission needs to be mentioned in-text.

      Figure H/I also include the quantification of the cytoplasmic fraction. It is mentioned in-text on line 272, but not quantified. This comes up as a big question: Do the proteins go directly between the Golgi and nucleus, or do they go through the cytoplasm?

      Response: We thank the reviewer for both of these related points. As described in our response to Major Comment 6 above, we have added cytoplasmic RAD51C signal quantification to the doxorubicin time course in the revised manuscript (Figure 3H) and discuss the implications for the proposed translocation route.

      Figure 3A, 3E, and, if the data is present, 3J and 3M could all benefit from using the nuclei staining as a mask to draw an outline around the nucleus in the other channels, and then showing a merge in full color instead of a nuclei-only channel. Also note from the major comments that this data especially is too small to see without enlarged images.

      Response: We thank the reviewer for this suggestion. Regarding nuclear outline masks, we tested this approach but found that the number of structures present in each field, including Golgi stacks, nuclear foci and cytoplasmic signal, made overlaid outlines visually confusing rather than clarifying. We have instead included a full-colour merged panel in Figure 3E, which we consider a cleaner way to distinguish nuclear from Golgi-localised signal while preserving the spatial context of the data.

      Regarding image size, we have added enlarged insets to Figures 3E, 3J and 3M in the revised manuscript. We have chosen to display multiple cells per panel rather than a single enlarged cell in order to capture the heterogeneity of the cell population, which we consider important for an accurate representation of the data. All figures have been provided as high-resolution image files to allow electronic magnification, enabling detailed inspection of the signal beyond what is visible in the printed version. We acknowledge that the constraints of standard journal figure dimensions limit how large individual panels can be, and the final layout will be optimised in line with the journal's formatting guidelines.

      *In-text discussion of the results from Figure 3 has an in-depth discussion of the NLS and NES in RAD51C, but this is not followed up on with site-directed mutagenesis or any data; perhaps move this to the discussion instead of results section. *

      Response: We have removed the discussion of the NLS and NES from the Results section.

      Figure 4

      Comments from earlier figures hold, with size of enlarged events and using the nuclei as an outline in the single channels. E.g. Figure 4F arrows appear to point to nothing at the chosen scale. The zoom in 4G is insufficient, as the chosen feature is so small it is not even visible in full fields.

      Response: We thank the reviewer for this comment. The arrows in Figure 4F indicate individual nocodazole-dispersed Golgi mini-stacks, which are displayed at higher magnification in Figure 4G. The full field in Figure 4F is intentionally shown to illustrate the degree of Golgi dispersion achieved by nocodazole treatment, a context that may be unfamiliar to readers outside the Golgi field, before zooming into a single representative mini-stack in Figure 4G for the cisternal localisation analysis.

      *Figure 4H and 4I need to show the size of the markers.*

      Response: The marker sizes are now included in the revised manuscript.

      *The representative image in 4L for siGiantin pATM has no pATM foci, while the quantification in 4M has a reduction from ~50% to ~25%, so this image is not representative of this data, or the data quantification is not as strong as the actual data. *

      Response: We thank the reviewer for this observation. We wish to clarify that the quantification in Figure 4M reports the mean percentage of RAD51C foci co-localising with pATM across the entire cell population from three independent biological replicates. A reduction from ~50% to ~25% therefore reflects a population-level shift in co-localisation frequency, not that every individual cell shows exactly 25% co-localisation. Given the inherent cell-to-cell variability in foci number and co-localisation, individual cells will span a range of values around this mean, and the representative image shown in Figure 4L reflects one such cell.

      Figure 5

      *Figure 5A has overexposure of the nuclei stain in order to visualize micronuclei. Readjust the levels, and enlarge the images for better visualization. (is this DAPI-stained? Please label). *

      Response: The display levels of the nuclear stain in Figure 5A are intentionally set to allow visualisation of micronuclei, which are significantly dimmer than the main nucleus and would not be detectable at display settings optimised for the primary nuclear signal. This is standard practice in micronuclei quantification studies and is necessary to accurately identify and score these structures. The nuclear stain is Hoechst 33342, and this has been explicitly labelled in the revised figure legend.

      *Figure 5A-C: Figure 5A does not show siRAD51, but it is included in the DMSO only graph. Please either show RAD51 data in 5A and 5C, or do not include in 5B. If the DMSO and ETO experiments were performed separately and that accounts for this discrepancy, then show separately. *

      Response: We thank the reviewer for this observation. The siRAD51C condition is included in Figure 5B as an internal positive control, consistent with its well-established role in genome stability. RAD51C depletion combined with etoposide treatment resulted in severe cellular toxicity and insufficient cell numbers for reliable quantification, and this condition was therefore excluded from Figure 5C. This has been clarified in the revised figure legend.

      *Figure 5M the white label is difficult to see in the green box. *

      Response: We have updated the label colour in Figure 5M to improve visibility against the green background in the revised manuscript.

      Supplementary Figures

      Consider reordering/ subdividing supplementary figures for ease of reference during reading.

      Response: We thank the reviewer for this suggestion. The current supplementary figure structure was intentionally designed to minimise the total number of supplementary figures and maintain a logical correspondence with the main figures, avoiding a situation where readers need to navigate an extensive supplementary section, a concern the reviewer raised regarding figure presentation. We believe the current organisation achieves a reasonable balance between completeness and accessibility.

      SF1 and SF2A: Include enlarged boxes or full images so that data is visible.

      __Response: __As described in our response to Major Comment 1, all figures have been provided as high-resolution image files to allow electronic magnification. Space constraints within standard journal figure dimensions preclude the addition of enlarged insets to all supplementary panels without substantially reducing the contextual field of view.

      *SF3A, SF4A, and SF5A: Include enlarged images, include nuclei marker if possible (otherwise, the nuclear intensity is not proven nuclear). *

      Response: We appreciate the suggestion, but adding enlarged insets and nuclei markers to all panels in Figures S3A, S4A and S5A would disproportionately increase the length and complexity of the supplementary section, making it harder rather than easier to navigate. The nuclear intensity measurements are derived from automated segmentation of the Hoechst channel using CellProfiler, which reliably defines nuclear boundaries independently of the antibody channel, and are therefore not dependent on visual confirmation of nuclear localisation in each representative image.

      *SF3B-C, SF4B-C, and SF5 B-D: Change the data presentation in the same method as changed for F2G-H. *

      Response: We have updated the figure legends for Figures S3B-C, S4B-C and S5B-D to improve readability.

      SF3D: List proteins in the same order as in B and C.

      Response: The proteins in Figure S3D are listed in the same order as in Figures S3B and S3C.

      SF6D: Label M N and C more clearly. Include size labels.

      Response: We have added clearer labels for the membrane (M), nuclear (N) and cytoplasmic (C) fractions and included molecular weight size markers in the revised Figure S6D.

      *SF7A-B: Include enlarged. *

      Response: We respectfully note that the purpose of Figures S7A-B is to display the overall cellular response to inhibitor treatments across the cell population, rather than to highlight specific subcellular structures. Enlarged insets would reduce the number of cells visible per panel and would not add scientific value in this context. The Golgi and nuclear signals are clearly visible at the chosen magnification.

      *SF8: Include arrows as in previous experiments, include enlarge. *

      Response: Arrows have been added to Figure S8 to indicate Golgi and nuclear RAD51C signal, consistent with the annotation style used in the main figures. The images already show two representative cells per condition to maximise the visible detail at the chosen scale.

      *SF9G: G is labelled, but not included. *

      Response: Figure S9G has been added in the revised manuscript, showing the pan-cancer overall survival map for GOLGB1 expression across all TCGA cohorts generated using GEPIA2. The figure legend has been updated accordingly.

      *Reviewer #3 (Significance (Required)): *

      * The work finds new roles for the Golgi in regulation of DNA damage responses and the screen could be an important dataset (but results need to be made available) for the DNA repair community. The scope of the initial screen of HPA antibodies and Golgi/Nuclear dual proteomes is impressive, and the overlap of DDR proteins is characterized for fifteen different proteins at a sub-compartmental level. The work provides important insights into RAD51C regulation, however, there are key mechanistic insights and control experiments missing from the studies involving RAD51C and Giantin, dampening its impact. The idea of an alternative cellular compartment for storage of DDR factors prior to damage is interesting, and suggests the spatial regulation of specific lesion responses are stored in specific sub-compartments of the Golgi, which could contribute to repair regulation.*

      References:

      Adamson B, Smogorzewska A, Sigoillot FD, King RW & Elledge SJ (2012) A genome-wide homologous recombination screen identifies the RNA-binding protein RBMX as a component of the DNA-damage response. Nat Cell Biol 14: 318-328

      Badie S, Liao C, Thanasoula M, Barber P, Hill MA & Tarsounas M (2009) RAD51C facilitates checkpoint signaling by promoting CHK2 phosphorylation. J Cell Biol 185: 587-600

      Bergen DJM, Stevenson NL, Skinner REH, Stephens DJ & Hammond CL (2017) The Golgi matrix protein giantin is required for normal cilia function in zebrafish. Biol Open 6: 1180-1189

      Berti M, Teloni F, Mijic S, Ursich S, Fuchs J, Palumbieri MD, Krietsch J, Schmid JA, Garcin EB, Gon S, et al (2020) Sequential role of RAD51 paralog complexes in replication fork remodeling and restart. Nat Commun 11: 3531

      Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B, Haas SA, Paro R, Perrimon N & Heidelberg Fly Array Consortium (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832-835

      Brinkmann K, Schell M, Hoppe T & Kashkar H (2015) Regulation of the DNA damage response by ubiquitin conjugation. Front Genet 6: 98

      Garvey CM, Spiller E, Lindsay D, Chiang C-T, Choi NC, Agus DB, Mallick P, Foo J & Mumenthaler SM (2016) A high-content image-based method for quantitatively studying context-dependent cell population dynamics. Sci Rep 6: 29752

      Ghannoum S, Fantini D, Zahoor M, Reiterer V, Phuyal S, Leoncio Netto W, Sørensen Ø, Iyer A, Sengupta D, Prasmickaite L, et al (2023) A combined experimental-computational approach uncovers a role for the Golgi matrix protein Giantin in breast cancer progression. PLoS Comput Biol 19: e1010995

      Greenhough LA, Liang C-C, Belan O, Kunzelmann S, Maslen S, Rodrigo-Brenni MC, Anand R, Skehel M, Boulton SJ & West SC (2023) Structure and function of the RAD51B-RAD51C-RAD51D-XRCC2 tumour suppressor. Nature 619: 650-657

      Halim VA, García-Santisteban I, Warmerdam DO, van den Broek B, Heck AJR, Mohammed S & Medema RH (2018) Doxorubicin-induced DNA damage causes extensive ubiquitination of ribosomal proteins associated with a decrease in protein translation. Mol Cell Proteomics 17: 2297-2308

      Koreishi M, Gniadek TJ, Yu S, Masuda J, Honjo Y & Satoh A (2013) The golgin tether giantin regulates the secretory pathway by controlling stack organization within Golgi apparatus. PLoS One 8: e59821

      Martin HL, Adams M, Higgins J, Bond J, Morrison EE, Bell SM, Warriner S, Nelson A & Tomlinson DC (2014) High-content, high-throughput screening for the identification of cytotoxic compounds based on cell morphology and cell proliferation markers. PLoS One 9: e88338

      Meindl A, Hellebrand H, Wiek C, Erven V, Wappenschmidt B, Niederacher D, Freund M, Lichtner P, Hartmann L, Schaal H, et al (2010) Germline mutations in breast and ovarian cancer pedigrees establish RAD51C as a human cancer susceptibility gene. Nat Genet 42: 410-414

      Mikheeva AM, Bogomolov MA, Gasca VA, Sementsov MV, Spirin PV, Prassolov VS & Lebedev TD (2024) Improving the power of drug toxicity measurements by quantitative nuclei imaging. Cell Death Discov 10: 181

      Ovejero S, Kumanski S, Soulet C, Azarli J, Pardo B, Santt O, Constantinou A, Pasero P & Moriel-Carretero M (2023) A sterol-PI(4)P exchanger modulates the Tel1/ATM axis of the DNA damage response. EMBO J 42: e112684

      Prakash R, Rawal Y, Sullivan MR, Grundy MK, Bret H, Mihalevic MJ, Rein HL, Baird JM, Darrah K, Zhang F, et al (2022) Homologous recombination-deficient mutation cluster in tumor suppressor RAD51C identified by comprehensive analysis of cancer variants. Proc Natl Acad Sci U S A 119: e2202727119

      Prakash R, Zhang Y, Feng W & Jasin M (2015) Homologous recombination and human health: the roles of BRCA1, BRCA2, and associated proteins. Cold Spring Harb Perspect Biol 7: a016600

      Presley JF, Cole NB, Schroer TA, Hirschberg K, Zaal KJM & Lippincott-Schwartz J (1997) ER-to-Golgi transport visualized in living cells. Nature 389: 81-85

      Rawal Y, Jia L, Meir A, Zhou S, Kaur H, Ruben EA, Kwon Y, Bernstein KA, Jasin M, Taylor AB, et al (2023) Structural insights into BCDX2 complex function in homologous recombination. Nature 619: 640-649

      Somyajit K, Saxena S, Babu S, Mishra A & Nagaraju G (2015) Mammalian RAD51 paralogs protect nascent DNA at stalled forks and mediate replication restart. Nucleic Acids Res 43: 9835-9855

      Soulet C, Catalan J & Moriel-Carretero M (2026) The DNA Damage Response kinase ATM restricts Golgi extension. bioRxiv

      Stadler C, Hjelmare M, Neumann B, Jonasson K, Pepperkok R, Uhlén M & Lundberg E (2012) Systematic validation of antibody binding and protein subcellular localization using siRNA and confocal microscopy. J Proteomics 75: 2236-2251

      Stevenson NL, Bergen DJM, Lu Y, Prada-Sanchez ME, Kadler KE, Hammond CL & Stephens DJ (2021) Correction: Giantin is required for intracellular N-terminal processing of type I procollagen. J Cell Biol 220

      Tang Z, Kang B, Li C, Chen T & Zhang Z (2019) GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res 47: W556-W560

      Thorn CF, Oshiro C, Marsh S, Hernandez-Boussard T, McLeod H, Klein TE & Altman RB (2011) Doxorubicin pathways: pharmacodynamics and adverse effects. Pharmacogenet Genomics 21: 440-446

      van der Zanden SY, Qiao X & Neefjes J (2021) New insights into the activities and toxicities of the old anticancer drug doxorubicin. FEBS J 288: 6095-6111

      Zhang Y-W, Otterness DM, Chiang GG, Xie W, Liu Y-C, Mercurio F & Abraham RT (2005) Genotoxic stress targets human Chk1 for degradation by the ubiquitin-proteasome pathway. Mol Cell 19: 607-618

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This study investigates the communication between the Golgi complex and the nucleus of the cell, which remains a largely unexplored field. The authors used publicly available siRNA and antibody data from the Human Protein Atlas as a basis for finding overlap between the proteomes of the two cellular compartments. In validating the data from the HPA, the study finds a novel cluster of DNA repair proteins present in the Golgi, which they validate and resolve to sub-compartmental localization. To do so they use immunofluorescence (IF) localization on cis- and trans-Golgi cisternae marked by GM130 and TGN46, respectively. The authors find that many of the fully validated proteins present in both the nucleus and Golgi redistribute between the Golgi and the nucleus dependent on the protein and the type of DNA lesion. They focused on RAD51C, a recombination factor. They show that RAD51C resides in both the cis- and trans- subsections prior to damage and responds to DNA damage in an ATM-dependent manner via release of a Golgi-based pool bound to Giantin, which is then imported into the nucleus via Importin-β. Knockdown experiments showed that Giantin regulates RAD51C spatially and temporally. The work reveals a dynamic interchange of proteins between the Golgi and nucleus that controls cell functions beyond the classic secretory, membrane trafficking, and PTM roles of the Golgi. The authors build on prior work on Golgi impacts on DDR, offering an alternative cellular compartment for storage of DDR factors prior to damage. Overall, the data is timely and relevant, as it finds new roles for the Golgi in DNA damage response (DDR) regulation. The data is largely convincing and well controlled. The IF data is presented in black and white single channels and merged in color, which allows good comparison of the different protein stains. 
The scope of the initial screen of HPA antibodies and Golgi/Nuclear dual proteomes is impressive, and the overlap of DDR proteins is characterized for fifteen different proteins at a sub-compartmental level. The focus on RAD51C as a member of the HR pathway was a strong choice, and the study presents interesting information on its regulation by Golgi complex members, as well as a feedback loop with pATM. The possibility of the Golgi storing specific DDR factors in specific compartments is well-supported and intriguing. There are a few major and minor points that should strengthen the paper and improve clarity prior to publication.

      Major Comments:

      1. Much of the strength of the IF data is lost in the choice of scale for presentation of the data. In almost all cases, enlarged sections should be shown of the areas currently indicated by arrows, in all channels. This is done well in Figure 3A, where an area of the Golgi is enlarged and the overlap of RAD51C in the GM130-marked Golgi is clearly visible in the merged channel, even when printed out. I would highly recommend including the white box and enlarged inset in all images and channels, while keeping the representative fields as is (e.g. if the image is 40mm, draw a 7mm box around representative cells/Golgi, and enlarge to 15mm in the bottom left). This change should be made to F1E, F2F, F3E, F3J, and F3M, as well as having enlarged insets in the corners in all supplementary data IF figures. Where possible, a fully enlarged image of the bounding box could also be included. Some of the IF data would be strengthened by using the nuclei stain to draw a masking outline to include in the black and white channels, to clearly delineate what is Golgi-localized and what is nuclear.
      2. There is a lack of consistency in the representative images shown by IF. For example, Figure 1 gives the impression of very little RAD51C in the nucleus but this is rightly shown to not be the case in Supp. Fig 2A. The same is true of the various images of LIG1. The authors should use representative data that better reflects the distribution of the proteins being studied and maintain consistency across images. If there is a lot of variation in staining patterns, the authors should show images and percentages corresponding to the variations especially for the key gene studied, RAD51C.
      3. The initial screening by siRNA-mediated knockdown pipeline that validated and confirmed dual Golgi and nuclear localization of 163 of the 329 dual-localization HPA proteins does not have any data included. This seems like a very large amount of data to gloss over and not include even as supplementary data. This should be included as source data, and discussion of the in-text information should be strengthened. The data included with the networking of these validated proteins is strong, but the process of elimination and validation has not been shown. In addition, the antibody information included in the supplementary data does not include dilution factors or blocking factors, which would be beneficial for future studies.
      4. The authors should expand upon the paragraph at lines 155-162 to include more discussion on Figure S2A and S2B. The expanse of this data is some of the strongest in the paper, and it should be further discussed in-text. Also, the rationale behind the choice of the specific proteins that are included in these analyses/figures is not always clear in-text, and more attention should be spent on the narrowing down of the analysis to the final proteins. This is also especially important as many of the DDR proteins chosen are not the most common DDR proteins. Also note in text that the Golgi marker GM130 (presumably) was used for the screening, which means that some proteins which are only localizing to the TGN46 trans Golgi might have been lost in the validation step (or, explain why this is not the case).
      5. The relationship between Giantin loss, increased cell proliferation, and elevated endogenous DNA damage as it relates to RAD51C remains insufficiently resolved and requires further clarification. Several of the proliferation assays used are not optimal for addressing changes in cell growth. For example, Figure 5O appears to quantify cell numbers by counting fields from IF images, which is an unconventional approach. This should be done by growth curves, luminescent viability or colony formation assays. In addition, this point will be greatly strengthened by performing rescue experiments for Giantin directly (instead of co-depletion as a means of rescue) and/or using a mutant of RAD51C that does not bind to Giantin. If these additional experiments are beyond the current scope, the conclusions should be softened in the discussion.
      6. It is unclear from the discussion and from presented data whether proteins are directly transported between the Golgi and the nucleus, or whether they go into the cytoplasm for a transient period, presumably when they could interact with Importin β. There is also some data where cytoplasm signal could be quantified to address this (Figure 3E-I).
      7. Statistical analysis on experiments with more than two samples needs to be performed with ANOVA and a follow-up post-hoc test, not with a two-tailed unpaired Student's t-test, which only compares the control and each individual sample. This type of analysis inflates the Type 1 error rates (false positives) in your datasets. For example, the two-tailed unpaired Student's t-test is appropriate in Figure 2F-H, but not in Figure 3 when the samples are timepoints. In this case, use a one-way ANOVA with Tukey's post-hoc test (if you want to show all comparisons) or Bonferroni/Sidak (if you only need to compare several samples).
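
As a rough illustration of the Type 1 error inflation raised in major comment 7, the sketch below computes the familywise error rate for repeated t-tests at α = 0.05 and the per-test thresholds that Bonferroni and Sidak corrections would impose. It assumes independent tests, which is an idealisation (comparisons that share a control are correlated), and the timepoint labels are hypothetical placeholders rather than the conditions used in the manuscript.

```python
from itertools import combinations


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


def sidak_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold giving a familywise rate of exactly alpha
    under the independence assumption."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


# Hypothetical timecourse: one control plus three treated timepoints.
groups = ["control", "t1", "t2", "t3"]
alpha = 0.05

n_vs_control = len(groups) - 1                     # each timepoint vs control
n_all_pairs = len(list(combinations(groups, 2)))   # every pairwise comparison

for label, m in [("vs control", n_vs_control), ("all pairs", n_all_pairs)]:
    print(
        f"{label}: {m} tests, "
        f"uncorrected familywise rate = {familywise_error_rate(alpha, m):.3f}, "
        f"Bonferroni alpha = {bonferroni_alpha(alpha, m):.4f}, "
        f"Sidak alpha = {sidak_alpha(alpha, m):.4f}"
    )
```

Under these assumptions, three uncorrected comparisons against a control already push the chance of at least one false positive to roughly 14%, and six pairwise comparisons to roughly 26%, which is the inflation an ANOVA with a post-hoc test is meant to control.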

      Minor Comments:

      General

      1. Throughout the text, the reference to many figures and supplementary figures in the same sentence, with little discussion of the data therein makes it hard to follow. In-text referencing is particularly confusing in the section "Dual-localising DDR proteins dynamically redistribute between the Golgi and nucleus in response to specific types of DNA injuries," where the reader is switching between multiple figures and supplementary figures.
      2. In figures that display technical replicates as individual data points, consider distinguishing each replicate by using different marker shapes (e.g., repeat 1 = upright triangle; repeat 2 = inverted triangle; repeat 3 = diamond). This would provide additional clarity regarding the consistency and repeatability of each technical repeat.
      3. Make sure all western blot data include the marker size (F3C and F5L have none; F4H/I show the size of the proteins, not the size of the markers).
      4. Be consistent with use of capitalization in figure legends and graph/figure labels.

      Figure 2

      1. In Figure 2A, please include in the figure itself that GM130 is the cis Golgi, and TGN46 is the trans Golgi (Figures should not be dependent on the text for full understanding).
      2. Why are LRIG2 and LRRIQ3 not included in the 2E cis vs trans Golgi data, when all other proteins from F1D are included? Include, or comment on in-text.
      3. Be sure to include scale bar data in each figure legend (F2A-E is currently missing it), and include updated scale bars in the enlarged images.
      4. In Figure 2F, make sure that the merged green channel is presented at the same intensity as it is in the single black and white channel, as the green looks very overexposed in several of the merged (CCAR1 DMSO merged is the most noticeable).
      5. In Figure 2G, include the grey label in the figure legend.
      6. In Figure 2G-H, the method of data presentation in the graphs coupled with the statistical analysis is confusing and should be expanded upon in the legend.

      Figure 3

      1. Figure E/F/G: Is there cytoplasmic quantification as well? Your rationale is that the Golgi RAD51C goes into the nucleus, but via the cytoplasm (due to Importin β import); do you see the cytoplasmic levels increase? Or is it too dilute to notice a difference? At least, this omission needs to be mentioned in-text.
      2. Figure H/I also include the quantification of the cytoplasmic fraction. It is mentioned in-text on line 272, but not quantified. This comes up as a big question: Do the proteins go directly between the Golgi and nucleus, or do they go through the cytoplasm?
      3. Figure 3A, 3E, and if the data is present for 3J and 3M, could all benefit from using the nuclei staining as a mask to draw an outline around the nucleus in the other channels, and then show a merge in full color instead of a nuclei-only channel. Also note from the major comments, that this data especially is so small to see without enlarged images.
      4. In-text discussion of the results from Figure 3 has an in-depth discussion of the NLS and NES in RAD51C, but this is not followed up on with site-directed mutagenesis or any data; perhaps move this to the discussion instead of results section.

      Figure 4

      1. Comments from earlier figures hold, with size of enlarged events and using the nuclei as an outline in the single channels. E.g. Figure 4F arrows appear to point to nothing at the chosen scale. The zoom in 4G is insufficient, as the chosen feature is so small it is not even visible in full fields.
      2. Figure 4H and 4I need to show the size of the markers
      3. The representative image in 4L for siGiantin pATM has no pATM foci, while the quantification in 4M has a reduction from ~50% to ~25%, so this image is not representative of this data, or the data quantification is not as strong as the actual data.

      Figure 5

      1. Figure 5A has overexposure of the nuclei stain in order to visualize micronuclei. Readjust the levels, and enlarge the images for better visualization. (is this DAPI-stained? Please label).
      2. Figure 5A-C: Figure 5A does not show siRAD51, but it is included in the DMSO only graph. Please either show RAD51 data in 5A and 5C, or do not include in 5B. If the DMSO and ETO experiments were performed separately and that accounts for this discrepancy, then show separately.
      3. Figure 5M the white label is difficult to see in the green box.

      Supplementary Figures

      1. Consider reordering/ subdividing supplementary figures for ease of reference during reading.
      2. SF1 and SF2A: Include enlarged boxes or full images so that data is visible.
      3. SF3A, SF4A, and SF5A: Include enlarged images, include nuclei marker if possible (otherwise, the nuclear intensity is not proven nuclear).
      4. SF3B-C, SF4B-C, and SF5 B-D: Change the data presentation in the same method as changed for F2G-H.
      5. SF3D: List proteins in the same order as in B and C.
      6. SF6D: Label M N and C more clearly. Include size labels.
      7. SF7A-B: Include enlarged.
      8. SF8: Include arrows as in previous experiments, include enlarge.
      9. SF9G: G is labelled, but not included.

      Significance

      The work finds new roles for the Golgi in regulation of DNA damage responses and the screen could be an important dataset (but results need to be made available) for the DNA repair community. The scope of the initial screen of HPA antibodies and Golgi/Nuclear dual proteomes is impressive, and the overlap of DDR proteins is characterized for fifteen different proteins at a sub-compartmental level. The work provides important insights into RAD51C regulation, however, there are key mechanistic insights and control experiments missing from the studies involving RAD51C and Giantin, dampening its impact. The idea of an alternative cellular compartment for storage of DDR factors prior to damage is interesting, and suggests the spatial regulation of specific lesion responses are stored in specific sub-compartments of the Golgi, which could contribute to repair regulation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Background - Eukaryotic cells rely on tightly regulated DNA repair pathways to preserve genome stability under the constant threat of both endogenous and exogenous genotoxic stress. While the nucleus, and to a lesser extent the mitochondria, is the primary site where DNA damage is detected and repaired, accumulating evidence indicates that extranuclear organelles, particularly the Golgi apparatus, play a surprisingly important role in modulating stress signaling, proteostasis, and the trafficking/activation of key DNA repair factors.

      Emerging evidence has shown that genotoxic stress can result in a major remodeling of the Golgi apparatus; however, the crosstalk between the Golgi and the nucleus, and its contribution to the DNA damage response, remains poorly defined. The present study offers timely insight by examining the spatiotemporal behavior of DNA repair proteins that shuttle between the Golgi and the nucleus, and how this trafficking contributes to the maintenance of genomic stability.

      Main findings - The authors employed the Human Protein Atlas (HPA) project to shortlist proteins that might link Golgi-nuclear function and validated each candidate using an siRNA-mediated antibody-validation pipeline, thereby identifying 163 proteins that localize to both the Golgi and the nucleus. Bioinformatic analysis of these candidates revealed a significant enrichment for DNA damage response (DDR) regulators, including multiple factors from core DNA repair pathways, suggesting that a portion of the DDR machinery may reside in the Golgi at steady state. Interestingly, the authors observed that dual-localizing DDR proteins undergo lesion-specific redistribution between the Golgi and the nucleus in response to specific types of DNA injuries. For instance, BER and MMEJ proteins shifted from nucleus to Golgi in response to doxorubicin, whereas MMR and HR proteins redistributed from Golgi to nucleus. This trend was reversed with H2O2 or KBrO3 treatments.

      To gain further insight into the link between the DDR and Golgi-nuclear communication, the authors focused on the HR factor RAD51C, which also plays a key role during the replicative stress response. The authors noticed that RAD51 is significantly associated with the Golgi, in addition to its known nuclear pool. Interestingly, they demonstrated that doxorubicin triggers the ATM-dependent release of this Golgi-tethered RAD51C pool and its Importin-β-mediated import into the nucleus, where it forms repair-associated foci. They further identified Giantin as the Golgi scaffold that anchors RAD51C at steady state in this subcellular compartment and showed that its depletion leads to premature nuclear accumulation of RAD51C, formation of aberrant RAD51C foci lacking canonical HR markers, reduced ATM activation, elevated genomic instability, and increased cell proliferation.

      Together, this study revealed an underappreciated and functionally meaningful spatiotemporal level of regulation within the DDR, suggesting that the Golgi, rather than functioning solely as a trafficking organelle, acts as a platform that anchors, releases, and temporally controls the availability of key DNA repair factors in response to genotoxic stress. In particular, the authors demonstrated that the timely and regulated release of RAD51C from the Golgi is essential for maintaining genome stability and is dependent on canonical DDR signaling pathways, including ATM activation and Importin-β-mediated nuclear import.

      Overall Critique - This manuscript offers a novel and compelling perspective on the regulation of the DDR by positioning the Golgi as an active participant in the spatiotemporal control of DNA repair factors. By integrating multiple experimental layers, including a systematic localization screening, a sub-Golgi mapping, several dynamic redistribution assays, and functional perturbation read-outs, the authors built a strong and coherent case for a biologically meaningful Golgi-nucleus communication axis during the DDR. Therefore, the study is timely and highly relevant for the DNA repair field, with broader implications for our understanding of how subcellular organelles coordinate genome maintenance and cellular homeostasis.

      While the manuscript is clearly written and the figures are coherent and supportive of the main findings of the study, several issues should be addressed to ensure full interpretability and reproducibility.

      Major Comments

      1. Limited use of agents causing genotoxic stress - The authors report intriguing lesion-specific shifts in Golgi-nuclear redistribution, yet much of the mechanistic work relies heavily on doxorubicin, a pleiotropic drug that induces diverse forms of DNA damage beyond DSBs. Expanding the core analysis of the study to include a broader panel of mechanistically defined genotoxins (e.g., etoposide, camptothecin, neocarzinostatin, or ionizing radiation) would substantially strengthen the conclusion that the trafficking patterns reflect damage-type specificity rather than drug-specific off-target effects. Such broader analysis would also clarify whether Golgi-nucleus communication responds differentially to replication-associated breaks, Topo II-dependent lesions, oxidative stress, or crosslinks.
      2. Functional implications of RAD51C redistribution for HR efficiency - Although the study convincingly demonstrates a release of RAD51C from the Golgi and its subsequent nuclear foci formation, it remains unclear how this redistribution influences HR efficiency. Incorporating a functional HR assay (e.g., DR-GFP reporter, RAD51 filament assembly, or fork protection assays) would help determine whether Golgi-anchored RAD51C release is directly required for HR or instead primarily modulates upstream DDR signaling.

      In addition, the manuscript does not fully reconcile how Golgi-tethering of RAD51C fits with its well-established nuclear roles during replication stress, where timely availability of RAD51C is essential for fork stabilization and restart.
      3. Specificity of Giantin-related phenotypes - The phenotypes observed upon Giantin depletion (e.g., increased micronuclei, comet tail moments, impaired ATM signaling, and elevated proliferation) could partially reflect a global dysfunction of the Golgi rather than RAD51C-specific tethering defects. Although co-depletion of RAD51C provides partial rescue, additional controls examining Golgi integrity, trafficking competence, or rescue with siRNA-resistant Giantin would help confirm specificity and distinguish direct from indirect effects.
      4. Positioning of ATM in the Golgi-nuclear signaling - While ATM inhibition prevents RAD51C release, the spatial and mechanistic basis of this regulation remains obscure. It is not clear whether ATM acts locally at the Golgi, through cytoplasmic pools, or indirectly via nuclear feedback signaling. Clarifying or discussing this point in more depth would improve the mechanistic coherence of the proposed model.
      5. RAD51C is examined in silo, without consideration for the BCDX2 complex - RAD51C is exclusively analyzed in isolation, despite its well-established function as part of the BCDX2 paralog complex (RAD51B-RAD51C-RAD51D-XRCC2). Because RAD51C does not normally operate as a standalone factor, it is unclear why only RAD51C, among all paralogs, would be subjected to Golgi tethering, ATM-dependent release, and Importin-β-driven nuclear import. This raises important mechanistic questions: Are other BCDX2 members also Golgi-associated? Do they undergo similar trafficking dynamics? Does Golgi tethering selectively regulate RAD51C, or does the complex translocate together?

      Addressing these points would greatly strengthen the biological plausibility and mechanistic coherence of the proposed model.

      Minor Comments

      1. Pathway-specific sub-Golgi localization patterns - The finding that DDR proteins map to distinct cis/trans Golgi subdomains is an interesting and potentially important observation. However, the dataset is limited to 15 proteins, making the proposed pathway-level trends (e.g., HR factors enriched in cis-Golgi; BER/MMEJ factors enriched in trans-Golgi) preliminary. Strengthening this conclusion by increasing the number of DDR proteins analyzed would help determine whether sub-Golgi compartmentalization contributes meaningfully to DNA repair pathway regulation.
      2. Is the Golgi-released RAD51C indeed the pool that enters the nucleus? The major assumption of the study is that the RAD51C population released from the Golgi upon DNA damage is the same pool that subsequently accumulates in the nucleus to form repair foci. While the imaging and fractionation data are consistent with this model, the study does not directly track or distinguish Golgi-derived RAD51C from cytoplasmic or pre-existing nuclear pools. Without a method to specifically label, pulse-chase, or track the Golgi-anchored fraction, it remains formally possible that nuclear RAD51C originates from other subcellular reservoirs.

      Significance

      General assessment - This study presents a novel and conceptually compelling view of the DNA damage response (DDR) by positioning the Golgi apparatus as an active regulator of the spatiotemporal availability of DNA repair factors. The strongest aspects of the work include its integration of systematic immunolocalization screening, sub-Golgi compartment mapping, dynamic redistribution assays, and functional perturbations to build a coherent model of Golgi-nucleus communication during genotoxic stress. The mechanistic focus on RAD51C provides a clear case study linking organelle-level regulation to genome stability.

      Advance - To my knowledge, this is the first comprehensive demonstration that the Golgi can serve as a spatiotemporal coordination node for DDR proteins, including those involved in HR. The identification of a substantial pool of RAD51C, and reportedly other DDR factors, anchored within specific Golgi subdomains represents a significant conceptual advance. The demonstration that Golgi-tethered RAD51C is released in an ATM-dependent manner and subsequently participates in nuclear foci formation suggests a previously unrecognized organelle-level regulatory checkpoint in genome maintenance. This work therefore extends current models of the DDR by revealing a layer of intracellular coordination that bridges classical nuclear pathways with cytoplasmic organelle function.

      Audience - This study will be of strong interest to a specialized audience in the fields of DNA repair, genome stability, and cell biology, particularly those studying the spatial organization of repair pathways and intracellular stress signaling. It will also appeal to researchers investigating organelle biology, intracellular trafficking, and the broader coordination of cytoplasmic and nuclear responses to stress. Beyond these communities, the work may be relevant to cancer, as it suggests new mechanisms by which organelle perturbations or Golgi-associated scaffolding proteins could influence therapeutic responses or genomic instability.

      Reviewer expertise - Field of expertise: DNA repair, genome stability, organelle biology, cancer cell biology.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This paper describes the localisation of DNA repair proteins, which carry out their DNA repair function in the nucleus, to the cytoplasmic Golgi apparatus. Using the Human Protein Atlas to identify candidates, the authors use antibody localisation to show that a significant number of DNA repair proteins also localise at the Golgi. It appears that proteins involved in common DNA repair pathways localise to common regions of the Golgi. The Golgi-nucleus distribution of the DNA repair proteins changes upon DNA damage, indicating a dynamic relationship. The authors focus on the DNA repair protein RAD51C and show that its loss from the Golgi and translocation to the nucleus upon DNA damage is mediated by the ATM kinase. Anchoring at the Golgi is shown to be mediated by the golgin giantin. A functional role for giantin in DNA repair is shown in knockdown studies, supporting a mechanism whereby Golgi anchoring of RAD51C, and possibly other DNA repair proteins, by giantin, is required to maintain proper control of DNA repair.

      The data are clear and support the authors' conclusions. The data are carefully quantified throughout. I found the text easy to read.

      Major points:

      1. To validate the Golgi localisation, KD using siRNA was used. It was deemed that a signal reduction of 25% was enough to indicate specific antibody labelling. This seems like a low number, and not very stringent. For some of the hits, expressing tagged versions of the proteins would greatly strengthen the Golgi assignment. This may not be possible for all, but for RAD51C would seem an important experiment.
      2. The total signal should be quantified for each DNA repair protein upon genotoxic stress, in addition to the Golgi to nucleus ratio. For many of the proteins it looks like the total signal goes down, which could influence interpretation.
      3. The study would benefit from live imaging of the Golgi to nucleus translocation of RAD51C. This would give a better indication of dynamics.
      4. The double depletion experiments suggest a functional relationship between giantin and RAD51C. But they do not formally show it. Experiments to more directly address the functional role of the interaction between these two proteins would strengthen the study.
      5. The Kaplan-Meier plots in Fig S9 seem to be quite selective in that only breast cancer is shown. Does giantin reduction correlate with poor prognosis in other cancers?

      Minor points: There are a few grammatical errors here and there. The figures do not appear in the correct order in the text, which makes the early parts of the paper a bit difficult to follow. Some of the figures don't seem to clearly match the text. For example, it is mentioned that RAD51C labelling was done with 3 different antibodies. I could not find this data.

      Significance

      This paper is novel and should be of significant interest to the field. It has important implications for how we think about the Golgi apparatus, and for how DNA repair pathways may be controlled. The pattern is clearly complex, with many DNA repair proteins localising to the Golgi, and some showing opposite dynamics. However, by focussing on RAD51C and giantin, the paper nicely demonstrates a novel mechanism for controlling DNA repair by these proteins.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-02954

      Corresponding author(s): Ana-Maria Lennon-Duménil and Sandra Iden

      1. General Statements [optional]

      We thank the three reviewers for the time and care taken to assess our manuscript, and for their constructive feedback that will help improve the study. We herewith provide a revised manuscript that addresses the key points raised by the reviewers.

      2. Point-by-point description of the revisions

      __Reviewer #1 (Evidence, reproducibility and clarity (Required)): __

      Summary: The manuscript by Delgado et al. reports the role of the actin remodeling Arp2/3 complex in the biology of Langerhans cells, which are specialized innate immune cells of the epidermis. The study is based on a conditional KO mouse model (CD11cCre;Arpc4fl/fl), in which the deletion of the Arp2/3 subunit ArpC4 is under the control of the myeloid cell specific CD11c promoter.

      In this model, the assembly of LC networks in the epidermis of ear and tail skin is preserved when examining animals immediately after birth (up to 1 week). Subsequently, however, LCs from ArpC4-deleted mice start displaying morphological aberrations (reduced elongation and number of branches at 4 weeks of age). Additionally, a profound decline in LC numbers is reported in the skin of both the ear and tail of young adult mice (8-10 weeks).

      To explore the cause of such decline, the authors then opt for the complementary in vitro study of bone marrow-derived DCs, given the lack of a model to study LCs in vitro. They report that ArpC4 deletion is associated with aberrantly shaped nuclei, decreased expression of the nucleoskeleton proteins Lamin A/C and B1, nuclear envelope ruptures and increased DNA damage as shown by γH2Ax staining. Importantly, they provide evidence that the defects evoked by ArpC4 deletion also occur in the LCs in situ (immunofluorescence of the skin in 4-week-old mice).

      Increased DNA damage is further documented by staining differentiating DCs from ArpC4-deleted mice with the 53BP1 marker. In parallel, nuclear levels of DNA repair kinase ATR and recruitment of RPA70 (which recruits ATR to replicative forks) are reduced in the ArpC4-deleted condition. In vitro treatment of DCs with the topoisomerase II inhibitor etoposide and the Arp2/3 inhibitor CK666 induce comparable DNA damage, as well as multilobulated nuclei and DNA bridges. The authors conclude that the ArpC4-KO phenotype might stem, at least in part, from a defective ability to repair DNA damage occurring during cell division.

      The study is enriched by an RNA-seq analysis that points to an increased expression of genes linked to IFN signaling, which the authors hypothetically relate to overt activation of innate nucleic acid sensing pathways.

      The study ends by an examination of myeloid cell populations in ArpC4-KO mice beyond LCs. Skin cDC2 and cDC2 subsets display skin emigration defects (like LCs), but not numerical defects in the skin (unlike LCs). Myeloid cell subsets of the colon are also present in normal numbers. In the lungs, interstitial and alveolar macrophages are reduced, but not lung DC subsets. Collectively, these observations suggest that ArpC4 is essential for the maintenance of myeloid cell subsets that rely on cell division to colonize or to self-maintain within their tissue of residency (including LCs).

      MAJOR COMMENTS

      1. ArpC4 and Arp2/3 expression. The authors argue that LCs from Arpc4KO mice should delete the Arpc4 gene in precursors that colonize the skin around birth. It would be important to show it to rule out the possibility that the lack of phenotype (initial seeding, initial proliferative burst) in young animals (first week) could be related to an incomplete deletion of ArpC4 expression. Also important would be to show what is happening to the Arp2/3 complex in LCs from Arpc4KO mice.

      __Response: __We thank this reviewer for the careful assessment of our manuscript. Regarding this specific comment, we would like to clarify that we do not expect ArpC4 to be deleted in LC precursors, as CD11c is only expressed once the cells have entered the epidermis. Instead, we expect the deletion to take place after birth around day 2-4 (Chorro et al., 2009). For this reason, we performed a deletion PCR of epidermal cells at postnatal day 7 (P7), a time at which the proliferative burst occurs. This analysis revealed CD11c-Cre-driven recombination in the ArpC4 locus (Fig. S2C). This experiment indicates that ArpC4 deletion does not alter LC proliferation and postnatal network formation.

      We apologize if this was not clear enough and have (1) revised the manuscript text to clearly explain the time at which ArpC4 will be deleted early during development when using the CD11c-Cre transgene, and (2) better emphasized the rationale for the deletion PCR (page 4).

      In the in vitro studies with DCs, the level of ArpC4 and Arp2/3 deletion at the protein level is also not documented.

      __Response: __We have previously analyzed the expression of ArpC4 in BMDCs in a recent study, confirming its loss in CD11c-Cre;ArpC4fl/fl cells at the protein level: Rivera et al. Immunity 2022; doi: 10.1016/j.immuni.2021.11.008. PMID: 34910930 (Fig. S2D). Therefore, in the current manuscript we only refer to that paper (Results, first paragraph).

      The authors explain that surface expression of the CD11c marker, which drives Arpc4 deletion, gradually increased during differentiation of DCs: from 50% to 90% of the cells. Does that mean that loss of ArpC4 expression is only effective in a fraction of the cells examined before day 10 of differentiation (e.g. in the RNA-seq analysis)?

      __Response: __The reviewer is correct: there is heterogeneity in CD11c expression, which is inherent to this DC culture model, implying that Arpc4 gene deletion will be partial. However, despite this, we were able to detect significant differences between the transcriptome of control and CD11c-Cre;ArpC4fl/fl DCs in early phases during differentiation, emphasizing that the phenotype of ArpC4 loss is robust.

      We have included a note on this heterogeneity in the revised manuscript text (page 5).

      Intra-nuclear versus extra-nuclear activities of Arp2/3

      The authors favor a model whereby intra-nuclear ArpC4 helps maintain nuclear integrity during proliferation of DCs (and possibly LCs). However, multiple pools of Arp2/3 have been described and accordingly, multiple mechanisms may account for the observed phenotype: i) cytoplasmic pool to drive the protrusions sustaining the assembly of the LC network and its connectivity with keratinocytes; ii) peri-nuclear pool to protect the nucleus; iii) intra-nuclear pool to facilitate DNA repair mechanisms e.g. by stabilizing replicative forks (the scenario favored by the authors).

      __Response: __The referee is correct, and this is discussed in our manuscript (page 11, upper paragraph): we cannot exclude that several pools of branched actin are influencing the phenotype we here describe.

      Unfortunately, we have previously tested several antibodies against ArpC4, but in our hands, and despite comprehensive optimization, they did not yield specific signals that would enable us to assess changes in subcellular localization in murine cells. Upon this reviewer's comment, we have now reassessed the available tools. We have tested an antibody against ArpC2 (Millipore, Anti-p34-Arc/ARPC2, 07-227-I-100UG), which however did not produce any specific signals either. Instead, we found an ArpC5 antibody that yielded a filamentous staining in the cytoplasm plus nuclear staining in distinct foci of control bone marrow-derived DCs, indicating that Arp2/3 components may in principle act in the nucleus in these cells (see revised Figure S3F,G).

      It is recommended that the authors try to gather more supportive data to sustain the intra-nuclear role. Documenting ArpC4 presence in the nucleus would help support the claim. It could be combined with treatments aiming at blocking proliferation in order to reinforce the possibility that a main function of ArpC4 is to protect proliferating cells by favoring DNA repair inside the nucleus.

      __Response: __We thank this reviewer for this very helpful comment. As outlined in the previous response, we have aimed at obtaining subcellular localization data for Arp2/3 complex components and, along with that, at studying a potential intranuclear localization. Beyond that, however, in comparison to commonly cultured cell types, we face two hurdles in addressing the nuclear Arp2/3 role in full: 1) Due to poor transduction rates and epigenetic silencing, we cannot sufficiently express exogenous constructs such as ArpC4-NLS in DCs to assess the subcellular localization of Arp2/3 complex components. 2) We have performed preliminary tests to block proliferation in DCs, using the cyclin-dependent kinase 1 (CDK1) inhibitor RO3306 at different concentrations and incubation times during DC differentiation. Unfortunately, most cells were found dead after treatment. Further lowering the inhibitor concentrations (below 3.5 µM) will likely not block the cell cycle, rendering this approach unsuited.

      As mentioned above, we have tested the suitability of additional antibodies directed against Arp2/3 complex components to assess their subcellular localization, with the aim to discriminate peripheral cytoplasmic vs. perinuclear vs. intranuclear localization. These new data that report nuclear and cytoplasmic ArpC5 in control DCs are now presented in revised figure S3F,G. In addition, we toned down our current phrasing in the discussion, also emphasizing the possibility that cytoplasmic or perinuclear pools of the complex may indirectly help maintain the integrity of the genome in LCs (page 12).

      Nuclear envelope ruptures

      The nuclear envelope ruptures are not sufficiently documented (how many cells were imaged? quantification?). The authors employ STED microscopy to examine Lamin B1 distribution. The image shown in Figure 4A does not really highlight the nuclear envelope, but rather the entire content. Whether it is representative is questionable. We would expect Lamin B1 staining intensity to be drastically reduced given the quantification shown in Figure 3D. In addition, although the authors have stressed in the previous figure that Arpc4-KO is associated with nucleus shape aberrations, the example shown in Figure 4A is that of a nucleus with a normal ovoid shape.

      It is recommended to quantify the ruptures with Lap2b antibodies (or another staining that would better delineate the envelope) in order to avoid the possible bias due to the reduced staining intensity of Lamin B1.

      __Response: __NE ruptures are quantified by imaging NLS-GFP-expressing DCs in microchannels to visualize leakage of their nuclear content (Fig. 4B,C). The STED image mentioned by the referee (Fig. 4A,D) was only shown to further illustrate examples of NE ruptures, here using Lamin B1 as an immunofluorescence marker for the NE. We do agree with the reviewer that it was not chosen optimally to represent the ArpC4KO phenotype regarding nuclear shape and Lamin B1.

      We have now provided representative examples of nuclear illustrations of the ArpC4KO phenotype vs. control cells. In addition, we performed STED microscopy of Lap2b immunostained DCs as suggested by the referee (revised Fig. 4A,B).

      A missing analysis is that of nuclear envelope ruptures as a function of nucleus deformations.

      __Response: __As stated in the manuscript (page 5, third paragraph), the morphology of DCs is quite heterogeneous. As mentioned above, nuclear rupture events were quantified by live-imaging of NLS-GFP expressing DCs, enabling the tracing of rupture events. Live imaging is the only robust manner to measure nuclear membrane rupture events as they are transient due to rapid membrane repair (Raab et al. Science 2016). The NLS-GFP label itself, however, is not accurate enough to also quantify nuclear deformations. The latter therefore was quantified after cell fixation, using DAPI and/or immunostaining for NE markers (Figures 3 and S3).

      As suggested by the referee, we have now quantified nuclear deformations using Lap2b staining of the nuclear envelope (revised Fig. 4A,B), demonstrating reduced circularity and increased elongation of ArpC4KO nuclei.

      Fig 4B-C: same frequency of Arpc4-KO and WT cells displaying nuclear envelope ruptures in the 4-µm channels; however, the images show a rupture for the Arpc4-KO and no rupture for the WT cells (this is somewhat misleading). Are ruptures similar in Arpc4-KO and WT cells in this condition?

      __Response: __We apologize for choosing an image that does not represent our quantification well; this was our mistake. The revised manuscript now contains an image that better reflects our quantification (revised Fig. 4C).

      Fig 4D-E: is there a direct link between nuclear envelope ruptures and γH2A.X?

      __Response: __At present, we can only correlate the findings of increased γH2Ax and elevated events of nuclear envelope ruptures in ArpC4KO DCs. Rescue experiments are very difficult to impossible in DCs (e.g. restoring Lamin A/C and B1 levels in the KOs and subsequently assessing the amount of DNA damage). While we are afraid that we cannot address a potential link between NE ruptures and DNA damage by experiments in a manner feasible within this manuscript's revision, we have discussed this interesting aspect based on observations in immortalized cell culture systems (page 10). However, we would like to note that this was indeed shown for different cell types in Nader et al. Cell 2021. This effect results from access of cytosolic nuclease Trex1 to nuclear DNA. We have added this point in our revised manuscript (page 11).

      Interesting (but optional) would be to understand what is happening to DNA, histones? Is there evidence for leakage in the cytoplasm?

      __Response: __This is an interesting question. To assess this, we have now performed immunostainings for double-stranded DNA in the cytoplasm, following published protocols (Spada et al., 2019; PMID 31727239). This analysis revealed significantly increased cytoplasmic dsDNA in ArpC4KO DCs (revised Fig. 4G,H), indeed suggesting leakage into the cytoplasm following ArpC4 loss.

      RNA seq analysis

      The RNA-seq analysis suffers from a lack of direct connection with the rest of the study. The extracted molecular information is not validated nor further explored. It remains very descriptive. The PCA analysis suggests a « more pronounced transcriptomic heterogeneity in differentiating Arpc4KO DCs ». However, it seems difficult to make such a claim from the comparison of 3 mice per group. In addition, such heterogeneity is not seen in the more detailed analysis (Fig 5F). The authors claim that « day 10 control and Arpc4KO DCs showed no to very little differences in gene expression, in contrast to cells at days 7-9 of differentiation ». This is not obvious from the data displayed in the corresponding figure. In addition, it is not expected that cells that may take a divergent differentiation path at days 7-9 would return to a similar transcriptional activity at day 10.

      A point that is not discussed is that before day 10 of DC differentiation, Arpc4 KO is expected to only occur in about 50% of the cell population. This is expected to impact the RNA-seq analysis.

      Not all clusters have been exploited (e.g. cluster 3 elevated, cluster 6 partly reduced). I suggest the authors reconsider their analysis of the RNA-seq data (or eventually invest in complementary analysis).

      __Response: __Despite a comprehensive analysis of the different transcriptomes of control and ArpC4 mutant cells during DC differentiation, we decided to focus the presentation and discussion of our RNAseq results on the most notable findings. Of these, the elevated innate immune responses in ArpC4KO DCs (Fig. 5E,H) caught our particular attention, as this seemed highly meaningful in light of DC and LC functions.

      As suggested by the referee, in the revised manuscript, we better connected the RNAseq data to the other cellular and molecular analyses shown, complementing these results by investigating the potential involvement of innate immune responses in the ArpC4KO phenotype (page 7).

      What causes the profound numerical drop of LC in the epidermis?

      A major open question is what causes the massive drop of LCs. Although differentiating Arpc4KO DCs start accumulating DNA damage upon proliferation, they succeed in progressing through the cell cycle. There is even a slightly elevated expression of cell cycle genes at day 7 of differentiation in the DC model.

      Only a trend for increased apoptosis is observed in ear and tail skin. It would be important to provide complementary data documenting increased death (or aberrant emigration?) of LCs in the 4-8 week time window.

      __Response: __We agree with the reviewer that this is an important question. We exclude that elevated emigration causes the decline of LCs in ArpC4KO epidermis, as ArpC4-mutant LCs are significantly reduced (and not increased) in number in skin-draining lymph nodes (Fig. 7E). To assess whether increased cell death contributed to LC loss, we have tried to identify LCs that are just about to die. As the reviewer noted, we could only observe a trend of apoptosis-positive LCs in mutant epidermis. We assume that this is because of a quick elimination of compromised LCs following DNA damage, with only a short time passing until LCs with impaired genome integrity will be cleared from the system, making it very difficult to detect γH2Ax-positive cells that are positive for markers of cell death.

      Despite these limitations to detect DNA-damage-positive but viable LCs in vivo, we have now collected 6-week-old mice to analyze LC numbers and apoptosis (cleaved Caspase-3), complementing our data derived from 7-day and 4-week-old mice (Figures S2A,B,E,F). While we did observe the expected trends for reduced LC numbers and increased DNA damage of ArpC4KO LCs as seen in adolescent mice, we were unable to detect a significant increase of apoptotic LCs in ArpC4KO animals at 6 weeks of age (revised Suppl. Fig. 4A-D). We assume that this is due to the outlined short-lived stages of apoptotic cells. Alternatively, it seems possible that ArpC4KO LCs were lost via cell death pathways other than apoptosis, a matter which we feel is beyond the scope of this manuscript. Accordingly, we revised our discussion to include this possibility (page 11-12).

      Functional consequences

      Although the study reports novel aspects of LC biology, the consequence of ArpC4 deletion for skin barrier function and immunosurveillance are not investigated. It would seem very relevant to test how this model copes with radiation, chemical and/or microorganism challenges.

      __Response: __We fully agree with this reviewer that this is a very interesting point. Therefore, next to assessing the steady-state circulation of LCs and DCs, we also addressed the consequence of ArpC4 loss for LC function in chemically challenged skin: we performed skin painting experiments using the contact sensitizer fluorescein isothiocyanate (FITC), diluted in the sensitizing agent dibutyl phthalate (DBP), to detect cutaneous-derived phagocytes within draining lymph nodes. These experiments revealed that migration of ArpC4KO LCs (as well as of ArpC4KO DCs) to skin-draining lymph nodes was impaired (Fig. 7C-E), confirming an in vivo role of ArpC4 for immune cell migration to lymphatic organs following a chemical challenge. The revised manuscript contains a more detailed note to properly explain the FITC painting experiment and highlight its importance (page 9).

      MINOR COMMENTS:

      1- Figure 1D

      Gating strategy: twice the same empty plots. The content seems to be missing... Does this need to be shown in the main figure?

      __Response: __We apologize for this problem that might be due to file conversion or PDF reader software. In our PDF versions (including the published bioRxiv preprint) we do see the data points; however, we have earlier experienced incomplete FACS plots during manuscript preparation.

      For the revised manuscript, we double-checked the results after converting the figures into PDFs. Here is a screenshot:

      2- Figure 2

      Best would be to keep same scale to compare P1 and P7 (tail skin, figure 2A)

      Response: We have replaced the examples with micrographs of comparable scale (revised Fig. 2A).

      Overlay of Ki67 and MHC-II does not allow to easily visualize the double-positive cells (Fig 2C)

      Response: We now provide single-channel images next to the merged view and improved the visualization of double-positive cells (revised Fig. 2C).

      Quality of Ki67 staining different for Arpc4-KO (less intense, less focused to the nuclei): a technical issue or could that reflect something?

      Response: We thank the reviewer for spotting this. We have re-assessed all Ki67 micrographs and noted that the originally chosen examples indeed were not fully representative. We have selected more representative examples of Ki67-positive cells in control and mutant tissues, reflecting no difference in the principal nature of Ki67 staining (revised Fig. 2C).

      Fig 2C: Panels mounted differently for ear and tail skin (different order to present the individual stainings, Dapi for tail skin only).

      Response: We agree and have harmonized the sequence of panels in figure 2 accordingly (revised Fig. 2C).

      3- LC branch analysis (Fig 1 and 2)

      While Fig 1 indicates that ear skin LCs form on average half as many branches as tail skin LCs (3-4 versus 8-9 branches per cell), Fig 2 shows the opposite (10-12 versus 6-7 branches per cell).

      Is this due to a very distinct pattern between the 2 considered ages (4 weeks versus 8-10 weeks)? Could the authors double-check that there is no methodological bias in their analysis?

      Response: We thank the reviewer for pointing out this apparent inconsistency. Indeed, our initial analysis suffered from a bias in detecting LC dendrites, as the tissue cellularity and overall morphology differ significantly between 4-week-old and adult animals: In adult animals, the immunostainings showed a higher baseline background signal for the skin epithelium compared to P28. We had noted this beforehand and had adjusted the imaging pipeline accordingly, with a more stringent thresholding to eliminate background signals in the case of adult tissues. While we were able to detect the described ArpC4 phenotype, this strategy resulted in a reduced ability to detect dendrites (both in control and mutant tissues), explaining the seemingly reduced number of dendrites in adult vs. 4-week-old tissues.

      We have double-checked both the micrographs and the corresponding quantifications and did not identify errors. Instead, our assumption (that an overly stringent background reduction in adults caused the discrepancy) turned out to be correct. We now performed detailed analyses of LC morphology at 4-week and adult stages by confocal microscopy, using a 63x objective rather than a 40x objective as done previously. The new results confirm that with this approach the number of LC dendrites across these ages is largely comparable, while the phenotypes of ArpC4 loss are retained. The revised manuscript now contains a completely new analysis based on image acquisition with a 63x objective (revised Fig. 1E-G).

      4- Fig 3 E-G

      How many animals were examined (n=5)? Reproducible across animals? Why was it done with 4-week animals (phenotype not complete? Event occurring before loss in numbers...)

      Response: As mentioned in the figure legend for Fig. 3F we have analysed N = 4 control and N = 5 KO mice. We chose the 4-week time-point as this was the stage when the loss of LCs first became apparent (even though non-significant at this age). We aimed to learn whether changes in nuclear morphology and nuclear envelope markers represented early molecular and cellular events following ArpC4 loss. Compared to later stages, this strategy poses a reduced risk of detecting indirect effects of ArpC4 loss. We added a note in the revised manuscript text to clarify this (page 5).

      Staining Lamin A/C globally more intense in the Arpc4-KO epidermis (also seems to apply to the masks corresponding to the LCs). Surprising to see that the quantification indicates a major drop of Lamin A/C intensity in the LCs.

      Response: We again thank the reviewer for this careful assessment. As with many tissue stainings, there is inter-sample variability. We have now revisited the micrographs and did not find a significant global reduction of Lamin A/C in the entire epidermis (including keratinocytes/KCs). The drop of Lamin A/C intensity is restricted to ArpC4KO LCs (and not KCs) and is in line with the reduced Lamin A/C expression data in DCs (Fig. 3C,D). The revised manuscript now shows more representative examples (revised Fig. 3E).

      Legend Fig 4D replace confocal microscopy by STED microscopy

      Response: We replaced "confocal microscopy" by "STED microscopy".

      6- Figure 4F

      Intensity/background of γH2Ax staining very distinct between the 2 micrographs shown for WT and Arpc4-KO epidermis.

      Response: We revisited the micrographs and now selected more representative examples (revised Fig. 4I).

      7- Figure 7C, F, H

      Gating strategies: would be better to harmonize the style of the plots (dot plots and 2 types of contour plots have been used...)

      Response: We agree and provided a harmonized plot illustration in the revised manuscript (revised Fig. 7).

      8- Figure 7H

      Legend of lower gating strategy seems to be wrong (KO and not WT).

      Response: We thank the reviewer for pointing out this mistake. The revised Figure 7H shows a corrected figure display.

      Reviewer #1 (Significance (Required)):

      Strengths: the general quality of the manuscript is high. It is very clearly written and it contains a very detailed method section that would allow reproducing the reported experiments. This work entails a clear novelty in that it represents the first investigation of the role of ArpC4 in LCs. It opens an interesting perspective about specific mechanisms sustaining the maintenance of myeloid cell subsets in peripheral tissues. This work is therefore expected to be of interest for a large audience of cellular immunologists and beyond. Challenging skin function with an external trigger would lift the relevance for an even wider audience (see main point 6).

      Response: See main point 6.

      Limitations: in its current version the manuscript suffers from a lack of solidity around a few analyses (see main points on ArpC4 and Arp2/3 protein expression, nuclear envelope rupture analysis,...). It also tends to formulate a narrative centered on the ArpC4 intra-nuclear function that is not definitively proven.

      The field of expertise of this reviewer is: cellular immunology and actin remodeling.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      SUMMARY This is a study in experimental mice employing both in vitro and, importantly, in vivo approaches. EPIDERMAL LANGERHANS CELLS serve as a paradigm for the maintenance of homeostasis of myeloid cells in a tissue, epidermis in this case. In addition to well known functions of the ACTIN NETWORK in cell migration, chemotaxis, cell adherence and phagocytosis the authors reveal a critical function of actin networks in the survival of cells in their home tissue.

      Actin-related proteins (Arp), specifically here the Arp2/3 complex, are necessary to form the filamentous actin networks. The authors use conditional knock-out mice where Arpc4 (an essential component of the Arp2/3 complex) is deleted under the control of CD11c, the most prominent dendritic cell marker which is also expressed on Langerhans cells. In normal mice, epidermal Langerhans cells reside in the epidermis virtually life-long. They initially settle the epidermis around and a few days after birth and establish a dense network by a burst of proliferation and then they "linger on" by low level maintenance proliferation. In the epidermis of Arpc4 knock-out mice Langerhans cells also start off with this proliferative burst but, strikingly, they do not stay but are massively reduced by the age of 8-12 weeks.

      The analyses of this decline revealed that

      -- the shape (number of nuclear lobes) and integrity of cell nuclei was compromised; they were fragile and ruptured to some degree when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      -- DNA damage, as detected by staining for gamma-H2Ax or 53BP1 accumulated when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      -- recruitment of DNA repair molecules was inhibited when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      -- gene signatures of interferon signaling and response were increased when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      -- in vivo migration of dendritic cells and Langerhans cells from the skin to the draining lymph nodes in an inflammatory setting (FITC painting of the skin) was impaired when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      -- the persistence of the typical dense network of Langerhans cells in the epidermis, created by proliferation shortly after birth, is abrogated when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing. Importantly, this was not the case for myeloid cell populations that settle a tissue without needing that initial burst of proliferation. For instance, numbers of colonic macrophages were not affected when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing.

      Thus, the authors conclude that the Arp2/3 complex is essential by its formation of actin networks to maintain the integrity of nuclei and ensure DNA repair thereby ascertaining the maintenance proliferation of Langerhans cells and, as the consequence, the persistence of the dense epidermal network of Langerhans cells.

      Up-to-date methodology from the fields of cell biology and cellular immunology (cell isolation from tissues, immunofluorescence, multiparameter flow cytometry, FISH, "good old" - but important - transmission electron microscopy, etc.) was used at high quality (e.g., immunofluorescence pictures!). Quantitative and qualitative analytical methods were timely and appropriate (e.g., Voronoi diagrams, cell shape profiling tools, Cre-lox gene-deletion technology, etc.). Importantly, the authors used a clever method, that they had developed several years ago, namely the analysis of dendritic cell migration in microchannels of defined widths. Molecular biology methods such as RNAseq were also employed and analysed by appropriate bioinformatic tools.

      MAJOR COMMENTS:

      • ARE THE KEY CONCLUSIONS CONVINCING? Yes, they are.

      • SHOULD THE AUTHORS QUALIFY SOME OF THEIR CLAIMS AS PRELIMINARY OR SPECULATIVE, OR REMOVE THEM ALTOGETHER? No, I think it is ok as it stands. The authors are wording their claims and conclusions not apodictically but cautiously, as it should be. They point out explicitly which lines of investigations they did not follow up here.

      • WOULD ADDITIONAL EXPERIMENTS BE ESSENTIAL TO SUPPORT THE CLAIMS OF THE PAPER? REQUEST ADDITIONAL EXPERIMENTS ONLY WHERE NECESSARY FOR THE PAPER AS IT IS, AND DO NOT ASK AUTHORS TO OPEN NEW LINES OF EXPERIMENTATION. I think that the experimental evidence presented here suffices to support the conclusions drawn. No additional experiments are necessary.

      • ARE THE SUGGESTED EXPERIMENTS REALISTIC IN TERMS OF TIME AND RESOURCES? IT WOULD HELP IF YOU COULD ADD AN ESTIMATED COST AND TIME INVESTMENT FOR SUBSTANTIAL EXPERIMENTS. Not applicable.

      • ARE THE DATA AND THE METHODS PRESENTED IN SUCH A WAY THAT THEY CAN BE REPRODUCED? Yes, they are.

      • ARE THE EXPERIMENTS ADEQUATELY REPLICATED AND STATISTICAL ANALYSIS ADEQUATE? Yes.

      Response: We thank the reviewer very much for assessing our work, for providing constructive suggestions, and for acknowledging the strength of the study.

      MINOR COMMENTS:

      • SPECIFIC EXPERIMENTAL ISSUES THAT ARE EASILY ADDRESSABLE. None

      • ARE PRIOR STUDIES REFERENCED APPROPRIATELY? Essentially yes. Regarding the reduction / loss of the adult epidermal Langerhans cell network, it may be of some interest to also refer to / discuss another one of the few examples of this phenomenon. There, the initial burst of proliferation is followed by reduced proliferation and increased apoptosis when a critical member of the mTOR signaling cascade is conditionally knocked out (Blood 123:217, 2014).

      Response: We thank the reviewer for pointing out this important work. We have now included the paper in the revised manuscript (page 12).

      • ARE THE TEXT AND FIGURES CLEAR AND ACCURATE? Yes they are. Figures are well arranged for easy comprehension.

      • DO YOU HAVE SUGGESTIONS THAT WOULD HELP THE AUTHORS IMPROVE THE PRESENTATION OF THEIR DATA AND CONCLUSIONS?

      1. Materials & Methods. The authors write, regarding flow cytometry of epidermal cells: "Briefly, 1 cm² of back skin from 8-14 weeks old female wild-type and knockout littermates was dissociated in 0.25 mg/mL Liberase (Sigma, cat. #5401020001) and 0.5 mg/mL DNase (Sigma, cat.#10104159001) in 1 mL of RPMI (Sigma) and mechanically disaggregated in Eppendorf tubes, FOLLOWED BY INCUBATED for 2 h at 37 °C." Followed by what?

      Response: We apologize for this mistake. The text should read: "... followed by incubation for 2 h at 37 °C and filtration using a 100 µm cell strainer. Thereafter, blocking was performed in PBS supplemented with 0.5% bovine serum albumin and 2 mM EDTA at 4 °C, followed by antibody labeling of cells in single cell suspension". The text has been corrected in the revised manuscript (page 17).

      Materials & Methods. BMDC electron microscopy. What is "IF"? Please specify.

      Response: We also regret this mistake in the method text. It should read: "... For electron microscopy analysis, after PDMS removal, cells were fixed using 2.5% glutaraldehyde ...". The text has been corrected in the revised manuscript (page 21).

      RESULTS in gene expression analyses. The authors observe some increase in apoptosis (as detected by cleaved-Caspase-3 staining). Is this observation in immunofluorescence also evident in the RNAseq data (where the IFN changes were seen), i.e., in Figure 5?

      Response: We have checked our RNAseq data for any changes in apoptosis-related genes. However, apart from a few transcripts that are upregulated in ArpC4KO cells, we do not find a pronounced enrichment of apoptosis-related genes. We included volcano plot data in revised Suppl. Fig. 5H to share these DEGs.

      Figure 7 F and G. Perhaps the authors may want to swap upper and lower panels in F or G, so that macrophage FACS plots and bar graphs are in the same row and, obviously, DC plots and bars likewise.

      Response: We agree and have harmonized the panel sequence in the revised manuscript (revised Fig. 7F, G; panels swapped in G, display harmonized).

      Figure 7H. "Gating strategy in ArpC4WT Lung (previously gated in Live CD45+ cells)" - The lower row is knock-out, not WT. This is indicated correctly in the legend, but in the figure both rows are labeled as WT.

      Response: Indeed, the legend information is correct, but the corresponding figure panel is incorrect. We now provide a corrected version (revised Fig. 7H).

      The reference by Park et al. 2021 is missing in the list.

      Response: We have added the reference to the revised bibliography.

      Figure 1D. Sure, the bar graphs are meant to say "CD11c"? The FACS plots show "CD11b".

      Response: We have checked the panels and corrected where necessary (revised Fig. 1D).

      As to cDC1. In Figure 1D the FACS plot shows an absence of CD103+ cDC1 cells. In contrast, in Figure 7A (left panel), there is no difference in cDC1 cells between WT and KO mice. Is the flow cytometry plot in Figure 1D therefore not representative regarding cDC1 cells? Correct?

      Response: The reviewer is correct about this apparent discrepancy. We have not observed differences in the control vs. ArpC4KO cDC1 population, hence Figure 7 represents our findings. For Figure 1, we mistakenly chose a non-representative plot, with the aim of illustrating the gating strategy. We apologize for this mistake and now provide a corrected and representative FACS plot figure in the revised manuscript (revised Fig. 1D).

      Reviewer #2 (Significance (Required)):

      • DESCRIBE THE NATURE AND SIGNIFICANCE OF THE ADVANCE (E.G. CONCEPTUAL, TECHNICAL, CLINICAL) FOR THE FIELD. This is a conceptual advance. It adds a big step to our understanding of how immune cells in tissues (which all come from the bone marrow or are seeded before birth from embryonal hematopoietic organs such as yolk sac and fetal liver) can remain resident in these tissues. For cell types such as Langerhans cells, which establish their final population density within their tissues of residence, the presented findings convincingly buttress the role of proliferation and thereby the role for the actin-related protein complex 2/3 (Arp2/3).

      • PLACE THE WORK IN THE CONTEXT OF THE EXISTING LITERATURE (PROVIDE REFERENCES, WHERE APPROPRIATE). While we know much about actin-related proteins (Arp), as correctly cited by the authors, this knowledge is derived mostly from in vitro studies. The submitted study translates the findings to an in vivo setting for the first time.

      • STATE WHAT AUDIENCE MIGHT BE INTERESTED IN AND INFLUENCED BY THE REPORTED FINDINGS. Skin immunologists foremost, but these findings are of interest to the entire community of immunologists, but also cell biologists.

      • DEFINE YOUR FIELD OF EXPERTISE. My expertise is in skin immunology, in particular skin dendritic cells including Langerhans cells.

      We thank the referee for their positive assessment of our manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The manuscript identifies a role of the Arp2/3 complex, the major regulator of actin branching in cells, for controlling the homeostasis of murine Langerhans cells (LCs), a specialized subset of dendritic cells in the skin epidermis. The findings of the study are based on the analysis of CD11c-Cre Arpc4-flox mice, a conditional knockout mouse model, which interferes with Arp2/3 function in Langerhans cells and other CD11c-expressing myeloid cells, e.g. dendritic cell or macrophage subsets. By using immunofluorescence and flow cytometry analysis of epidermis and skin tissues, the authors provide a detailed analysis of LC numbers at different developmental stages (postnatal day 1, 7, 28, and adult mice) and demonstrate that Arpc4-deficiency does not interfere with the establishment of LC networks until postnatal day 28. However, LCs in ear and tail skin are substantially reduced in Arpc4-deficient mice at 8-12 weeks of age. In parallel to their in vivo model, the authors analyze cultures of bone marrow-derived dendritic cells (BMDCs) from control and CD11c-Cre Arpc4-flox mice. Arpc4-deficiency in BMDCs, which develop over 8-10 days in culture, results in nuclear shape and lamina abnormalities, as well as signs of increased DNA damage. Aspects of this phenotype are also detected in Langerhans cells in epidermal preparations. Transcriptomic analysis of BMDCs highlights a gene signature of increased expression of the interferon response pathway and alterations in cell cycle regulation. Arpc4-deficient BMDCs show increased expression of DNA damage markers and reduced expression of certain DNA repair factors. Based on these correlative findings from the BMDC model, the authors conclude that the decline in LC numbers might develop from the accumulation of DNA damage over time, which the authors phrase "pre-mature aging of Langerhans cells".
      Lastly, the authors show a heterogeneous picture of how Arp2/3 depletion affects distinct DC populations in CD11c-Cre Arpc4-flox mice. While some tissue-resident DC subsets appear normal in numbers, others decline in numbers in the tissue. This may be related to their proliferation potential in tissues.

      Major comments:

      • Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      1) The authors claim that Arpc4 deficiency selectively compromises myeloid cell populations that rely on proliferation for tissue colonization (Figure 7). The presented data might give hints for such a general hypothesis, but solid experimental proof is lacking. When comparing myeloid cell subsets from four different organs, the authors refer to published data that some dendritic cell subsets are more proliferative in tissues than others and that CD11cCre Arpc4-flox mice appear to have reduced cell numbers in these populations. However, the presented data are purely correlative and no functional connection between cell proliferation and the phenotypes has been made. While some dendritic cell subsets (Langerhans cells, alveolar DCs) show reduced cell numbers in CD11cCre Arpc4-flox mice, other myeloid cell subsets are unaffected (e.g. dermal cDC1 and 2, colon macrophages). There could be plenty of other reasons that might underlie the observed discrepancies between these cell subsets; Arp2/3 knockout efficiency and myeloid cell turnover in the tissue are just two examples, which have not been taken into consideration. A direct link between measured cell proliferation (e.g. by BrdU labeling) and the observed phenotype would be needed to make such claims. Alternatively, the data could be removed. Experimentally addressing these points could take 3-6 months.

      Response and revisions: We thank the referee for raising this point. We agree that these results give hints that support our conclusion but do not address this question directly. However, we would like to emphasize that our conclusion is based on studies from others showing that alveolar macrophages maintain themselves through proliferation (Bain et al. Mucosal Immunology 2022). In contrast, it has been reported that a large fraction of colonic macrophages are derived from monocytes that are recruited to the gut throughout life (Bain et al. Mucosal Immunity 2023). We have now added these points to our revised manuscript. Moreover, during revision we confirmed deletion of the ArpC4 allele by genotyping PCR of FACS-sorted colon macrophages (revised Suppl. Fig. 7C and revised methods). In addition, we stress that we do not exclude that different intracellular Arpc4-dependent processes might contribute to the phenotypes observed (beyond maintenance of DNA integrity) (page 11). This tempers our conclusions and leaves open the potential implication of alternative mechanisms.

      2) The authors claim that DC subsets (e.g. dermal cDCs), which develop from pre-DCs, are not affected by Arp2/3 depletion (Figure 7, although the FACS plot in Fig. 1D would suggest a different picture for cDC1). This is surprising in light of the data with bone marrow-derived DCs (BMDCs), the major in vitro model of this study, which develop from CDPs that again develop from pre-DCs. BMDCs did show aberrant nuclei and signs of DNA damage. How would the authors then explain the discrepancies of the BMDC model with DC subsets, where the authors feel that the pre-DC origin explains the phenotypic difference? This is a general concern of the data interpretation and conclusions.

      Response: We thank the referee for raising this point, which indeed requires clarification. Two non-exclusive hypotheses could explain this apparent discrepancy:

      • The ontogeny of bone-marrow-derived DCs: Depending on the protocol used, there might be variations in the precursors DCs develop from. We use one of the first protocols, which was pioneered by the Paola Ricciardi-Castagnoli lab (Winzler et al. Exp.Med. 1997). It relies on a supernatant from J558 cells transfected with GMCSF, which contains additional cytokines and mainly generates DC2-like DCs. Langerhans cells are closer to DC2s, which resemble macrophages more than DC1s do. We thus chose this protocol rather than the protocols that use Flt3-L, which produce both DC1s and DC2s developed from common dendritic-cell precursors (CDPs). It is thus possible that our BM-derived DCs develop from other precursor cells closer to monocyte precursors.
      • As shown in Figure 5C, the kinetics of acquisition of CD11c expression, and thus deletion of the Arpc4 gene, might be distinct in vivo and in vitro. In vivo, as stated in our manuscript, DCs acquire CD11c as preDCs and undergo few rounds of division afterwards. In vitro, as shown by our cycling experiments, BM-derived DCs continuously cycle, so they will keep dividing after having acquired CD11c (around day 7) and deleted the Arpc4 gene. We now mention these hypotheses in the discussion of our revised manuscript to explain the apparent contradiction raised by the referee (pages 10 and 12).

      3) In line with point 2, the authors never show that BMDCs show reduced proliferation, reduced cell numbers or increased cell death in Arpc4-deficient cell cultures, as a consequence of the detected DNA damage and impaired DNA repair. In fact, Figure 5C even shows that cell growth rates between control and KO are equal. This is a major mismatch in the current study. Since the authors use the BMDC model to explain the declining cell numbers in Langerhans cells (which derive from fetal liver cells), this phenotype is not mirrored by the BMDC culture and it remains open whether the observed changes in nuclear DNA damage and repair are indeed directly linked to the observed phenotype of declining cell numbers in the tissue. These aspects require argumentation why cell growth is unchanged in KO cells. Additional experiments addressing these points with sufficient biological replicates (cultures from different mice) could take 2-3 months, including preparation time.

      Response: We thank the referee for raising this point, which was probably not properly discussed in the first version of our manuscript. Indeed, Arpc4KO BM-derived DCs do not show the premature cell death phenotype observed in LCs in vivo, as stated by the referee. There are at least two putative non-exclusive explanations for this. First, unlike LCs, which are long-lived cells, BM-derived DCs can be kept in culture for only 10-12 days. As DNA damage-induced cell death takes time (LCs only start to die about 3-4 weeks after network establishment), the lifespan of BM-DCs could simply not be long enough to observe this phenotype. Second, in the epidermis, LCs are physically constrained and continuously exposed to diverse signals that might increase their sensitivity to DNA damage and thereby induction of subsequent cell death.

      We have attempted to clarify this point in our revised manuscript by providing putative explanations for the death phenotype of Arpc4-deficient LCs not being observed in BM-derived DCs. We further explained that this does not invalidate this cellular model as it was used to raise hypotheses on the putative role played by ArpC4 in myeloid cells, i.e. maintenance of DNA integrity, which was then confirmed in vivo (ArpC4KO LCs do indeed display DNA damage in the epidermis) (page 12). Without this "imperfect cellular model", we would probably not have been able to uncover this novel function of Arp2/3 in immune cells.

      4) The authors refer to a "pre-mature aging" phenotype of Arpc4-deficient BMDCs and LCs, based on reductions in Lamin B, Lamin A and increases in gH2AX and 53BP1. I find this term an overstatement of the current data and suggest that other markers for cell senescence, such as p53, Rb, p21 and b-Galactosidase, are then also used to make such a strong claim on "aging" and cell senescence. Experimentally addressing this point with sufficient biological replicates could take 2-3 months, including preparation time.

      Response: We have now assessed senescence signatures in our RNAseq analysis of Arpc4WT and Arpc4KO-derived DCs, as suggested by the referee. These results revealed several senescence-related DEGs upregulated in ArpC4KO DCs, such as serpinB2 (revised Suppl. Fig. 5G, volcano plots) as well as a general enrichment of a senescence-related signature when using the senescence gene set (Aging Atlas Consortium, 2021; revised Fig. 5I). These data support our notion of a premature aging phenotype following ArpC4 loss in BMDCs.

      5) The study does not provide a mechanism: how the Arp2/3 complex would mediate the observed effects on DNA damage and repair has not been addressed in the cell model, and only potential scenarios from other non-myeloid cell lines are discussed. It remains unclear whether the observed phenotypes in Arpc4-depleted myeloid cells relate to the direct nuclear function of Arp2/3 or the cytosolic function of Arp2/3, including its roles in cytoskeletal regulation that may have secondary effects on the nuclear alterations. This is a general concern of the presented data; data on mechanism might require more than 6 months.

      Response: The referee is correct: Our manuscript shows that Arp2/3 deficiency in specific myeloid cells impacts on their survival in vivo and proposes that this could result at least in part from impaired maintenance of DNA integrity in these cells. We do not know whether this also applies to non-myeloid cells, which, although very interesting, is beyond the scope of the present study. In addition, we do not have any experimental tool to distinguish whether the DNA damage phenotype of Arpc4KO cells involves the nuclear or cortical pool of F-actin, which is why we have left this question open in the discussion of our manuscript.

      6) OPTIONAL: The authors make a strong case arguing that the increased interferon expression signature (based on the transcriptomics data) reflects the nuclear ruptures in Arpc4-deficient cells and adds to the observed phenotype. If this is so, what happens then in STING knockout cells in the presence of CK666 inhibitor?

      Response: During revision, we tested the putative role of STING in the ArpC4-KO phenotype. We found that abrogation of STING function in ArpC4KO mice did not rescue LC survival, excluding the possibility that aberrant STING activation triggers LC loss in these animals (revised Fig. S5E,F). Therefore, we tempered our conclusion (page 7).

      • Are the data and the methods presented in such a way that they can be reproduced?

      1) The analyses include quite a number of intensity calculations of immunofluorescence signals (Fig. 3D, E; Fig. 4E, Fig. 5B and 6B). The background stainings are often variable or very high. In some cases it is even unclear whether stainings are really detecting protein and go beyond background staining (Fig. 6A, Fig. 5F). How were immunofluorescence data acquired, and how were different background staining intensities dealt with?

      Response: We extended our description of the microscopes used for image acquisition as well as the downstream analyses for each experiment, which indeed vary depending on the signals observed with distinct antibodies or constructs.

      2) It remained unclear to me on which basis the nuclear deformations in Fig. 3G, H were calculated?

      Response: We also extended the description of the methods used to quantify nuclear deformations.

      3) The detailed phenotype of control mice is a bit unclear. It appears as if these were Cre-negative animals. Did the authors have some proof-of-principle experiments showing that CD11cCre Arpc4 +/+ animals have comparable phenotypes to Cre-negative animals?

      Response: We have never observed any decline in LC numbers in other mouse lines/genotypes (for example in cPLA2flox/flox;CD11c-Cre mice shown in the manuscript, Fig. S6B), excluding a putative role for the Cre in LC death. To nevertheless thoroughly check this aspect, we now quantified gH2Ax immunostaining of LCs of both Cre-positive and Cre-negative animals. These analyses revealed no Cre-mediated effect on DNA damage in LCs (revised Suppl. Fig. 4E,F).

      • Are the experiments adequately replicated and statistical analysis adequate?

      For most experiments, the number of biological replicates (mice, or BMDC cultures from different mice) and individual values (n, cells) are indicated. Statistical analysis appears adequate.

      Minor comments:

      • Prior published studies on Arp2/3 function in immune cells are referenced accordingly. A number of additional pre-print manuscripts on this topic have not been cited and could be considered for referencing.

      Response: We have now cited additional, relevant preprints and peer-reviewed work (page 12).

      • The text is very clearly and very well written. Figures are clear and accurate for most cases. There are some open questions:

      • Fig. 1B: The number of dots does not match between graph and legend. The dots are not n=12 for both genotypes. Additionally: What do the symbols in the circles in the graph stand for? This is also unclear in another later figure.

      • Fig. 2C: The current IF presentation (overlay MHCII with Ki67) is not very helpful. An additional image that shows only the Ki67 signal in the MHCII mask would be very helpful.

      • Fig. 4B: BMDCs of which culture day were used for these experiments?

      • Fig. 4A and D shows the same representative cells for two biological messages, which is only moderately convincing regarding a "general" phenotype.

      • Fig. 5, B: Scale bars are missing.

      Response: We have fixed all these points (revised Fig. 1B, 2C, 4B, 4A&D, 5B).

      Reviewer #3 (Significance (Required)):

      Strengths and Advance:

      The study provides strong data and a very detailed analysis of how the Arp2/3 complex regulates stages of Langerhans cell development and homeostasis. The role of the Arp2/3 complex as regulator of actin branching, which is involved in many cellular functions, has previously not been reported for this cell type. Previous research in immune cells has already studied the Arp2/3 complex, but studies were focussed on its role in migration and the majority of published phenotypes related to cell migration. While there are already a number of in vitro studies showing that the Arp2/3 complex can regulate aspects of cell cycle control or cell death in non-immune cells, most of these studies were performed with immortalized, non-immune cell lines, which can be more easily manipulated to dissect mechanistic aspects of the cellular phenotype, but are limited in their physiological interpretation. Hence, it is a major strength of this study to investigate the effects of Arp2/3 in a primary immune cell type, directly in the native and physiological environment. This is important because in vitro data from other cell types cannot be easily extrapolated to any other cell type and it is critical for our understanding to collect physiological data from tissues, where the biology really happens. The finding that the Arp2/3 complex regulates the tissue-residency of Langerhans cells through processes that are unrelated to migration is partially unexpected, shifting the view of this protein complex's physiological role to other cell biological processes, e.g. regulation of cell proliferation.

      Limitations: The limitations of the study are detailed in the five major points listed above. The study accumulates many experiments that characterize the phenotype of Arpc4-depleted cells, showing signs of DNA damage in Langerhans cells and cultures of BMDCs. How the Arp2/3 complex would mechanistically mediate the observed effects on DNA damage and repair has not been addressed. It also remains open whether this is due to the effects of the Arp2/3 complex in the nucleus or the cytosol, which would be biologically extremely important to understand. Above that, there are some discrepancies regarding the phenotype of the BMDC model, which neither entirely matches the Langerhans cell phenotype in the tissue (reduced proliferation, LC derive from different progenitors), nor that of other endogenous DC populations, which should also derive from similar progenitors.

      Audience and reviewer background:

      In its current form, the manuscript will already be of interest for several research fields: Langerhans cell and dendritic cell homeostasis, immune cell trafficking, actin and cytoskeleton regulation in immune cells, physiological role of actin-regulating proteins. My own field of expertise is immune cell trafficking in mouse models, leukocyte migration and cytoskeletal regulation. I cannot judge the analysis and clustering of the bulk RNA sequencing data.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript identifies a role of the Arp2/3 complex, the major regulator of actin branching in cells, for controlling the homeostasis of murine Langerhans cells (LCs), a specialized subset of dendritic cells in the skin epidermis. The findings of the study are based on the analysis of CD11c-Cre Arpc4-flox mice, a conditional knockout mouse model, which interferes with Arp2/3 function in Langerhans cells and other CD11c-expressing myeloid cells, e.g. dendritic cell or macrophage subsets. By using immunofluorescence and flow cytometry analysis of epidermis and skin tissues, the authors provide a detailed analysis of LC numbers at different developmental stages (postnatal day 1, 7, 28, and adult mice) and demonstrate that Arpc4-deficiency does not interfere with the establishment of LC networks until postnatal day 28. However, LCs in ear and tail skin are substantially reduced in Arpc4-deficient mice at 8-12 weeks of age. In parallel to their in vivo model, the authors analyze cultures of bone marrow-derived dendritic cells (BMDCs) from control and CD11c-Cre Arpc4-flox mice. Arpc4-deficiency in BMDCs, which develop over 8-10 days in culture, results in nuclear shape and lamina abnormalities, as well as signs of increased DNA damage. Aspects of this phenotype are also detected in Langerhans cells in epidermal preparations. Transcriptomic analysis of BMDCs highlights a gene signature of increased expression of the interferon response pathway and alterations in cell cycle regulation. Arpc4-deficient BMDCs show increased expression of DNA damage markers and reduced expression of certain DNA repair factors. Based on these correlative findings from the BMDC model, the authors conclude that the decline in LC numbers might develop from the accumulation of DNA damage over time, which the authors phrase "pre-mature aging of Langerhans cells".
      Lastly, the authors show a heterogeneous picture of how Arp2/3 depletion affects distinct DC populations in CD11c-Cre Arpc4-flox mice. While some tissue-resident DC subsets appear normal in numbers, others are reduced in numbers in the tissue. This may be related to their proliferation potential in tissues.

      Major comments:

      • Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      1) The authors claim that Arpc4 deficiency selectively compromises myeloid cell populations that rely on proliferation for tissue colonization (Figure 7). The presented data might give hints for such a general hypothesis, but solid experimental proof is lacking. When comparing myeloid cell subsets from four different organs, the authors refer to published data that some dendritic cell subsets are more proliferative in tissues than others and that CD11cCre Arpc4-flox mice appear to have reduced cell numbers in these populations. However, the presented data are purely correlative and no functional connection between cell proliferation and the phenotypes has been made. While some dendritic cell subsets (Langerhans cells, alveolar DCs) show reduced cell numbers in CD11cCre Arpc4-flox mice, other myeloid cell subsets are unaffected (e.g. dermal cDC1 and 2, colon macrophages). There could be plenty of other reasons that might underlie the observed discrepancies between these cell subsets; Arp2/3 knockout efficiency and myeloid cell turnover in the tissue are just two examples, which have not been taken into consideration. Direct measurements of cell proliferation, e.g. BrdU labeling, linked to the observed phenotype would be needed to make such claims. The data could either be removed or these points addressed experimentally, which could take 3-6 months.

      2) The authors claim that DC subsets (e.g. dermal cDCs), which develop from pre-DCs, are not affected by Arp2/3 depletion (Figure 7, although the FACS plot in Fig. 1D would suggest a different picture for cDC1). This is surprising in light of the data with bone marrow-derived DCs (BMDCs), the major in vitro model of this study, which develop from CDPs that again develop from pre-DCs. BMDCs did show aberrant nuclei and signs of DNA damage. How would the authors then explain the discrepancies of the BMDC model with DC subsets, where the authors feel that the pre-DC origin explains the phenotypic difference? This is a general concern of the data interpretation and conclusions.

      3) In line with point 2, the authors never show that BMDCs show reduced proliferation, reduced cell numbers or increased cell death in Arpc4-deficient cell cultures, as a consequence of the detected DNA damage and impaired DNA repair. In fact, Figure 5C even shows that cell growth rates between control and KO are equal. This is a major mismatch in the current study. Since the authors use the BMDC model to explain the declining cell numbers in Langerhans cells (which derive from fetal liver cells), this phenotype is not mirrored by the BMDC culture and it remains open whether the observed changes in nuclear DNA damage and repair are indeed directly linked to the observed phenotype of declining cell numbers in the tissue. These aspects require argumentation why cell growth is unchanged in KO cells. Additional experiments addressing these points with sufficient biological replicates (cultures from different mice) could take 2-3 months, including preparation time.

      4) The authors refer to a "pre-mature aging" phenotype of Arpc4-deficient BMDCs and LCs, based on reductions in Lamin B, Lamin A and increases in gH2AX and 53BP1. I find this term an overstatement of the current data and suggest that other markers for cell senescence, such as p53, Rb, p21 and b-Galactosidase, are then also used to make such a strong claim on "aging" and cell senescence. Experimentally addressing this point with sufficient biological replicates could take 2-3 months, including preparation time.

      5) The study does not provide a mechanism for how the Arp2/3 complex would mediate the observed effects on DNA damage and repair in the cell model, and only potential scenarios from other non-myeloid cell lines are discussed. It remains unclear whether the observed phenotypes in Arpc4-depleted myeloid cells relate to the direct nuclear function of Arp2/3 or the cytosolic function of Arp2/3, including its roles in cytoskeletal regulation that may have secondary effects on the nuclear alterations. This is a general concern of the presented data; data on mechanism might require more than 6 months.

      6) OPTIONAL: The authors make a strong case arguing that the increased interferon expression signature (based on the transcriptomics data) reflects the nuclear ruptures in Arpc4-deficient cells and adds to the observed phenotype. If this is so, what happens then in STING knockout cells in the presence of CK666 inhibitor?

      • Are the data and the methods presented in such a way that they can be reproduced?

      1) The analyses include quite a number of intensity calculations of immunofluorescence signals (Fig. 3D, E; Fig. 4E, Fig. 5B and 6B). The background stainings are often variable or very high. In some cases it is even unclear whether stainings are really detecting protein and go beyond background staining (Fig. 6A, Fig. 5F). How were immunofluorescence data acquired, and how were different background staining intensities dealt with?

      2) It remained unclear to me on which basis the nuclear deformations in Fig. 3G, H were calculated.

      3) The detailed phenotype of control mice is a bit unclear. It appears as if these were Cre-negative animals. Did the authors have some proof-of-principle experiments showing that CD11cCre Arpc4 +/+ animals have comparable phenotypes to Cre-negative animals?

      • Are the experiments adequately replicated and statistical analysis adequate?

      For most experiments, the number of biological replicates (mice, or BMDC cultures from different mice) and individual values (n, cells) are indicated. Statistical analysis appears adequate.

      Minor comments:

      • Prior published studies on Arp2/3 function in immune cells are referenced accordingly. A number of additional pre-print manuscripts on this topic have not been cited and could be considered for referencing.

      • The text is very clearly and very well written. Figures are clear and accurate for most cases. There are some open questions:

      1) Fig. 1B: The number of dots in the graph and the number stated in the legend do not match. The dots are not n=12 for both genotypes. Additionally: What do the symbols in the circles in the graph stand for? This is also unclear in another, later figure.

      2) Fig. 2C: The current IF presentation (overlay MHCII with Ki67) is not very helpful. An additional image that shows only the Ki67 signal in the MHCII mask would be very helpful.

      3) Fig. 4B: BMDCs of which culture day were used for these experiments?

      4) Fig. 4A and D show the same representative cells for two biological messages, which is only moderately convincing regarding a "general" phenotype.

      5) Fig. 5, B: Scale bars are missing.

      Significance

      Strengths and Advance:

      The study provides strong data and a very detailed analysis of how the Arp2/3 complex regulates stages of Langerhans cell development and homeostasis. The role of the Arp2/3 complex as a regulator of actin branching, which is involved in many cellular functions, has previously not been reported for this cell type. Previous research in immune cells has already studied the Arp2/3 complex, but studies were focussed on its role in migration and the majority of published phenotypes related to cell migration. While there are already a number of in vitro studies showing that the Arp2/3 complex can regulate aspects of cell cycle control or cell death in non-immune cells, most of these studies were performed with immortalized, non-immune cell lines, which can be more easily manipulated to dissect mechanistic aspects of the cellular phenotype, but are limited in their physiological interpretation. Hence, it is a major strength of this study to investigate the effects of Arp2/3 in a primary immune cell type, directly in the native and physiological environment. This is important because in vitro data from other cell types cannot be easily extrapolated to any other cell type and it is critical for our understanding to collect physiological data from tissues, where the biology really happens. The finding that the Arp2/3 complex regulates the tissue-residency of Langerhans cells through processes that are unrelated to migration is partially unexpected, shifting the view of this protein complex's physiological role to other cell biological processes, e.g. regulation of cell proliferation.

      Limitations:

      The limitations of the study are detailed in the five major points listed above. The study accumulates many experiments that characterize the phenotype of Arpc4-depleted cells, showing signs of DNA damage in Langerhans cells and cultures of BMDCs. How the Arp2/3 complex would mechanistically mediate the observed effects on DNA damage and repair has not been addressed. It also remains open whether this is due to the effects of the Arp2/3 complex in the nucleus or the cytosol, which would be biologically extremely important to understand. Beyond that, there are some discrepancies regarding the phenotype of the BMDC model, which entirely matches neither the Langerhans cell phenotype in the tissue (reduced proliferation, LC derive from different progenitors) nor other endogenous DC populations, which should also derive from similar progenitors.

      Audience and reviewer background:

      In its current form, the manuscript will already be of interest for several research fields: Langerhans cell and dendritic cell homeostasis, immune cell trafficking, actin and cytoskeleton regulation in immune cells, physiological role of actin-regulating proteins. My own field of expertise is immune cell trafficking in mouse models, leukocyte migration and cytoskeletal regulation. I cannot judge the analysis and clustering of the bulk RNA sequencing data.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      • This is a study in experimental mice employing both in vitro and, importantly, in vivo approaches. EPIDERMAL LANGERHANS CELLS serve as a paradigm for the maintenance of homeostasis of myeloid cells in a tissue, epidermis in this case. In addition to well known functions of the ACTIN NETWORK in cell migration, chemotaxis, cell adherence and phagocytosis the authors reveal a critical function of actin networks in the survival of cells in their home tissue.

      • Actin-related proteins (Arp), specifically here the Arp2/3 complex, are necessary to form the filamentous actin networks. The authors use conditional knock-out mice where Arpc4 (an essential component of the Arp2/3 complex) is deleted under the control of CD11c, the most prominent dendritic cell marker which is also expressed on Langerhans cells. In normal mice, epidermal Langerhans cells reside in the epidermis virtually life-long. They initially settle the epidermis around and a few days after birth and establish a dense network by a burst of proliferation, and then they "linger on" by low level maintenance proliferation. In the epidermis of Arpc4 knock-out mice Langerhans cells also start off with this proliferative burst but, strikingly, they do not stay but are massively reduced by the age of 8-12 weeks.

      • The analyses of this decline revealed that

      a) the shape (number of nuclear lobes) and integrity of cell nuclei was compromised; they were fragile and ruptured to some degree when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      b) DNA damage, as detected by staining for gamma-H2Ax or 53BP1 accumulated when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      c) recruitment of DNA repair molecules was inhibited when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      d) gene signatures of interferon signaling and response were increased when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      e) in vivo migration of dendritic cells and Langerhans cells from the skin to the draining lymph nodes in an inflammatory setting (FITC painting of the skin) was impaired when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing;

      f) the persistence of the typical dense network of Langerhans cells in the epidermis, created by proliferation shortly after birth, is abrogated when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing. Importantly, this was not the case for myeloid cell populations that settle a tissue without needing that initial burst of proliferation. For instance, numbers of colonic macrophages were not affected when Arpc4 was knocked out, i.e., the Arp2/3 complex was missing.

      • Thus, the authors conclude that the Arp2/3 complex is essential, through its formation of actin networks, to maintain the integrity of nuclei and ensure DNA repair, thereby ascertaining the maintenance proliferation of Langerhans cells and, as a consequence, the persistence of the dense epidermal network of Langerhans cells.

      • Up-to-date methodology from the fields of cell biology and cellular immunology (cell isolation from tissues, immunofluorescence, multiparameter flow cytometry, FISH, "good old" - but important - transmission electronmicroscopy, etc.) was used at high quality (e.g., immunofluorescence pictures!). Quantitative and qualitative analytical methods were timely and appropriate (e.g., Voronoi diagrams, cell shape profiling tools, Cre-lox gene-deletion technology, etc.). Importantly, the authors used a clever method, that they had developed several years ago, namely the analysis of dendritic cell migration in microchannels of defined widths. Molecular biology methods such as RNAseq were also employed and analysed by appropriate bioinformatic tools.

      Major comments:

      • ARE THE KEY CONCLUSIONS CONVINCING? Yes, they are.

      • SHOULD THE AUTHORS QUALIFY SOME OF THEIR CLAIMS AS PRELIMINARY OR SPECULATIVE, OR REMOVE THEM ALTOGETHER? No, I think it is ok as it stands. The authors are wording their claims and conclusions not apodictically but cautiously, as it should be. They point out explicitly which lines of investigations they did not follow up here.

      • WOULD ADDITIONAL EXPERIMENTS BE ESSENTIAL TO SUPPORT THE CLAIMS OF THE PAPER? REQUEST ADDITIONAL EXPERIMENTS ONLY WHERE NECESSARY FOR THE PAPER AS IT IS, AND DO NOT ASK AUTHORS TO OPEN NEW LINES OF EXPERIMENTATION. I think that the here presented experimental evidence suffices to support the conclusions drawn. No additional experiments are necessary.

      • ARE THE SUGGESTED EXPERIMENTS REALISTIC IN TERMS OF TIME AND RESOURCES? IT WOULD HELP IF YOU COULD ADD AN ESTIMATED COST AND TIME INVESTMENT FOR SUBSTANTIAL EXPERIMENTS. Not applicable.

      • ARE THE DATA AND THE METHODS PRESENTED IN SUCH A WAY THAT THEY CAN BE REPRODUCED? Yes, they are.

      • ARE THE EXPERIMENTS ADEQUATELY REPLICATED AND STATISTICAL ANALYSIS ADEQUATE? Yes.

      Minor comments:

      • SPECIFIC EXPERIMENTAL ISSUES THAT ARE EASILY ADDRESSABLE. None

      • ARE PRIOR STUDIES REFERENCED APPROPRIATELY? Essentially yes. Regarding the reduction / loss of the adult epidermal Langerhans cell network, it may be of some interest to also refer to / discuss another one of the few examples of this phenomenon. There, the initial burst of proliferation is followed by reduced proliferation and increased apoptosis when a critical member of the mTOR signaling cascade is conditionally knocked out (Blood 123:217, 2014).

      • ARE THE TEXT AND FIGURES CLEAR AND ACCURATE? Yes they are. Figures are well arranged for easy comprehension.

      • DO YOU HAVE SUGGESTIONS THAT WOULD HELP THE AUTHORS IMPROVE THE PRESENTATION OF THEIR DATA AND CONCLUSIONS?

      • Materials & Methods. The authors write, regarding flow cytometry of epidermal cells: "Briefly, 1cm2 of back skin from 8-14 weeks old female wild-type and knockout littermates was dissociated in 0.25 mg/mL Liberase (Sigma, cat. #5401020001) and 0.5 mg/mL DNase (Sigma, cat.#10104159001) in 1 mL of RPMI (Sigma) and mechanically disaggregated in Eppendorf tubes, FOLLOWED BY INCUBATED for 2 h at 37 °C." Followed by what?

      • Materials & Methods. BMDC electron microscopy. What is "IF"? Please specify.

      • RESULTS in gene expression analyses. The authors observe some increase in apoptosis (as detected by cleaved-Caspase-3 staining). Is this observation in immunofluorescence also evident in the RNAseq data (where the IFN changes were seen), i.e., in Figure 5?

      • Figure 7 F and G. Perhaps the authors may want to swap upper and lower panels in F or G, so that macrophage FACS plots and bar graphs are in the same row and, obviously, DC plots and bars likewise.

      • Figure 7H. "Gating strategy in ArpC4WT Lung (previously gated in Live CD45+ cells)" - The lower row is knock-out, not WT. This is indicated correctly in the legend, but in the figure both rows are labeled as WT.

      • The reference by Park et al. 2021 is missing in the list.

      • Figure 1D. Sure, the bar graphs are meant to say "CD11c"? The FACS plots show "CD11b".

      • As to cDC1. In Figure 1D the FACS plot shows an absence of CD103+ cDC1 cells. In contrast, in Figure 7A (left side panel), there is no difference in cDC1 cells between WT and KO mice. Is therefore the flow cytometry plot in Figure 1D not representative regarding cDC1 cells? Correct?

      Significance

      • DESCRIBE THE NATURE AND SIGNIFICANCE OF THE ADVANCE (E.G. CONCEPTUAL, TECHNICAL, CLINICAL) FOR THE FIELD. This is a conceptual advance. It adds a big step to our understanding of how immune cells in tissues (which all come from the bone marrow or are seeded before birth from embryonal hematopoietic organs such as yolk sac and fetal liver) can remain resident in these tissues. For cell types such as Langerhans cells, which establish their final population density within their tissues of residence, the presented findings convincingly buttress the role of proliferation and thereby the role for the actin-related protein complex 2/3 (Arp2/3).

      • PLACE THE WORK IN THE CONTEXT OF THE EXISTING LITERATURE (PROVIDE REFERENCES, WHERE APPROPRIATE). While we know much about actin-related proteins (Arp), as correctly cited by the authors, this knowledge is derived mostly from in vitro studies. The submitted study translates the findings to an in vivo setting for the first time.

      • STATE WHAT AUDIENCE MIGHT BE INTERESTED IN AND INFLUENCED BY THE REPORTED FINDINGS. Skin immunologists foremost, but these findings are of interest to the entire community of immunologists, but also cell biologists.

      • DEFINE YOUR FIELD OF EXPERTISE. My expertise is in skin immunology, in particular skin dendritic cells including Langerhans cells.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      • The manuscript by Delgado et al. reports the role of the actin remodeling Arp2/3 complex in the biology of Langerhans cells, which are specialized innate immune cells of the epidermis. The study is based on a conditional KO mouse model (CD11cCre;Arpc4fl/fl), in which the deletion of the Arp2/3 subunit ArpC4 is under the control of the myeloid cell specific CD11c promoter.

      • In this model, the assembly of LC networks in the epidermis of ear and tail skin is preserved when examining animals immediately after birth (up to 1 week). Subsequently however LCs from ArpC4-deleted mice start displaying morphological aberrations (reduced elongation and number of branches at 4 weeks of age). Additionally, a profound decline in LC numbers is reported in the skin of both the ear and tail of young adult mice (8-10 weeks).

      • To explore the cause of such decline, the authors then opt for the complementary in vitro study of bone-marrow derived DCs, given the lack of a model to study LCs in vitro. They report that ArpC4 deletion is associated with aberrantly shaped nuclei, decreased expression of the nucleoskeleton proteins Lamin A/C and B1, nuclear envelope ruptures and increased DNA damage as shown by γH2Ax staining. Importantly, they provide evidence that the defects evoked by ArpC4 deletion also occur in the LCs in situ (immunofluorescence of the skin in 4-week-old mice).

      • Increased DNA damage is further documented by staining differentiating DCs from ArpC4-deleted mice with the 53BP1 marker. In parallel, nuclear levels of DNA repair kinase ATR and recruitment of RPA70 (which recruits ATR to replicative forks) are reduced in the ArpC4-deleted condition. In vitro treatment of DCs with the topoisomerase II inhibitor etoposide and the Arp2/3 inhibitor CK666 induce comparable DNA damage, as well as multilobulated nuclei and DNA bridges. The authors conclude that the ArpC4-KO phenotype might stem, at least in part, from a defective ability to repair DNA damage occurring during cell division.

      • The study is enriched by an RNA-seq analysis that points to an increased expression of genes linked to IFN signaling, which the authors hypothetically relate to overt activation of innate nucleic acid sensing pathways.

      • The study ends by an examination of myeloid cell populations in ArpC4-KO mice beyond LCs. Skin cDC1 and cDC2 subsets display skin emigration defects (like LCs), but not numerical defects in the skin (unlike LCs). Myeloid cell subsets of the colon are also present in normal numbers. In the lungs, interstitial and alveolar macrophages are reduced, but not lung DC subsets. Collectively, these observations suggest that ArpC4 is essential for the maintenance of myeloid cell subsets that rely on cell division to colonize or to self-maintain within their tissue of residency (including LCs).

      Major comments:

      1. ArpC4 and Arp2/3 expression

      The authors argue that LCs from Arpc4KO mice should delete the Arpc4 gene in precursors that colonize the skin around birth. It would be important to show it to rule out the possibility that the lack of phenotype (initial seeding, initial proliferative burst) in young animals (first week) could be related to an incomplete deletion of ArpC4 expression. Also important would be to show what is happening to the Arp2/3 complex in LCs from Arpc4KO mice. In the in vitro studies with DCs, the level of ArpC4 and Arp2/3 deletion at the protein level is also not documented. The authors explain that surface expression of the CD11c marker, which drives Arpc4 deletion, gradually increased during differentiation of DCs: from 50% to 90% of the cells. Does that mean that loss of ArpC4 expression is only effective in a fraction of the cells examined before day 10 of differentiation (e.g. in the RNA-seq analysis)?

      2. Intra-nuclear versus extra-nuclear activities of Arp2/3

      The authors favor a model whereby intra-nuclear ArpC4 helps maintain nuclear integrity during proliferation of DCs (and possibly LCs). However, multiple pools of Arp2/3 have been described and accordingly, multiple mechanisms may account for the observed phenotype: i) a cytoplasmic pool to drive the protrusions sustaining the assembly of the LC network and its connectivity with keratinocytes; ii) a peri-nuclear pool to protect the nucleus; iii) an intra-nuclear pool to facilitate DNA repair mechanisms, e.g. by stabilizing replicative forks (the scenario favored by the authors).

      It is recommended that the authors try to gather more supportive data to sustain the intra-nuclear role. Documenting ArpC4 presence in the nucleus would help support the claim. It could be combined with treatments aiming at blocking proliferation in order to reinforce the possibility that a main function of ArpC4 is to protect proliferating cells by favoring DNA repair inside the nucleus.

      3. Nuclear envelope ruptures

      The nuclear envelope ruptures are not sufficiently documented (how many cells were imaged? quantification?). The authors employ STED microscopy to examine Lamin B1 distribution. The image shown in Figure 4A does not really highlight the nuclear envelope, but rather the entire content. Whether it is representative is questionable. We would expect Lamin B1 staining intensity to be drastically reduced given the quantification shown in Figure 3D. In addition, although the authors have stressed in the previous figure that Arpc4-KO is associated with nucleus shape aberrations, the example shown in Figure 4A is that of a nucleus with a normal ovoid shape.

      It is recommended to quantify the ruptures with Lap2b antibodies (or another staining that would better delineate the envelope) in order to avoid the possible bias due to the reduced staining intensity of Lamin B1.

      A missing analysis is that of nuclear envelope ruptures as a function of nucleus deformations.

      Fig 4B-C: same frequency of Arpc4-KO and WT cells displaying nuclear envelope ruptures in the 4-µm channels; however, the image shows a rupture for the Arpc4-KO and no rupture for the WT cells (this is somewhat misleading). Are ruptures similar in Arpc4-KO and WT cells in this condition?

      Fig 4D-E: is there a direct link between nuclear envelope ruptures and γH2A.X?

      Interesting (but optional) would be to understand what is happening to DNA and histones. Is there evidence for leakage in the cytoplasm?

      4. RNA seq analysis

      The RNA-seq analysis suffers from a lack of direct connection with the rest of the study. The extracted molecular information is neither validated nor further explored. It remains very descriptive. The PCA analysis suggests a « more pronounced transcriptomic heterogeneity in differentiating Arpc4KO DCs ». However it seems difficult to make such a claim from the comparison of 3 mice per group. In addition, such heterogeneity is not seen in the more detailed analysis (Fig 5F). The authors claim that « day 10 control and Arpc4KO DCs showed no to very little differences in gene expression, in contrast to cells at days 7-9 of differentiation ». This is not obvious from the data displayed in the corresponding figure. In addition, it is not expected that cells that may take a divergent differentiation path at days 7-9 would return to a similar transcriptional activity at day 10. A point that is not discussed is that before day 10 of DC differentiation, Arpc4 KO is expected to only occur in about 50% of the cell population. This is expected to impact the RNA-seq analysis. Not all clusters have been exploited (e.g. cluster 3 elevated, cluster 6 partly reduced). I suggest the authors reconsider their analysis and interpretation of the RNA-seq data (or possibly invest in complementary analyses).

      5. What causes the profound numerical drop of LC in the epidermis?

      A major open question is what causes the massive drop of LCs. Although differentiating Arpc4KO DCs start accumulating DNA damage upon proliferation, they succeed in progressing through the cell cycle. There is even a slightly elevated expression of cell cycle genes at day 7 of differentiation in the DC model. Only a trend for increased apoptosis is observed in ear and tail skin. It would be important to provide complementary data documenting increased death (or aberrant emigration?) of LCs in the 4-8 week time window.

      6. Functional consequences

      Although the study reports novel aspects of LC biology, the consequences of ArpC4 deletion for skin barrier function and immunosurveillance are not investigated. It would seem very relevant to test how this model copes with radiation, chemical and/or microorganism challenges.

      Minor comments:

      1. Figure 1D

      Gating strategy: twice the same empty plots. The content seems to be missing... Does this need to be shown in the main figure?

      2. Figure 2

      Best would be to keep same scale to compare P1 and P7 (tail skin, figure 2A)

      Overlay of Ki67 and MHC-II does not allow to easily visualize the double-positive cells (Fig 2C)

      Quality of Ki67 staining different for Arpc4-KO (less intense, less focused to the nuclei): a technical issue or could that reflect something?

      Fig 2C: Panels mounted differently for ear and tail skin (different order to present the individual stainings, Dapi for tail skin only).

      3. LC branch analysis (Fig 1 and 2)

      While Fig 1 indicates that ear skin LCs form on average half as many branches as tail skin LCs (3-4 versus 8-9 branches per cell), Fig 2 shows the opposite (10-12 versus 6-7 branches per cell). Is this due to a very distinct pattern between the 2 considered ages (4 weeks versus 8-10 weeks)? Could the authors double-check that there is no methodological bias in their analysis?

      4. Fig 3 E-G

      How many animals were examined (n=5)? Reproducible across animals? Why was it done with 4-week-old animals (phenotype not complete? Event occurring before loss in numbers...)

      Staining Lamin A/C globally more intense in the Arpc4-KO epidermis (also seems to apply to the masks corresponding to the LCs). Surprising to see that the quantification indicates a major drop of Lamin A/C intensity in the LCs.

      5. Legend Fig 4D: replace confocal microscopy by STED microscopy

      6. Figure 4F

      Intensity/background of γH2Ax staining very distinct between the 2 micrographs shown for WT and Arpc4-KO epidermis.

      1. Figure 7C, F, H

      Gating strategies: would be better to harmonize the style of the plots (dot plots and 2 types of contour plots have been used...)

      1. Figure 7H

      Legend of lower gating strategy seems to be wrong (KO and not WT).

      Significance

      Strengths: the general quality of the manuscript is high. It is very clearly written and it contains a very detailed method section that would allow reproducing the reported experiments. This work entails a clear novelty in that it represents the first investigation of the role of ArpC4 in LCs. It opens an interesting perspective about specific mechanisms sustaining the maintenance of myeloid cell subsets in peripheral tissues. This work is therefore expected to be of interest for a large audience of cellular immunologists and beyond. Challenging skin function with an external trigger would lift the relevance for an even wider audience (see main point 6).

      Limitations: in its current version the manuscript suffers from a lack of solidity around a few analyses (see main points on ArpC4 and Arp2/3 protein expression, nuclear envelope rupture analysis,...). It also tends to formulate a narrative centered on the ArpC4 intra-nuclear function that is not definitely proven.

      The field of expertise of this reviewer is: cellular immunology and actin remodeling.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      We thank the reviewers for their careful evaluation of our manuscript and for their constructive comments. The reviews recognise the relevance of the topic and the value of the multi-omics approach used to investigate host responses to SARS-CoV-2 variants in a physiologically relevant primary nasal epithelial cell model.

      In response to the reviewers' comments, we revised the manuscript to improve clarity of presentation, strengthened the contextualisation of the experimental design, and moderated the interpretation of the results. We also incorporated additional analyses based on existing datasets to better characterise infection burden and host responses across variants.

      Importantly, the MOI reported in the original manuscript (0.01) was a typographical error; all infections were performed at MOI 0.1 as documented in the GEO dataset (GSE271378). This was corrected throughout the manuscript.

      Overall, the revisions were intended to clarify the experimental framework, strengthen the integration of the multi-omics datasets, and ensure that the conclusions accurately reflect the scope of the study as a comparative systems-level analysis of variant-associated host-response signatures rather than a mechanistic dissection of individual pathways.

      2. Description of the planned revisions

      Reviewer summary: In the submitted work, authors seek to understand the transcriptional and metabolic changes induced by different variants of SARS-CoV-2 infection. They employ a model of pooled, primary nasal epithelial cells (NEC) differentiated within an air-liquid-interface. Subsequently, cultures are infected with isolates representing key variants of SARS-CoV-2 from initial D614G, Alpha, Beta, Delta, and Omicron. Following initial characterization, authors compare transcriptional changes at 24 and 72 hours post infection. Analysis focuses on differentially expressed genes, upregulated Interferon Stimulated genes, and transcripts within known metabolic pathways. Subsequently, infected cultures are also analyzed by phosphoproteomic analysis to identify changes in cell signaling and measured for amino acid content. Throughout, changes in each profile are correlated with different variants of SARS-CoV-2, with Delta and Omicron revealing particular diametrically opposed changes. There are reasonable controls applied, including the use of IFNalpha treatment to "benchmark" ISG production. Overall, authors create a picture that Omicron infections do not suppress IFN signaling as efficiently as Delta variants and further exhibit limited hallmarks of cell stress and metabolic dysregulation. This is a remarkable study that attempts to cross-correlate multiple -omics analyses of cell responses to characterize differences in infection. It is very well written and the data is exemplary. I do have some concerns related to the placement and emphasis of interpretation in the results section that need to be revised. Beyond my stylistic concern, the interpretation of the experimental changes between variants is compromised by the failure to analyze the extent of infection within the NEC model. Using an MOI of 0.01 will produce a dramatically heterogeneous extent of infection at both 24 and 72 hours post infection that will also depend on the extent of viral transmission within the culture. The limited analysis of secreted E-gene detection is insufficient to overcome the inherent unequal comparison of cell responses between variants. There are ways to assuage, but not eliminate, this problem when it comes to comparing and interpreting experimental results. My concerns and suggestions are detailed in the concerns below.

      Response: We thank Reviewer #1 for the very positive assessment of the study, which supports a decision to publish, and for the constructive suggestions. We agree that the interpretation of comparative bulk multi-omics data in differentiated NEC cultures should be carefully framed in light of variant-specific infection dynamics. In response, we revised the manuscript substantially. Importantly, the MOI reported in the manuscript (0.01) was corrected, as all infections in this study were performed at MOI 0.1, as correctly described in our GEO submission (GSE271378). We also quantified SARS-CoV-2 reads directly from the RNA-seq libraries to estimate infection burden in each sequenced sample, added infectious virus titre measurements (PFU/ml), expanded the analysis of IFNα-treated samples to include DEG and pathway-level comparisons, improved figure clarity and legends, and substantially tempered the interpretation throughout the Results and Discussion. We believe these revisions address the reviewer's concerns and strengthen the manuscript.

      1. Heterogeneous extent of infection. The MOI of 0.01 used to initiate infection is extraordinarily low for the types of analysis that are employed with the NEC culture. The interpretation of the data does not take into account that there will be infected and uninfected cells, of varying extents, making up the changes observed. Further, the variants likely have differing abilities to spread through the NEC culture, complicating both interpretation of changes and comparison between variants. At a minimum, authors need to evaluate the extent of SARS-CoV-2 infection through either flow cytometry or immunofluorescence analysis against viral protein(s). It is possible that Omicron, while secreted well, has more limited transmission allowing for more cells to mount an IFN response. Delta is a prolifically spreading virus that likely has more extensive infection at 72 hpi than the other variants. These statements are conjecture and highlight how such differences could alter the interpretation of the subsequent experiments.

      Response: We thank the reviewer for raising this important point. We would first like to clarify that the MOI reported in the manuscript (0.01) was a typographical error. All infections in this study were performed at MOI 0.1, as correctly documented in the RNA-seq dataset deposited in GEO (GSE271378). The manuscript text, Methods, and figure legends were corrected accordingly. MOI values in this range are commonly used for infections of differentiated airway epithelial cultures and allow productive infection while preserving epithelial integrity.

      We agree that infection heterogeneity is an important consideration when interpreting bulk transcriptomic, phosphoproteomic and metabolic measurements in differentiated epithelial cultures. We argue that such heterogeneity is expected in air-liquid interface nasal epithelial models, where SARS-CoV-2 infection occurs within a structured epithelium composed of multiple cell types and infected cells coexist with neighbouring bystander cells responding to paracrine interferon signalling. Bulk multi-omics measurements therefore capture the integrated epithelial response to infection rather than purely cell-intrinsic responses.

      To better contextualise infection burden within the sequenced samples, we included an additional analysis quantifying SARS-CoV-2 reads directly from each RNA-seq library and infectious virus titres (Figure 1). In the revised manuscript, these data are presented together in a new Supplementary Figure 1, which distinguishes intracellular viral RNA abundance from infectious virus production. The viral read analysis shows that intracellular viral RNA increases between 24 and 72 hpi across all variants and becomes broadly similar across lineages by 72 hpi, whereas plaque assays show that BA.1 has the highest early infectious output and Delta reaches the highest infectious titres at 48-72 hpi. We used these data to revise the Results and Discussion so that host-response differences are interpreted in the context of infection burden, while also making clear that intracellular viral RNA abundance, extracellular viral RNA output and infectious virus production are related but distinct measures of variant biology.

      Figure 1: Intracellular viral RNA reads (RNA-seq) (A) and infectious virus titres (PFU mL⁻¹; B) across SARS-CoV-2 variants.

      Further evaluation of IFNalpha treated cells. The paper emphasizes the ISG analysis, but the IFN treated cells should be included in the DEG and metabolic pathway analysis. IFN treatment is known to alter metabolic changes in cells, and it would be valuable to see those changes reflected in your analysis. Consider the evidence presented in the following: Fritsch SD, Weichhart T. Effects of Interferons and Viruses on Metabolism. Front Immunol. 2016 Dec 21;7:630. Heer CD, Sanderson DJ, Voth LS, Alhammad YMO, Schmidt MS, Trammell SAJ, Perlman S, Cohen MS, Fehr AR, Brenner C. Coronavirus infection and PARP expression dysregulate the NAD metabolome: An actionable component of innate immunity. J Biol Chem. Elsevier BV; 2020 Dec 25;295(52):17986-17996. Palmer CS. Innate metabolic responses against viral infections. Nat Metab. 2022 Oct;4(10):1245-1259. Further, it is possible that the changes attributed to Omicron are quite similar to the effects of the IFN treatment, given the extensive ISG detection. The same is true for the phosphoproteomic analysis and amino acid content. I also have concerns that using a treatment of IFNalpha that impacts all cells as a benchmark for heterogeneous infection is not truly comparable. How was the concentration of IFN chosen? What was the extent of IFN activation in the culture?

      Response: In response to this suggestion, we performed pathway enrichment analysis of IFNα-treated samples to evaluate whether interferon stimulation alone induces the metabolic pathway signatures observed during viral infection. IFNα treatment produced the expected transcriptional interferon-stimulated gene programme but did not result in significant enrichment of the metabolic pathways highlighted in the infection comparisons (Figure 2). Specifically, pathways related to glycolysis/gluconeogenesis, glutathione metabolism, fatty acid metabolism, mitochondrial pathways, and oxidative phosphorylation showed only limited or modest negative enrichment scores and did not approach the magnitude of enrichment observed in virus-infected cultures. These results indicate that interferon signalling alone does not reproduce the metabolic pathway signatures associated with variant infection. The IFNα pathway analysis was included in the revised manuscript as supplementary data and referenced in the Results section.

      We agree that IFNα treatment of all cells is not directly equivalent to heterogeneous viral infection within differentiated NEC cultures. The IFNα concentration used was selected based on previous optimisation experiments showing robust induction of canonical ISGs in differentiated airway epithelial cultures. In the revised manuscript we clarified that the IFNα condition is used as a reference for interferon-responsive transcription rather than as a direct surrogate for infected cultures. We provided additional methodological clarification regarding how the IFNα concentration was selected and how interferon activation was benchmarked in NEC cultures.

      Further correlation of transcriptional changes with metabolic changes - While many published works emphasize transcriptional changes as a proxy for metabolic changes, there are robust methods that can be applied to directly analyze metabolite content and changes in the context of viral infection. In particular these studies should be assessed and compared for the interpretation of the presented results: Kramaric, T., Thein, O.S., Parekh, D. et al. SARS-CoV2 variants differentially impact on the plasma metabolome. Metabolomics 21, 50 (2025). Loveday EK, Welhaven H, Erdogan AE, Hain KS, Domanico LF, Chang CB, June RK, Taylor MP. Starve a cold or feed a fever? Identifying cellular metabolic changes following infection and exposure to SARS-CoV-2. PLoS One 2025 Feb 12;20(2):e0305065. Irún P, Gracia R, Piazuelo E, Pardo J, Morte E, Paño JR, Boza J, Carrera-Lasfuentes P, Higuera GA, Lanas A. Serum lipid mediator profiles in COVID-19 patients and lung disease severity: a pilot study. Sci Rep. 2023 Apr 20;13(1):6497. Luke Whiley, Nathan G. Lawler, Annie Xu Zeng, Alex Lee, Sung-Tong Chin, Maider Bizkarguenaga, Chiara Bruzzone, Nieves Embade, Julien Wist, Elaine Holmes, Oscar Millet, Jeremy K. Nicholson, and Nicola Gray, "Cross-Validation of Metabolic Phenotypes in SARS-CoV-2 Infected Subpopulations Using Targeted Liquid Chromatography-Mass Spectrometry (LC-MS)", Journal of Proteome Research 2024 23 (4), 1313-1327

      Response: We thank the reviewer for this important comment and agree that transcriptional pathway enrichment alone cannot establish metabolic flux or enzyme activity. Our intention in this study was to integrate transcriptomic signatures with complementary data layers, including phospho-signalling profiles and targeted intracellular amino acid quantification, to provide a comparative systems-level view of host responses to SARS-CoV-2 variants in nasal epithelial cells.

      We acknowledge that transcriptional enrichment does not necessarily reflect pathway activity and that our amino acid measurements represent a targeted metabolite readout rather than a comprehensive metabolomic or flux-based analysis. In the revised manuscript, we have therefore moderated the language used when describing metabolic changes and referred to pathway enrichment more cautiously as indicative of potential metabolic engagement rather than direct metabolic regulation.

      We have also expanded the Discussion to contextualise our findings with the metabolomic studies suggested by the reviewer and related work examining metabolic responses to SARS-CoV-2 infection.

      Editing to limit interpretation within experimental results. I appreciate that this is a stylistic concern and it is an issue in the paper. Statements in the results are often over-reaching. Some examples include: Line 156 -"suggesting attenuated or delayed early sensing" - The Low MOI and time leaves these results open to various explanations. Better to just state and move on. Line 157 "Delta drove the most extensive" - drove has a lot of assumption. "produced" "resulted in " or something more passive is more appropriate Line 179 "pointing to sustained suppression of interferon responses." - sustained is a leading interpretation. Effective? Comprehensive? again, the MOI is complicating interpretations of global transcript changes. Line 186 "suggesting a weaker activation of interferon signaling" Too much leading interpretation here. You detect fewer ISGs that are differentially regulated. Could be for many reasons.

      Response: We appreciate this comment and agree. We have revised the Results section throughout to make the language more descriptive and less interpretive. The specific examples highlighted by the reviewer were changed accordingly, and similar phrasing elsewhere in the Results was also softened. Mechanistic interpretation was reduced and moved to the Discussion where appropriate.

      Line 72 "has evolved unique strategies" Unique can be easily misconstrued to mean different mechanisms. More likely, it is a subtle balance between promotion of viral replication and suppression of IFN responses.

      Response: We agree and have revised this wording to avoid overstating mechanistic distinctiveness.

      Line 126 - 128 "NECs were derived from three commercially available donor pools". The following text doesn't make it clear that they are the same product from different lots. The methods clarify somewhat, but should be clarified for transparency.

      Response: We thank the reviewer for noting this lack of clarity. We revised the relevant text in the Results and Methods to make clear that the NECs were derived from the same commercial product obtained across different lots/batches.

      Line 129 "Viral replication kinetics" Need to highlight that this is detection of secreted viral genomes, which is a proxy measure for replication and dissemination in the culture. Direct measurement of the extent of infection is not being made nor can be interpreted.

      Response: We agree and have revised the text and figure legend to clarify that the RT-qPCR measurements represent extracellular viral genome copies released into the apical supernatant and therefore provide a proxy measure of viral RNA output and dissemination within the culture rather than a direct measurement of infection extent. To better contextualise infection dynamics, we have complemented the RT-qPCR data with two additional measures of viral burden. First, we quantified SARS-CoV-2 reads directly from the RNA-seq libraries to estimate intracellular viral RNA abundance in the sequenced samples. Second, we measured infectious virus titres (PFU ml⁻¹) by plaque assay. These complementary analyses are presented in Supplementary Figure 1 and allow us to distinguish extracellular viral RNA release, intracellular viral RNA abundance, and infectious virus production. The revised manuscript explicitly acknowledges that bulk multi-omics measurements reflect mixtures of infected and bystander epithelial cells and therefore capture the integrated epithelial response to infection rather than the exact proportion of infected cells.

      Line 149 "Differentially expressed genes (DEGs)" What is the comparison group? The figure legend/design suggests that it is the IFNα treatment. Is there a matched uninfected control for each timepoint as well? Later experiments specify the comparison group. Text should be clarified here for transparency.

      Response: We thank the reviewer for highlighting that the comparison group was not clearly described in the Results section. Differential expression analysis was performed by comparing each variant-infected condition with mock-infected control samples collected at 24 h. The same mock reference was used for comparisons at both 24 and 72 hpi. IFNα-treated samples were analysed separately and were not used as the reference condition for DEG identification. We have clarified this explicitly in the revised Results and Methods sections.

      Line 224 and Figure 4B - I don't see the value of the "merged NES" values given these are only aggregate of the Pre-Omicron and Omicron species. If you had compared multiple D614G and Delta variants, then there would be utility.

      Response: We agree that the merged NES values provide only a broad visual summary and that the most informative comparisons are at the individual variant level. In the revised manuscript we reduced the emphasis on the merged analysis and clarify in both the Results text and the figure legend that interpretation is primarily based on the variant-specific enrichment profiles, with lineage grouping shown only as a visual summary.

      Line 261 "quantified at 24 hpi" Why this timepoint? Changes were minor and not representative of extensive infection.

      Response: We thank the reviewer for this comment. The amino acid measurements were performed at 24 hpi to capture early metabolic responses to infection, in parallel with the phosphoproteomic analysis performed at the same time point. We agree that at this stage of infection the NEC cultures likely contain mixtures of infected and bystander epithelial cells, and therefore the amino acid measurements reflect the integrated metabolic state of the culture rather than infected cells alone. We clarified this rationale and limitation in the revised Results and Discussion sections.

      Line 268 "rather than variation in cell number." I appreciate the rigor and control of experimentation. And how many of those cells are infected? That is not controlled.

      Response: We thank the reviewer for this important point. We agree that normalisation to viable cell number does not control for infection heterogeneity within the cultures. In the revised manuscript, we revised this sentence to clarify that the amino acid measurements were normalised for cell number, but that, because they were obtained from bulk cultures at 24 hpi, they reflect the integrated metabolic state of infected and bystander cells rather than infected cells alone.

      Line 428-429 "direction of regulation" This seems like an over-interpretation of the data. You have performed pathway analysis based on the quantity of RNA transcription detected in sequencing then imputing an interpretation of regulation. Without pulse labeling of metabolic standards or kinetic analysis of metabolite quantity, it is difficult to assert regulatory direction.

      Response: We agree with the reviewer and have revised this wording accordingly. In the revised manuscript, we avoided describing pathway-level RNA-seq enrichment as direct regulation in a mechanistic sense. Instead, we referred more cautiously to positive or negative pathway enrichment based on transcript abundance patterns, which more accurately reflects the information provided by the enrichment analysis.

      Referee cross-commenting I am in agreement with the comments and suggestions of Reviewer #2 and #3. In particular, the comment of Reviewer #3 to estimate viral replication from the RNASeq data is quite valuable to begin addressing some of the concerns about the extent of viral replication. It does not negate the need to further assess productive viral titer (PFU/mL) or the extent of viral infection (immunofluorescence or flow cytometry). I also agree with Reviewer #3 regarding the extent of mechanistic interpretation that can be drawn from the current study. This concern can largely be addressed through revision of the text and a tempering of the interpretations that are drawn. I also agree and appreciate the detailed analysis of reviewer #2 regarding the inconsistencies between the text and the figures. It is critically important to be consistent in the data and presentation of these complex experiments. Resolving these issues will only strengthen the work.

      Response: We thank the reviewer for these additional comments and for highlighting the useful points raised by Reviewers #2 and #3. In line with these suggestions, we quantified viral reads directly from the RNA-seq libraries to provide an estimate of infection burden in the sequenced samples and included infectious virus titre measurements (PFU/ml) to complement the existing replication analyses. We agree that the current dataset supports a comparative systems-level analysis rather than strong mechanistic conclusions, and we therefore tempered the interpretation throughout the manuscript. Finally, we carefully reviewed and revised the figures, legends, and associated text to ensure consistency and clarity in the presentation of the data.

      Reviewer #1 (Significance (Required)):

      The work detailed in this manuscript takes a very broad approach to identify differences in the effects of SARS-CoV-2 variant infections. Elements of this work have been published, including transcriptomics, metabolomics, and phosphoproteomics. This work is significant in that multiple variants are evaluated with comparable methods in the very relevant human nasal epithelial cell model. The use of this model, and the direct integration of multiple -omics, sets this work apart from previously published studies. This cross-omic analysis, with the IFN-treated controls, provides a robust foundation of data that can be used to detail the differences in the response to the SARS-CoV-2 variant infections. That said, a significant limitation to the study was the low MOI used to initiate infection and the lack of detailed analysis of infection progression of the different variants. Further, there is limited comparison of the IFN-treatment condition in relation to the transcriptional changes, and no inclusion of IFN-controls in the other methods. Both of these limitations undercut the potential significance of the paper and its findings. Audience: This work will be important to bench researchers interested in further characterizing and comparing the effects of SARS-CoV-2 infection. Potentially, clinicians involved in diagnostics will find utility in the study of changes for potential biomarker analysis for severe COVID19 disease. My expertise is the field of virology, having studied multiple RNA and DNA viruses, including SARS-CoV-2, to understand virus-cell interactions. My focus includes primary cell culture models of infection, proteomic and metabolic analysis of infection induced changes, and monitoring the spread of viral infection through direct and indirect measurements.

      Response: We thank the reviewer for the positive assessment of the significance of the study and for recognising the value of the integrated multi-omics analysis in a physiologically relevant human nasal epithelial cell model. We also appreciate the reviewer's constructive comments regarding infection burden and the interpretation of the IFNα reference condition. As noted above, the reported MOI of 0.01 was a typographical error and was corrected to 0.1 throughout the manuscript. To further address the reviewer's concerns regarding infection extent, we quantified viral reads directly from the RNA-seq libraries and included infectious virus titre measurements (PFU/ml) as an additional measure of productive infection. We also expanded the analysis of IFNα-treated samples to include differential expression and pathway-level comparisons, allowing more direct contextualisation of virus-induced transcriptional responses relative to a canonical interferon-stimulated programme. We believe that these revisions strengthen the interpretability and overall significance of the study.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors of the manuscript entitled "Evolutionary rewiring of host metabolism and interferon signalling by SARS-CoV-2 variants" investigated the diversity of different SARS-CoV-2 isolates regarding gene expression, kinase activity and amino acid profiles in infected primary human nasal epithelial cells. Somji et al. found certain distinct alterations of measured factors after infections compared to mock and differences in cells infected with the mentioned different SARS-CoV-2 isolates. The topic of the manuscript as such is of high importance since understanding virus host interactions in general and virus host coevolution particularly on the level of cellular metabolism and beyond comes with great potential in deeper understanding the infection biology of viral invaders. Nevertheless, the study needs to be enlarged and further defined, the experimental set up has to be improved and the drawn conclusions have to be proven by experiments. The presentation of the obtained data needs to be improved, checked and carefully chosen to allow the reader to follow the article in a much more guided way. At this stage of experimental data depth, presentation and interpretation, there is room for certain overinterpretations of the biological meanings of the presented data.

      Response: We thank Reviewer #2 for the careful evaluation of the manuscript and for recognising the relevance of the topic. We agree that the original submission required clearer presentation, stronger contextualisation of the experimental design, and more cautious interpretation. In response, we revised the figures, legends, and linked text; clarified the number of biological and technical replicates for each experiment; added viral RNA read quantification from the RNA-seq libraries and infectious virus titres (PFU/ml); expanded the IFNα-related analyses; and moderated the conclusions throughout. We believe these revisions directly address the reviewer's core concerns.

      The authors make statements about virus growth kinetics in Fig.1. To be able to do so to the full extent, virus particle counts (PFU/ml) need to be measured and included in this data set.

      Response: We agree and added the infectious virus titre measurements (PFU/ml) to complement the RT-qPCR genome measurements (Supplementary Figure 1).

      From Fig.2 on, the presentation and introduction of the data set is often very hard to follow. Certain panel labeling is not correct, e.g. in Figure 2, Figure 2A is not introduced, 72h data are linked to Figure 2C but 2C is a Venn diagram of 24h gene expression downregulation. The Venn diagrams are not mentioned in the text at all. This problem occurs on several occasions, which makes it hard or even impossible to follow the experimental flow of the study. Therefore, a complete revision of the data presentation within the figures and the linked text is needed. As a further example, in lines 213 and 224, Figure 4B is mentioned twice, with different data supposedly shown in Fig. 4B.

      Response: We thank the reviewer for identifying these issues. We comprehensively revised the figure panel labelling, figure legends, and linked text to ensure consistency and readability throughout the manuscript.

      The authors are inconsistent with including statistics in their figures. Please include all statistics in your figures to allow the reader to get this information. Please declare how often and how each experimental set has been done and clarify this, e.g. in the figure legends. In addition, please improve the figure quality to allow better cross-comparability of data sets. As an example, use the same x-axis scale for all graphs in Fig 4.

      Response: We agree and have revised the figures and legends accordingly. Statistical annotations have been added in the results section, and full values associated with the pathway enrichment analysis are now reported in Supplementary Table S2. For the amino acid measurements, individual biological replicate values are now displayed in the figure panels rather than only summary statistics. Replicate numbers and experimental design (biological replicates, technical replicates, and donor batches where relevant) are now explicitly stated in the figure legends and Methods section.

      To improve comparability across datasets, figure formatting was standardised throughout the manuscript. In particular, the x-axis scales in Figure 4 (below) were harmonised across panels to allow direct comparison of normalised enrichment scores between variants and time points. Additional adjustments were made to improve figure clarity, including consistent axis labelling, colour scales, and panel annotations.

      The authors make claims about metabolic profiles without measuring deeper metabolic circumstances. Why are only amino acids measured and not metabolite concentrations in general? Metabolic gene expression as a measurement of metabolic pathway activity can be strongly misleading, since gene expression by definition does not necessarily mean enzyme activity, which of course is ultimately important for pathway activity as well.

      Response: We agree that amino acid profiling represents a targeted metabolic readout rather than a comprehensive metabolomic analysis, and that transcript abundance does not directly equal enzyme activity or flux. We have revised the manuscript throughout to reflect this limitation more clearly and to expand the Discussion to place our targeted amino acid data and pathway enrichment analyses in the appropriate context.

      The authors need to carefully crosslink the obtained data sets. As an easy example, how much of the found differences in gene expression, pathway activities etc. is due to viral growth differences? In other words, are there regulatory differences, or are the differences seen due to different growth kinetics? Are ISG expression levels linked to virus growth? These types of questions need to be asked, and correlations need to be done by the authors to guide the reader through all the assays conducted in this study.

      Response: We agree that infection burden is an important variable when interpreting bulk multi-omics datasets obtained from infected epithelial cultures. To address this, we incorporated two additional measures of viral burden into the revised manuscript. First, SARS-CoV-2 reads were quantified directly from the RNA-seq libraries to estimate intracellular viral RNA abundance in the sequenced samples. Second, infectious virus titres (PFU ml⁻¹) were measured by plaque assay. These complementary datasets are presented in Supplementary Figure 1.

      In the revised manuscript, these measures are used to contextualise the transcriptomic and pathway analyses. Intracellular viral RNA reads increased across variants between 24 and 72 hpi and reached broadly comparable levels by 72 hpi, whereas infectious virus production differed between variants, with Delta producing the highest titres at later time points. We therefore revised the Results and Discussion to explicitly acknowledge that bulk transcriptomic, signalling and metabolic signatures may reflect both infection burden and variant-specific regulatory differences. For example, ISG induction at 72 hpi is discussed in the context of similar intracellular viral RNA levels across variants, indicating that differences in interferon-responsive transcription are not explained solely by viral RNA abundance.

      More broadly, we now emphasise that NEC cultures contain mixtures of infected and bystander epithelial cells and that the multi-omics datasets capture integrated epithelial responses rather than cell-intrinsic responses alone. These revisions strengthen the crosslinking between infection dynamics and host-response datasets while avoiding overinterpretation of variant-specific regulatory mechanisms.

      Referee cross-commenting: I do fully agree with reviewers 1 and 3 in terms of the importance of much more comprehensive data on virus growth. Measurement of real virus progeny (PFU/ml) and viral protein and RNA expression is needed to make statements about the importance of altered viral dynamics for interpreting the findings. I do fully agree with reviewers 1 and 3 that data analysis, presentation and interpretation have to be improved. Information such as how often each experiment has been done and how the experimental set-up has been constructed has to be clarified, e.g. in the figure legends. As reviewer 1 mentioned, direct analysis of metabolite concentrations is needed to be able to judge metabolic changes driven by the different SARS-CoV-2 variants. In line with both reviewers 1 and 3, conclusions drawn by the authors should be toned down. More data and improved data analysis and presentation are needed to support the conclusions drawn.

      Response: We thank the reviewer for these additional comments. In response, we added PFU/ml and RNA-seq-derived viral read data, improved experimental detail in the Methods and figure legends, clarified the scope and limitations of the amino acid measurements, and substantially moderated the interpretation throughout the manuscript.

      Reviewer #2 (Significance (Required)):

      While the topic as such is interesting and highly relevant, the manuscript has several major flaws, both with regard to paper organisation and content. In the current state it is hard to judge whether the data are of significance.

      Response: We appreciate this assessment and hope that the extensive revisions in response to the reviewers' comments make the organisation, data presentation, and significance of the study much clearer.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript entitled "Evolutionary rewiring of host metabolism and interferon signalling by SARS-CoV-2 variants" by Somji and colleagues sets out to understand SARS-CoV-2 variant biology in primary nasal epithelial cells. Understanding this and differences in variant-specific host-virus interactions is essential to understand the molecular basis of replication advantages and enhanced transmission that ultimately lead to variant dominance. The authors employ global transcriptomic, phosphoproteomic and amino acid metabolism assays with the aim to identify variant-specific changes to cell metabolism and innate immune activation in a comparative systems-level approach. Importantly, this work is performed in primary nasal epithelial cells. It is essential to understand variant biology in the context of relevant primary cell infection models and NECs are a great choice to address the proposed research question. The work is conceptually interesting, but largely descriptive. While this can still be useful for the field, it requires appropriate framing of the interpretations of the data. I agree with the authors that there will be virus-specific signatures that will contribute to variant fitness, but this dataset makes it hard to draw strong conclusions. The main problem with the manuscript and the interpretation is the dramatic differences in viral replication. While some of the conclusions are tantalising and would warrant further investigation, I would expect to see some experimental validation to substantiate the interpretation. In the absence of experimental validations and mechanism, the conclusions should be stated as such and contextualised more with previously published work.

      Response: We thank Reviewer #3 for the thoughtful and constructive assessment. We agree that the study is primarily comparative and systems-level in nature and that the original submission overreached in parts of the interpretation. In response, we moderated the conclusions and reframed the manuscript as a comparative analysis of lineage-associated host-response signatures that generates mechanistic hypotheses for future work, rather than claiming definitive causal mechanisms. We also added additional data on viral burden, revised the analysis description, improved figure presentation, and expanded the contextualisation with previously published work.

      A major concern that I have is the analysis of the RNASeq data. Experimental design, analysis and presented data require some clarification: Too little experimental detail for the RNASeq data is given. How many replicates were sequenced/analysed? The figure legend states three independent experiments - but how many individual replicate transwells per condition (and NEC batch) were used? This information needs to be included in the manuscript. Generally, clarification on how many replicates were used per experiment needs to be included in the figure legends for all data panels.

      Response: We thank the reviewer for highlighting the need for clearer reporting of RNA-seq experimental design and replicate structure. We have revised the manuscript to explicitly clarify replicate numbers, experimental batches, and sequencing quality control. RNA-seq experiments were performed using three independent batches of donor-pooled nasal epithelial cultures (MucilAir™). For most infection conditions and time points, two to three biological replicate transwells were sequenced per condition derived from independent NEC culture batches. A small number of libraries did not pass sequencing quality control thresholds (e.g. insufficient sequencing depth or technical library failure) and were therefore excluded from downstream analysis, resulting in minor variation in replicate numbers across conditions. To improve transparency, sequencing depth and library quality metrics for all RNA-seq libraries are now provided in Supplementary Table S6. In addition, principal component analysis (PCA) of the RNA-seq dataset has been included as Supplementary Figure S5 to illustrate sample clustering and replicate consistency across conditions. All raw sequencing data, sample metadata, and replicate information are available in the GEO repository under accession number GSE271378. We have also revised the Methods and figure legends throughout the manuscript to explicitly state replicate numbers for each dataset.

      The reported log2 fold changes are beyond what is biologically reasonable. A log2 fold change of 120 or even 30 (Fig.3D, suppl table) indicates issues with the data analysis. It is worth revisiting the analysis and additionally inclusion of some QC data would be helpful (e.g. PCA of the data). Furthermore, viral genome data should be extracted from the RNASeq data to give an indication of infection levels in the relevant samples rather than just relying on a representative graph (Fig.1B).

      Response: Extremely large log₂ fold-change values can arise in RNA-seq analyses when strongly inducible genes are compared to control samples with very low baseline expression. This is common for interferon-stimulated genes (ISGs), which are often undetectable or expressed at near-background levels in mock samples but become highly induced following interferon signalling or viral infection. Similar magnitudes of induction have been reported in transcriptomic studies of interferon responses and SARS-CoV-2 infection of NEC (e.g. Hatton et al., 2021 (PMID: 34876592); Ziegler et al., 2021 (PMID: 34352228); Sharif-Askari et al. (PMID: 36415751); and others).
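
      As an illustrative sketch only (not the authors' analysis code), the sensitivity of a log₂ fold change to a near-zero baseline can be shown with plain arithmetic. The counts and the pseudocount of 1 are hypothetical values chosen for illustration, and `log2_fold_change` is a made-up helper name.

```python
import math


def log2_fold_change(treated, control, pseudocount=0.0):
    """log2 ratio of treated to control after adding a pseudocount to both."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical normalised counts for one strongly induced gene.
treated = 5000.0
for control in (100.0, 1.0, 0.01):
    raw = log2_fold_change(treated, control)
    damped = log2_fold_change(treated, control, pseudocount=1.0)
    print(f"control={control:>6}: log2FC={raw:6.2f}, with pseudocount={damped:6.2f}")
```

      The same treated count gives a larger log₂ fold change as the control value approaches zero, and adding a pseudocount to both values damps that effect.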

      To improve clarity, we have revisited the analysis and revised the visualisation of the RNA-seq data. Plotting scales and figure annotations have been adjusted to avoid misleading representation of extreme fold changes. In addition, we have included additional quality-control information for the RNA-seq dataset. Principal component analysis (PCA) of the RNA-seq samples has been added as Supplementary Figure S5 to illustrate sample clustering and replicate consistency, and sequencing depth and quality metrics for all libraries are provided in Supplementary Table S6.

      As suggested by the reviewer, we also quantified viral genome reads directly from the RNA-seq libraries to estimate intracellular viral RNA abundance in the sequenced samples. These data are now presented in Supplementary Figure 1 and discussed in the Results to contextualise infection levels across conditions alongside the extracellular viral RNA measurements shown in Figure 1B.

      Please include virus replication data for all experiments. Only one replication graph is shown (Fig. 1B), but infection level/virus release should be reported for every assay as responses will of course be dependent on how much virus/how many infected cells are present. A difficulty in understanding variant-specific host responses in comparative approaches is differences in infection levels. In line with other published work, Fig. 1B shows dramatic differences in variant replication. The differences measured at 1 hpi indicate issues with input normalisation; this will have a knock-on effect for later replication and ultimately will further increase differences in infected cell counts. L340-342 "These transcriptional shifts occurred despite broadly comparable viral loads across lineages at 24-72 hpi, suggesting that replication level alone does not account for the observed metabolic divergence." - I strongly disagree with this interpretation. The viral loads are clearly not comparable. A 2 log10 difference in virus release is a large difference that will affect the comparison of host response. These replication differences are to be expected and have been previously reported by others. Ancestral variants infect fewer cells compared to Omicron variants. This needs to be acknowledged. In a bulk RNASeq/phosphoproteomic/metabolic measurement the number of infected and uninfected bystander cells across variants will inevitably result in the identification of at least some host responses that correlate with infection levels rather than with specific biology exploited by a variant. The authors must acknowledge this and discuss the contribution of infected vs bystander cells.

      Response: We thank the reviewer for this important point. All downstream analyses in this study (RNA-seq, phosphoproteomics and amino acid profiling) were performed on matched cultures from the same infection experiment; therefore, the replication kinetics shown in Figure 1B represent the infection conditions for all assays. We clarified this explicitly in the Methods and figure legends.

      We agree that differences in viral replication across variants are important when interpreting host responses. The statement suggesting broadly comparable viral loads was removed. We also included PFU/ml measurements and quantified viral reads extracted from the RNA-seq libraries to provide additional estimates of infection burden. Finally, we expanded the Discussion to acknowledge that bulk omics measurements reflect a mixture of infected and bystander cells and that some observed host responses may partly correlate with differences in infection levels across variants.

      Include individual data points to show the spread of the data overall (Fig 6A). Just showing the mean without an indication of how many measurements were taken and the variation in the data makes it hard for the reader to interpret the data.

      Response: We agree that the variability across replicates should be indicated. We added individual data points and error bars to Figure 6A (below) and clarified the replicate structure in the figure legend and Methods. Amino acid measurements were performed using four biological replicates per condition, each processed in duplicate technical measurements that were averaged prior to statistical analysis.
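
      A minimal sketch of the replicate handling described above, assuming hypothetical values: technical duplicates are first averaged per biological replicate, and only the resulting four values enter the summary statistics. The numbers and the `summarise` helper are illustrative and are not taken from the manuscript.

```python
from statistics import mean, stdev

# Hypothetical values (nmol per 10^6 cell equivalents): four biological
# replicates for one amino acid and condition, each measured in duplicate.
duplicates = [(10.0, 12.0), (9.0, 11.0), (13.0, 11.0), (10.0, 10.0)]


def summarise(duplicate_pairs):
    """Average technical duplicates, then summarise the biological replicates."""
    biological = [mean(pair) for pair in duplicate_pairs]
    return biological, mean(biological), stdev(biological)


biological, centre, spread = summarise(duplicates)
print(biological)  # one value per biological replicate, plotted as individual points
print(centre, spread)
```

      Averaging the duplicates first keeps the sample size at the number of biological replicates, so the technical measurements do not inflate n in the statistical test.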

      The choice of 24h for the amino acid abundance analysis needs some further justification. At 24h, some variants will only have infected very few cells. What would this mean for a bulk measurement? Do the authors suggest that there were changes to aa-metabolism in uninfected bystander cells? Would true differences in aa-metabolism in the infected cells be masked by the surrounding uninfected cells?

      Response: We thank the reviewer for this important point. We selected 24 hpi to capture early metabolic responses, in parallel with the phosphoproteomic analyses, before the later-stage transcriptional divergence became dominant. We agree, however, that at this timepoint the amino acid measurements likely reflect the integrated state of both infected and bystander cells within the cultures. We clarified this explicitly in the revised manuscript and discussed this as an important limitation of the bulk metabolite measurements.

      The framing of Alpha and Beta as pre-Omicron is confusing. IC19 and Delta are both equally pre-Omicron variants. Please consider rewording.

      Response: We agree that this terminology is potentially confusing. In the revised manuscript, we used more precise lineage descriptions throughout, distinguishing IC19 as the reference/early strain, Alpha and Beta as earlier VOCs, Delta as a separate later pre-Omicron VOC, and BA.1/BA.5 as Omicron subvariants.

      The Venn diagram in Fig. 2B/C is hard to interpret. How were the percentages calculated? From the total number of DEG across all variants? If so, this would inflate the proportion attributed to the conditions that showed the largest number of DEG genes and shrink the proportion for the conditions with less signal. An UpSet plot might be a better choice to represent the data.

      Response: We thank the reviewer for this helpful suggestion. The overlap values in Fig. 2B-C were generated using InteractiVenn, which calculates set intersections and reports them as percentages relative to the total union of differentially expressed genes across all variants at the respective time point. We clarified this explicitly in the figure legend and Methods. We agree that multi-set Venn diagrams can be difficult to interpret when DEG set sizes differ substantially, and we revised the figure legends and associated text to improve clarity of presentation.
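
      The percentage calculation described in this response can be sketched as follows. This is not InteractiVenn's own code; it is a small stand-in that assumes each Venn region is counted exclusively and divided by the size of the union. The gene identifiers and set sizes are hypothetical.

```python
def overlap_percentages(deg_sets):
    """Percentage of the union of all DEG sets that falls in each exclusive region."""
    union = set().union(*deg_sets.values())
    regions = {}
    for gene in union:
        # A region is identified by the exact combination of sets containing the gene.
        members = frozenset(name for name, genes in deg_sets.items() if gene in genes)
        regions[members] = regions.get(members, 0) + 1
    return {members: 100.0 * count / len(union) for members, count in regions.items()}


# Hypothetical DEG sets of unequal size.
deg_sets = {
    "Delta": {"G1", "G2", "G3", "G4", "G5", "G6"},
    "BA.1": {"G1", "G2", "G7"},
    "IC19": {"G1", "G8"},
}
percentages = overlap_percentages(deg_sets)
for members, pct in sorted(percentages.items(), key=lambda item: -item[1]):
    print(sorted(members), f"{pct:.1f}%")
```

      With these toy sets the largest DEG set accounts for half of the union on its own, which is the interpretive difficulty raised in the reviewer's comment when set sizes differ substantially.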

      The interpretation of the data as presented requires more mechanistic validation. As it stands, activation of metabolic pathways, or the contribution of the observed phospho changes to variant biology, is not functionally linked to infection outcome. In the absence of more experimental data, the conclusions should be toned down. (For example L330-332 "These patterns suggest that Omicron can replicate despite ongoing cytokine signalling, whereas Delta infection favours stress- and growth-linked pathways to sustain replication.")

      Response: We agree and substantially toned down these statements. The revised manuscript presents these data as comparative host-response signatures rather than mechanistically validated pathways driving infection outcome.

      L440-442 "Similarly, replicate-level variability and confidence intervals for NES values were not plotted, as the scores reflect ranked enrichment rather than absolute expression magnitude." - What do the authors mean by replicate-level variability? I assume the NES was calculated based on fold change which are not replicate-level?

      Response: We thank the reviewer for pointing out this lack of clarity. The previous wording referring to "replicate-level variability" was removed. We now clarify that NES values were calculated from ranked differential expression outputs, with nominal p-values estimated by permutation and FDR-adjusted p-values reported in Supplementary Table S2, together with leading-edge genes for each pathway, variant and time point.

      Differences in Oxphos have been reported by others (https://www.sciencedirect.com/science/article/pii/S2589004224012343 ). This study and others should be included in the discussion.

      Response: We thank the reviewer for highlighting this study. We have now included this in the Discussion to place our findings in the context of previous studies of SARS-CoV-2 infection in nasal epithelial cultures.

      Can the authors speculate whether the innate immune response observed links to the metabolic changes reported?

      Response: While our study does not directly establish a causal link between innate immune activation and metabolic rewiring, interferon signalling is known to influence cellular metabolism during viral infection. In our dataset, IFNα-treated cultures showed strong ISG induction but minimal enrichment of the metabolic pathways analysed here (new Supplementary Figure 4), suggesting that interferon signalling alone does not fully account for the metabolic signatures observed during SARS-CoV-2 infection. These observations support the idea that the metabolic changes detected likely reflect a combination of viral replication demands and host antiviral signalling rather than interferon activation alone. We have added a brief clarification in the Discussion to acknowledge this relationship.

      Overall, in the discussion the data should be contextualised with results from other studies. Particularly work focussing on primary airway epithelial cells and variant infections.

      Response: We agree and expanded the Discussion to better contextualise our results within the existing literature, particularly studies in primary airway epithelial models.

      Please provide more detail on how the merged NES was calculated for Alpha/Beta and BA.1/BA.5. For Fig. 4, either the merged or the unmerged NES data would be sufficient, rather than including both analyses. Enrichment of pathways would benefit from indicating which genes associated have been detected and how they functionally might contribute.

      Response: We thank the reviewer for this helpful suggestion. In the revised manuscript, we simplified the presentation of the pathway enrichment analysis by focusing primarily on the variant-level NES profiles (Figure 4A), while retaining the grouped lineage visualisation (Figure 4B) only as a simplified overview. The merged NES values were calculated by averaging the normalised enrichment scores of the corresponding variants within each lineage group (Alpha/Beta for pre-Omicron and BA.1/BA.5 for Omicron). To improve interpretability, we now report the leading-edge genes contributing to each enriched pathway in Supplementary Table S2.
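
      The merged-score calculation described above (averaging the NES of the variants within each lineage group) can be sketched as below. The pathway names and NES values are hypothetical, and `merge_nes` is an illustrative helper, not the authors' code.

```python
from statistics import mean

# Lineage groups as named in the response.
GROUPS = {"Alpha/Beta": ("Alpha", "Beta"), "BA.1/BA.5": ("BA.1", "BA.5")}


def merge_nes(nes_by_variant, groups=GROUPS):
    """Average per-pathway NES across the variants that make up each group."""
    merged = {}
    for group, members in groups.items():
        merged[group] = {
            pathway: mean(nes_by_variant[variant][pathway] for variant in members)
            for pathway in nes_by_variant[members[0]]
        }
    return merged


# Hypothetical NES values for two pathways.
nes = {
    "Alpha": {"OXPHOS": 1.5, "IFN_ALPHA": 2.0},
    "Beta": {"OXPHOS": 0.5, "IFN_ALPHA": 1.0},
    "BA.1": {"OXPHOS": -1.0, "IFN_ALPHA": 2.5},
    "BA.5": {"OXPHOS": -2.0, "IFN_ALPHA": 1.5},
}
merged = merge_nes(nes)
print(merged)
```

      A simple mean treats both variants in a group equally and can mask opposite-signed scores within a group, which is one reason the variant-level profiles remain the primary presentation.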

      Please include how ISGs were defined for the analysis of Fig. 3.

      Response: In our analysis, interferon-stimulated genes (ISGs) were defined based on the transcriptional response of nasal epithelial cells to IFN-α stimulation, which served as a benchmark condition for interferon-responsive gene expression. Genes significantly up-regulated in IFN-α-treated samples relative to mock controls were used to define the ISG set analysed in Fig. 3. We clarified this definition and the selection criteria in the Methods and figure legend.
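
      A sketch of how such an ISG set could be derived from an IFN-α versus mock comparison. The response does not state the selection criteria, so the thresholds used here (adjusted p < 0.05 and log₂ fold change > 1) are assumptions for illustration only, as are the gene labels and values.

```python
def define_isg_set(de_results, max_padj=0.05, min_log2fc=1.0):
    """Return genes significantly up-regulated in IFN-alpha-treated vs mock samples.

    de_results maps gene -> (log2 fold change, adjusted p-value).
    The default thresholds are assumed, not taken from the manuscript.
    """
    return {
        gene
        for gene, (log2fc, padj) in de_results.items()
        if padj < max_padj and log2fc > min_log2fc
    }


# Hypothetical differential expression output (IFN-alpha vs mock).
ifn_vs_mock = {
    "GENE_A": (6.2, 1e-10),  # strongly induced and significant: kept
    "GENE_B": (0.4, 0.001),  # significant but below the fold-change cut-off
    "GENE_C": (3.1, 0.2),    # induced but not significant
    "GENE_D": (-2.0, 1e-5),  # significantly down-regulated
}
isg_set = define_isg_set(ifn_vs_mock)
print(sorted(isg_set))
```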

      Please clarify for each experiment how many replicates/measurements were taken. This information should be included in the figure legend. If data/or measurements were excluded, this should also be highlighted. From the supplementary data (amino acid data; qPCR vs RNASeq) there seems to be variation in the amount of reported measurements (aa-metabolism: 7 vs 8 measurements; RNASeqvsqPCR: 37 vs 39 measurements).

      Response: We agree and have revised the Methods, figure legends, and supplementary information to clarify replicate numbers and any exclusions.

      Fig. 3D: The colour scaling used for log2FC is unbalanced. Consider using different gradings.

      Response: We thank the reviewer for this suggestion. The colour scale in Fig. 3D was intentionally asymmetric because several IFNα-responsive genes show extremely large log₂ fold changes due to very low baseline expression in mock samples. Using a symmetric colour scale would compress the dynamic range of the virus-infected conditions and obscure biologically meaningful differences between variants. To avoid confusion, we clarified the rationale for the colour scaling in the figure legend and ensured that the scale is clearly labelled.

      In the NES analysis, I would expect an indication of the leading edge in the figure or in the supplementary data.

      Response: We agree. Relevant leading-edge information has been added to Supplementary Table S2 and is referenced in the revised Methods and Results.

      Several figures would benefit from inclusion of p-values/indication of significance (Fig. 3D, 5B, 6A/C).

      Response: We agree and have added statistical information where appropriate, including in the supplementary material.

      Fig. 6D requires some more explanation as to what is shown in the figure. Statistics should be included to confirm that there are no overall differences between conditions.

      Response: We agree and expanded the explanation of Figure 6D. Amino acid levels were normalised to the total intracellular amino acid pool within each condition to evaluate proportional composition independent of total abundance. We also included statistical analysis of the normalised amino acid proportions using a Friedman test, which detected modest but significant differences across conditions (χ² = 15.33, p = 0.004). These differences reflect small shifts in a limited number of amino acids rather than major changes in overall amino acid composition. The statistical analysis and clarification have been added to the Results, Figure 6 legend, and Supplementary Table S4.
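
      The proportional normalisation described above can be sketched as follows, assuming hypothetical abundances for three amino acids. Each value is divided by the total pool of its own condition, so composition can be compared independently of total abundance. The statistical test itself is not reproduced here.

```python
def to_proportions(abundances):
    """Express each amino acid as a fraction of the condition's total pool."""
    total = sum(abundances.values())
    return {amino_acid: value / total for amino_acid, value in abundances.items()}


# Hypothetical abundances (nmol per 10^6 cell equivalents).
conditions = {
    "mock": {"Gln": 40.0, "Glu": 40.0, "Ser": 20.0},
    "variant_X": {"Gln": 20.0, "Glu": 20.0, "Ser": 10.0},
}
proportions = {name: to_proportions(values) for name, values in conditions.items()}
print(proportions)
```

      In this toy case the second condition has half the total abundance but an identical composition, so the normalised proportions match even though the absolute values differ.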

      L266-269 "All amino acid measurements were expressed as nmol per 10⁶ viable, counted cells, and viability at 24 hpi was comparable across conditions, indicating that differences in abundance reflect infection-driven metabolic changes rather than variation in cell number." - Data should be included.

      Response: We thank the reviewer for this comment. In differentiated air-liquid interface nasal epithelial cultures, cells form a structured epithelium attached to the transwell membrane and cannot be routinely counted without dissociation of the insert, which would disrupt the culture and preclude subsequent metabolic analysis. For this reason, individual experimental inserts used for amino acid measurements were not dissociated. Instead, representative inserts were dissociated to verify epithelial cell numbers. Dissociation of one mock control and one IC19-infected insert yielded comparable counts of 0.9-1.10 × 10⁶ epithelial cells per insert, confirming that each transwell contains approximately 10⁶ epithelial cells. Amino acid measurements were therefore normalised to epithelial input and reported as nmol per 10⁶ cell equivalents. The manuscript text was revised to clarify this normalisation and avoid implying routine viable cell counting of each insert.

      L270-273: "The variant-amino acid interaction network (Figure 6B) visualises these differences by linking each variant to its most strongly altered amino acids. Edge width reflects the absolute log fold change, and colour indicates direction (red for increases, blue for decreases relative to mock)." The network figure does not add any additional information that is not already contained in Fig. 6C. Consider removing this panel.

      Response: We thank the reviewer for this suggestion. While the quantitative differences in amino acid abundance are shown in Fig. 6C, the network representation in Fig. 6B was included to highlight variant-metabolite relationships and to visualise which amino acids show the strongest associations with individual viral lineages. This representation facilitates comparison of the pattern of metabolic alterations across variants rather than only their magnitude. To avoid redundancy, we clarified this purpose in the figure legend and streamlined the figure presentation.

      For Fig. 6C: The colour scale for the legend is imbalanced starting at -1 with a mid point at 0 and the max at 0.35.

      Response: We thank the reviewer for noting this point. The colour scale reflects the observed range of log₂ fold changes in the dataset, where decreases in amino acid abundance were larger in magnitude than increases. As a result, the scale is asymmetric. To avoid confusion, we clarified the colour scale in the figure legend and ensured that it is clearly labelled to reflect the underlying data distribution.

      L428-41 "Dot colour indicated the direction of regulation (red, up-regulated; blue, down-regulated), and dot size was proportional to the absolute NES value. Vertical reference lines at NES = 0 were included to indicate neutral enrichment." This does not describe the data that is presented in the figure.

      Response: We thank the reviewer for pointing out that the previous description did not accurately reflect the graphical representation. Figure 4 has been revised to clarify how pathway enrichment is displayed. Dot position now represents the normalised enrichment score (NES) on a common scale across all panels, and dot colour indicates the direction of enrichment (red = positive enrichment, blue = negative enrichment). A shaded central region highlights limited enrichment around NES = 0, and the scale at the bottom indicates thresholds used to categorise moderate and strong enrichment. The figure legend and Methods description have been updated accordingly.

      Reviewer #3 (Significance (Required)):

      My criticisms are in part outlined above. While the central question of the study is important and timely, the data reported is largely incremental and lacks mechanistic insight.

      Response: We thank the reviewer for this candid assessment. We agree that the present study is not a mechanistic dissection of individual pathways, but rather a comparative systems-level analysis of lineage-associated host-response patterns in a physiologically relevant NEC model. We have revised the manuscript to better reflect this scope and to avoid overstatement of mechanistic inference.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript entitled "Evolutionary rewiring of host metabolism and interferon signalling by SARS-CoV-2 variants" by Somji and colleagues sets out to understand SARS-CoV-2 variant biology in primary nasal epithelial cells. Understanding this and differences in variant-specific host-virus interactions is essential to understand the molecular basis of replication advantages and enhanced transmission that ultimately lead to variant dominance. The authors employ global transcriptomic, phosphoproteomic and amino acid metabolism assays with the aim to identify variant-specific changes to cell metabolism and innate immune activation in a comparative systems-level approach. Importantly, this work is performed in primary nasal epithelial cells. It is essential to understand variant biology in the context of relevant primary cell infection models and NECs are a great choice to address the proposed research question.

      The work is conceptually interesting, but largely descriptive. While this can still be useful for the field, it requires appropriate framing of the interpretations of the data. I agree with the authors that there will be virus-specific signatures that will contribute to variant fitness, but this dataset makes it hard to draw strong conclusions. The main problem with the manuscript and the interpretation is the dramatic differences in viral replication. While some of the conclusions are tantalising and would warrant further investigation, I would expect to see some experimental validation to substantiate the interpretation. In the absence of experimental validations and mechanism, the conclusions should be stated as such and contextualised more with previously published work.

      Major:

      1. A major concern that I have is the analysis of the RNASeq data. Experimental design, analysis and presented data require some clarification: Too little experimental detail for the RNASeq data is given. How many replicates were sequenced/analysed? The figure legend states three independent experiments - but how many individual replicate transwells per condition (and NEC batch) were used? This information needs to be included in the manuscript. Generally, clarification on how many replicates were used per experiment needs to be included in the figure legends for all data panels.

      2. The reported log2 fold changes are beyond what is biologically reasonable. A log2 fold change of 120 or even 30 (Fig.3D, suppl table) indicates issues with the data analysis. It is worth revisiting the analysis and additionally inclusion of some QC data would be helpful (e.g. PCA of the data). Furthermore, viral genome data should be extracted from the RNASeq data to give an indication of infection levels in the relevant samples rather than just relying on a representative graph (Fig.1B).

      3. Please include virus replication data for all experiments. Only one replication graph is shown (Fig. 1B), but infection level/virus release should be reported for every assay as responses will of course be dependent on how much virus/how many infected cells are present. A difficulty in understanding variant-specific host responses in comparative approaches is differences in infection levels. In line with other published work, Fig. 1B shows dramatic differences in variant replication. The differences measured at 1 hpi indicate issues with input normalisation; this will have a knock-on effect for later replication and ultimately will further increase differences in infected cell counts. L340-342 "These transcriptional shifts occurred despite broadly comparable viral loads across lineages at 24-72 hpi, suggesting that replication level alone does not account for the observed metabolic divergence." - I strongly disagree with this interpretation. The viral loads are clearly not comparable. A 2 log10 difference in virus release is a large difference that will affect the comparison of host response. These replication differences are to be expected and have been previously reported by others. Ancestral variants infect fewer cells compared to Omicron variants. This needs to be acknowledged. In a bulk RNASeq/phosphoproteomic/metabolic measurement, the differing numbers of infected and uninfected bystander cells across variants will inevitably result in the identification of at least some host responses that correlate with infection levels rather than with specific biology exploited by a variant. The authors must acknowledge this and discuss the contribution of infected vs bystander cells.

      4. Include individual data points to show the spread of the data overall (Fig 6A). Just showing the mean without an indication of how many measurements were taken and the variation in the data makes it hard for the reader to interpret the data.

      5. The choice of 24h for the amino acid abundance analysis needs some further justification. At 24h, some variants will only have infected very few cells. What would this mean for a bulk measurement? Do the authors suggest that there were changes to aa-metabolism in uninfected bystander cells? Would true differences in aa-metabolism in the infected cells be masked by the surrounding uninfected cells?

      6. The framing of Alpha and Beta as pre-Omicron is confusing. IC19 and Delta are both equally pre-Omicron variants. Please consider rewording.

      7. The Venn diagram in Fig. 2B/C is hard to interpret. How were the percentages calculated? From the total number of DEG across all variants? If so, this would inflate the proportion attributed to the conditions that showed the largest number of DEG genes and shrink the proportion for the conditions with less signal. An UpSet plot might be a better choice to represent the data.

      8. The interpretation of the data as presented requires more mechanistic validation. As it stands, activation of metabolic pathways, or the contribution of the observed phospho changes to variant biology, is not functionally linked to infection outcome. In the absence of more experimental data, the conclusions should be toned down. (For example L330-332 "These patterns suggest that Omicron can replicate despite ongoing cytokine signalling, whereas Delta infection favours stress- and growth-linked pathways to sustain replication.")

      9. L440-442 "Similarly, replicate-level variability and confidence intervals for NES values were not plotted, as the scores reflect ranked enrichment rather than absolute expression magnitude." - What do the authors mean by replicate-level variability? I assume the NES was calculated based on fold change which are not replicate-level?

      10. Differences in Oxphos have been reported by others (https://www.sciencedirect.com/science/article/pii/S2589004224012343 ). This study and others should be included in the discussion.

      11. Can the authors speculate whether the innate immune response observed links to the metabolic changes reported?
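
      To put Major point 2 in numbers, the short sketch below converts a log2 fold change back to a linear fold change. It is only an illustration of the arithmetic behind the concern, not a reconstruction of the authors' pipeline; the function name `linear_fold` is a hypothetical helper introduced here.

```python
def linear_fold(log2_fc: float) -> float:
    """Convert a log2 fold change into the equivalent linear fold change."""
    return 2.0 ** log2_fc


if __name__ == "__main__":
    # 5 is an arbitrary comparison value; 30 and 120 are the values
    # questioned in Major point 2.
    for lfc in (5, 30, 120):
        print(f"log2FC = {lfc:>3} -> {linear_fold(lfc):.3g}-fold")
```

      A log2 fold change of 30 already corresponds to more than a billion-fold difference, and 120 to a factor on the order of 10^36, which is why values of this size point to an analysis artefact rather than to biology.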

      Minor:

      1. Overall, in the discussion the data should be contextualised with results from other studies. Particularly work focussing on primary airway epithelial cells and variant infections.

      2. Please provide more detail on how the merged NES was calculated for Alpha/Beta and BA.1/BA.5. For Fig. 4, either the merged or the unmerged NES data would be sufficient, rather than including both analyses. Enrichment of pathways would benefit from indicating which genes associated have been detected and how they functionally might contribute.

      3. Please include how ISGs were defined for the analysis of Fig. 3.

      4. Please clarify for each experiment how many replicates/measurements were taken. This information should be included in the figure legend. If data or measurements were excluded, this should also be highlighted. From the supplementary data (amino acid data; qPCR vs RNASeq) there seems to be variation in the number of reported measurements (aa-metabolism: 7 vs 8 measurements; RNASeq vs qPCR: 37 vs 39 measurements).

      5. Fig. 3D: The colour scaling used for log2FC is unbalanced. Consider using different gradings.

      6. In the NES analysis, I would expect an indication of the leading edge in the figure or in the supplementary data.

      7. Several figures would benefit from inclusion of p-values/indication of significance (Fig. 3D, 5B, 6A/C).

      8. Fig. 6D requires some more explanation as to what is shown in the figure. Statistics should be included to confirm that there are no overall differences between conditions.

      9. L266-269 "All amino acid measurements were expressed as nmol per 10⁶ viable, counted cells, and viability at 24 hpi was comparable across conditions, indicating that differences in abundance reflect infection-driven metabolic changes rather than variation in cell number." - Data should be included.

      10. L270-273: "The variant-amino acid interaction network (Figure 6B) visualises these differences by linking each variant to its most strongly altered amino acids. Edge width reflects the absolute log₂ fold change, and colour indicates direction (red for increases, blue for decreases relative to mock)." The network figure does not add any additional information that is not already contained in Fig. 6C. Consider removing this panel.

      11. For Fig. 6C: The colour scale for the legend is imbalanced starting at -1 with a mid point at 0 and the max at 0.35.

      12. L428-41 "Dot colour indicated the direction of regulation (red, up-regulated; blue, down-regulated), and dot size was proportional to the absolute NES value. Vertical reference lines at NES = 0 were included to indicate neutral enrichment." This does not describe the data that is presented in the figure.
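
      One possible remedy for the unbalanced colour scales raised in Minor points 5 and 11 is to make the limits symmetric around zero, so that equal magnitudes of increase and decrease receive equal colour intensity. The sketch below is a minimal illustration under that assumption; `symmetric_limits` is a hypothetical helper, and the input values are the scale endpoints quoted in Minor point 11 rather than data from the manuscript.

```python
def symmetric_limits(values):
    """Return (-m, m), where m is the largest absolute value in `values`.

    Using these as the colour-scale limits places 0 exactly mid-scale.
    """
    m = max(abs(v) for v in values)
    return (-m, m)


if __name__ == "__main__":
    # Endpoints of the Fig. 6C legend as described in Minor point 11.
    print(symmetric_limits([-1.0, 0.0, 0.35]))
```

      The returned pair could then be supplied as the lower and upper colour limits of a diverging colour map in whichever plotting tool is used.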

      Significance

      My criticisms are in part outlined above. While the central question of the study is important and timely, the data reported is largely incremental and lacks mechanistic insight.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors of the manuscript entitled "Evolutionary rewiring of host metabolism and interferon signalling by SARS-CoV-2 variants" investigated the diversity of different SARS-CoV-2 isolates regarding gene expression, kinase activity and amino acid profiles in infected primary human nasal epithelial cells. Somji et al. found certain distinct alterations of measured factors after infections compared to mock and differences in cells infected with the mentioned different SARS-CoV-2 isolates.

      The topic of the manuscript as such is of high importance since understanding virus host interactions in general and virus host coevolution particularly on the level of cellular metabolism and beyond comes with great potential in deeper understanding the infection biology of viral invaders.

      Nevertheless, the study needs to be enlarged and further defined, the experimental set up has to be improved and the drawn conclusions have to be proven by experiments. The presentation of the obtained data needs to be improved, checked and carefully chosen to allow the reader to follow the article in a much more guided way. At this stage of experimental data depth, presentation and interpretation, there is room for certain overinterpretations of the biological meanings of the presented data.

      Please find a detailed list of comments for the consideration of the authors below.

      1. The authors make statements about virus growth kinetics in Fig. 1. To do so to the full extent, infectious virus titres (PFU/ml) need to be measured and included in this data set.

      2. From Fig. 2 on, the presentation and introduction of the data set is often very hard to follow. Certain panel labelling is not correct: e.g. in Figure 2, Figure 2A is not introduced, and 72h data are linked to Figure 2C, but 2C is a Venn diagram of 24h gene expression downregulation. The Venn diagrams are not mentioned in the text at all. This problem occurs on several occasions, which makes it hard or even impossible to follow the experimental flow of the study. Therefore, a complete revision of the data presentation within the figures and the linked text is needed. As a further example, in lines 213 and 224, Figure 4B is mentioned twice, with different data supposedly shown in Fig. 4B.

      3. The authors are inconsistent with including statistics in their figures. Please include all statistics in your figures to give the reader this information. Please state how often and how each experimental set was performed, e.g. in the figure legends. In addition, please improve the figure quality to allow better cross-comparison of data sets. For example, use the same x-axis scale for all graphs in Fig. 4.

      4. The authors make claims about metabolic profiles without measuring the underlying metabolic state in more depth. Why are only amino acids measured and not metabolite concentrations in general? Metabolic gene expression as a measure of metabolic pathway activity can be strongly misleading, since gene expression by definition does not necessarily reflect enzyme activity, which is ultimately what determines pathway activity.

      5. The authors need to carefully crosslink the obtained data sets. As an easy example, how much of the observed difference in gene expression, pathway activity etc. is due to viral growth differences? In other words, are there regulatory differences, or are the differences seen due to different growth kinetics? Are ISG expression levels linked to virus growth? These types of questions need to be asked, and correlations need to be done by the authors to guide the reader through all the assays conducted in this study.
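
      The kind of correlation requested in point 5 could be sketched as a rank correlation between a per-sample measure of viral load and a per-sample ISG score. The code below is a minimal, dependency-free illustration of Spearman's rho; the function names are hypothetical helpers, and which viral-load and ISG measures to feed in is left open, since the manuscript's data are not reproduced here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)
```

      A rho close to 1 between viral load and ISG score across samples would suggest that the ISG differences largely track infection level, whereas a weak correlation would leave more room for variant-specific regulation.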

      Referee cross-commenting

      I fully agree with reviewers 1 and 3 on the importance of much more comprehensive data on virus growth. Measurement of actual virus progeny (PFU/ml) and of viral protein and RNA expression is needed to make statements about how differing viral dynamics affect the interpretation of the findings.

      I fully agree with reviewers 1 and 3 that data analysis, presentation and interpretation have to be improved. Information such as how often each experiment was performed and how the experimental set-up was constructed has to be clarified, e.g. in the figure legends.

      As reviewer 1 mentioned, direct analysis of metabolite concentrations is needed to judge the metabolic changes driven by the different SARS-CoV-2 variants.

      In line with both reviewers 1 and 3, the conclusions drawn by the authors should be toned down. More data and improved data analysis and presentation are needed to support the conclusions drawn.

      Significance

      While the topic as such is interesting and highly relevant, the manuscript has several major flaws, both with regard to paper organisation and content. In the current state it is hard to judge whether the data are of significance.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In the submitted work, authors seek to understand the transcriptional and metabolic changes induced by different variants of SARS-CoV-2 infection. They employ a model of pooled, primary nasal epithelial cells (NEC) differentiated within an air-liquid-interface. Subsequently, cultures are infected with isolates representing key variants of SARS-CoV-2 from initial D614G, Alpha, Beta, Delta, and Omicron. Following initial characterization, authors compare transcriptional changes at 24 and 72 hours post infection. Analysis focuses on differentially expressed genes, upregulated Interferon Stimulated genes, and transcripts within known metabolic pathways. Subsequently, infected cultures are also analyzed by phosphoproteomic analysis to identify changes in cell signaling and measured for amino acid content. Throughout, changes in each profile are correlated with different variants of SARS-CoV-2, with Delta and Omicron revealing particular diametrically opposed changes. There are reasonable controls applied, including the use of IFNalpha treatment to "benchmark" ISG production. Overall, authors create a picture that Omicron infections do not suppress IFN signaling as efficiently as Delta variants and further exhibit limited hallmarks of cell stress and metabolic dysregulation.

      This is a remarkable study that attempts to cross-correlate multiple -omics analyses of cell responses to characterize differences in infection. It is very well written and the data is exemplary. I do have some concerns related to the placement and emphasis of interpretation in the results section that need to be addressed. Beyond my stylistic concern, the interpretation of the experimental changes between variants is compromised by the failure to analyze the extent of infection within the NEC model. Using an MOI of 0.01 will produce a dramatically heterogeneous extent of infection at both 24 and 72 hours post infection that will also depend on the extent of viral transmission within the culture. The limited analysis of secreted E-gene detection is insufficient to overcome the inherent unequal comparison of cell responses between variants. There are ways to assuage, but not eliminate, this problem when it comes to comparing and interpreting experimental results. My concerns and suggestions are detailed below.

      Major Concerns:

      1) Heterogeneous extent of infection. The MOI of 0.01 used to initiate infection is extraordinarily low for the types of analysis employed with the NEC culture. The interpretation of the data does not take into account that there will be infected and uninfected cells, of varying extents, making up the changes observed. Further, the variants likely have differing abilities to spread through the NEC culture, complicating both interpretation of changes and comparison between variants. At a minimum, authors need to evaluate the extent of SARS-CoV-2 infection through either flow cytometry or immunofluorescence analysis against viral protein(s). It is possible that Omicron, while secreted well, has more limited transmission allowing for more cells to mount an IFN response. Delta is a prolifically spreading virus that likely has more extensive infection at 72 hpi than the other variants. These statements are conjecture and highlight how such differences could alter the interpretation of the subsequent experiments.

      2) Further evaluation of IFNalpha treated cells. The paper emphasizes the ISG analysis, but the IFN treated cells should be included in the DEG and metabolic pathway analysis. IFN treatment is known to alter metabolic changes in cells, and it would be valuable to see those changes reflected in your analysis. Consider the evidence presented in the following:

      Fritsch SD, Weichhart T. Effects of Interferons and Viruses on Metabolism. Front Immunol. 2016 Dec 21;7:630.

      Heer CD, Sanderson DJ, Voth LS, Alhammad YMO, Schmidt MS, Trammell SAJ, Perlman S, Cohen MS, Fehr AR, Brenner C. Coronavirus infection and PARP expression dysregulate the NAD metabolome: An actionable component of innate immunity. J Biol Chem. Elsevier BV; 2020 Dec 25;295(52):17986-17996.

      Palmer CS. Innate metabolic responses against viral infections. Nat Metab. 2022 Oct;4(10):1245-1259

      Further, it is possible that the changes attributed to Omicron are quite similar to the effects of the IFN treatment, given the extensive ISG detection. The same is true for the phosphoproteomic analysis and amino acid content. I also have concerns that using a treatment of IFNalpha that impacts all cells as a benchmark for heterogeneous infection is not truly comparable. How was the concentration of IFN chosen? What was the extent of IFN activation in the culture?

      3) Further correlation of transcriptional changes with metabolic changes - While many published works emphasize transcriptional changes as a proxy for metabolic changes, there are robust methods that can be applied to directly analyze metabolite content and changes in the context of viral infection. In particular these studies should be assessed and compared for the interpretation of the presented results:

      Kramaric, T., Thein, O.S., Parekh, D. et al. SARS-CoV2 variants differentially impact on the plasma metabolome. Metabolomics 21, 50 (2025).

      Loveday EK, Welhaven H, Erdogan AE, Hain KS, Domanico LF, Chang CB, June RK, Taylor MP. Starve a cold or feed a fever? Identifying cellular metabolic changes following infection and exposure to SARS-CoV-2. PLoS One 2025 Feb 12;20(2):e0305065.

      Irún P, Gracia R, Piazuelo E, Pardo J, Morte E, Paño JR, Boza J, Carrera-Lasfuentes P, Higuera GA, Lanas A. Serum lipid mediator profiles in COVID-19 patients and lung disease severity: a pilot study. Sci Rep. 2023 Apr 20;13(1):6497.

      Luke Whiley, Nathan G. Lawler, Annie Xu Zeng, Alex Lee, Sung-Tong Chin, Maider Bizkarguenaga, Chiara Bruzzone, Nieves Embade, Julien Wist, Elaine Holmes, Oscar Millet, Jeremy K. Nicholson, and Nicola Gray, "Cross-Validation of Metabolic Phenotypes in SARS-CoV-2 Infected Subpopulations Using Targeted Liquid Chromatography-Mass Spectrometry (LC-MS)", Journal of Proteome Research 2024 23 (4), 1313-1327

      4) Editing to limit interpretation within experimental results. I appreciate that this is a stylistic concern, but it is an issue in the paper. Statements in the results are often over-reaching. Some examples include: Line 156 - "suggesting attenuated or delayed early sensing" - The low MOI and time leave these results open to various explanations. Better to just state and move on.

      Line 157 "Delta drove the most extensive" - drove has a lot of assumption. "produced" "resulted in " or something more passive is more appropriate

      Line 179 "pointing to sustained suppression of interferon responses." - sustained is a leading interpretation. Effective? Comprehensive? again, the MOI is complicating interpretations of global transcript changes.

      Line 186 "suggesting a weaker activation of interferon signaling" Too much leading interpretation here. You detect fewer ISGs that are differentially regulated. Could be for many reasons.
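
      The scale of the problem described in Major Concern 1 can be illustrated with two small calculations. The sketch below is a simplification under stated assumptions: a Poisson model for the first round of infection (every cell equally susceptible, which a differentiated NEC culture is not), and a bulk signal that is a simple weighted mean of infected and bystander cells. The function names and the example fold changes are hypothetical and are not taken from the manuscript.

```python
from math import exp, log2


def fraction_infected(moi: float) -> float:
    """Poisson estimate of the fraction of cells receiving at least one
    infectious unit at inoculation (first round only, before any spread)."""
    return 1.0 - exp(-moi)


def bulk_log2_fc(fraction: float, infected_fold: float,
                 bystander_fold: float = 1.0) -> float:
    """log2 fold change versus mock seen in a bulk sample that mixes
    infected cells (changed `infected_fold`-fold) with bystander cells."""
    mixed = fraction * infected_fold + (1.0 - fraction) * bystander_fold
    return log2(mixed)


if __name__ == "__main__":
    f0 = fraction_infected(0.01)
    print(f"Initial infected fraction at MOI 0.01: {f0:.4f}")
    # Hypothetical 8-fold change confined to infected cells, at two
    # assumed extents of infection.
    for f in (0.01, 0.5):
        print(f"infected fraction {f}: bulk log2FC = {bulk_log2_fc(f, 8.0):.2f}")
```

      Under these assumptions roughly 1% of cells are infected at inoculation, and an 8-fold change restricted to 1% of cells appears as a bulk log2 fold change of only about 0.1, compared with about 2.2 when half the culture is infected. Differences in spread between variants can therefore produce large differences in bulk signal without any variant-specific host response.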

      Minor Concerns:

      1) Line 72 "has evolved unique strategies" Unique can be easily misconstrued to mean different mechanisms. More likely, it is a subtle balance between promotion of viral replication and suppression of IFN responses.

      2) Line 126 - 128 "NECs were derived from three commercially available donor pools". The following text doesn't make it clear that they are the same product from different lots. The methods clarify somewhat, but this should be stated explicitly for transparency.

      3) Line 129 "Viral replication kinetics" Need to highlight that this is detection of secreted viral genomes, which is a proxy measure for replication and dissemination in the culture. Direct measurement of the extent of infection is not being made, nor can it be inferred.

      4) Line 149 "Differentially expressed genes (DEGs)" What is the comparison group? The figure legend/design suggests that it is the IFNa treatment. Is there a matched uninfected control for each timepoint as well? Later experiments specify the comparison group. Text should be clarified here for transparency.

      5) Line 224 and Figure 4B - I don't see the value of the "merged NES" values given these are only aggregate of the Pre-Omicron and Omicron species. If you had compared multiple D614G and Delta variants, then there would be utility.

      6) Line 261 "quantified at 24 hpi" Why this timepoint? Changes were minor and not representative of extensive infection.

      7) Line 268 "rather than variation in cell number." I appreciate the rigor and control of experimentation. And how many of those cells are infected? That is not controlled.

      8) Line 428-429 "direction of regulation" This seems like an over-interpretation of the data. You have performed pathway analysis based on the quantity of RNA transcription detected in sequencing then imputing an interpretation of regulation. Without pulse labeling of metabolic standards or kinetic analysis of metabolite quantity, it is difficult to assert regulatory direction.

      Referee cross-commenting

      I am in agreement with the comments and suggestions of Reviewer #2 and #3. In particular, the comment of Reviewer #3 to estimate viral replication from the RNASeq data is quite valuable to begin addressing some of the concerns about the extent of viral replication. It does not negate the need to further assess productive viral titer (PFU/mL) or the extent of viral infection (immunofluorescence or flow cytometry).

      I also agree with Reviewer #3 regarding the extent of mechanistic interpretation that can be drawn from the current study. This concern can largely be addressed through revision of the text and a tempering of the interpretations that are drawn.

      I also agree and appreciate the detailed analysis of reviewer #2 regarding the inconsistencies between the text and the figures. It is critically important to be consistent in the data and presentation of these complex experiments. Resolving these issues will only strengthen the work.

      Significance

      • The work detailed in this manuscript takes a very broad approach to identify differences in the effects of SARS-CoV-2 variant infections. Elements of this work have been published, including transcriptomics, metabolomics, and phosphoproteomics. This work is significant in that multiple variants are evaluated with comparable methods in the very relevant human nasal epithelial cell model. The use of this model, and the direct integration of multiple -omics, sets this work apart from previously published studies. This cross-omic analysis, with the IFN-treated controls, provides a robust foundation of data that can be used to detail the differences in the response to the SARS-CoV-2 variant infections.

      • That said, a significant limitation to the study was the low MOI used to initiate infection and the lack of detailed analysis of the infection progression of the different variants. Further, there is limited comparison of the IFN-treatment condition in relation to the transcriptional changes, and no inclusion of IFN-controls in the other methods. Both of these limitations undercut the potential significance of the paper and its findings.

      • Audience: This work will be important to bench researchers interested in further characterizing and comparing the effects of SARS-CoV-2 infection. Potentially, clinicians involved in diagnostics will find utility in the study of changes for potential biomarker analysis for severe COVID19 disease.

      • My expertise is the field of virology, having studied multiple RNA and DNA viruses, including SARS-CoV-2, to understand virus-cell interactions. My focus includes primary cell culture models of infection, proteomic and metabolic analysis of infection induced changes, and monitoring the spread of viral infection through direct and indirect measurements.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the Referees for their valuable input and critical review of our manuscript. Please find below our responses to the provided feedback.

      Referee # 1

      1. This is a well-done study. However, I ask the Authors to also include figures showing the macroscopic in ovo evidence

      Response: Two photographs, each depicting an in situ xenograft tumour of the BxPC-3 and PANC-1 cell line have been added as a new figure (new Figure 1).

      and to discuss the role of the CAM assay in the study of xenografts.

      Response: In the introduction, we briefly explained in more detail the rationale behind the use of the CAM model for xenografting and pointed out an important limitation of the use of nucleoside labeling within this model.

      Referee # 2

      Major comments

      1.The manuscript's clinical relevance is limited, and methodological flaws prevent proper statistical validation of the in vitro findings.

      Response: We would like to clarify that the aim of this paper is to share some important preclinical scientific findings highly relevant for future translational studies. First, we demonstrate that cell cycle labelling can be performed in CAM model xenografts via the application of nucleoside analogues. Additionally, we provide evidence of a novel biological phenomenon that in itself could entail an additional important limitation to the use of nucleoside labelling in the CAM model. Finally, we have only provided two images concerning in vitro culture of cell lines, which serve as an illustration of the similarity to the in ovo growth pattern, and we are thus unsure what statistical validation is required for this aspect of the manuscript.

      2.The entire article is based on the assertion that "... " However, even here, the authors do not provide any evidence that these are erythrocytes and not some other cells.

      Response:

      We completely agree with the reviewer that not all chicken (embryonic) cells produce IgY. However, we would like to point out that the majority (if not all) IgY that is present within the developing embryo at the time of the CAM assay is derived originally from the hen and passed through to the embryo via the egg yolk in order to provide passive immunity. In fact, IgY production by embryo/chick cells is commonly observed only several days post hatching.1-3 Thus, IgY detection in chicken embryonic tissues remains species-specific. Though, it appears that large quantities of IgY are sequestered in the embryonic connective tissues. As such, this helps in segregation of 'chicken/stromal' pixels versus 'human/xenograft' pixels. In the revised manuscript, Figure 6 now includes quantitative data regarding the cellular origins with the different strategies described to substantiate our claims.

      Since application of this principle is of course not exclusively linked to FITC as a fluorophore, we performed additional immunofluorescent labelling on xenograft sections using Cy3-labelled Donkey anti-chicken IgY-antibodies as antibody staining to demonstrate the principle and reproducibility of using anti-IgY immunofluorescence. Additionally, using an anti-goat IgG-directed secondary antibody we demonstrate that, in the latter case, a completely different and non-specific staining pattern is obtained compared to that which is observed following anti chicken IgY labelling. Lastly, we added an extra image demonstrating the differences in staining pattern and intensity between xenograft cells and embryonic tissues and embryonic CAM epithelial cells. These images are provided as supplementary Figure S1.

      We have further nuanced the use of the secondary anti-chicken IgY labelling to the identification of chicken stroma rather than generalising the detection method to include specific labelling of all chicken cells. In our experience, human cancer cells do not exhibit substantial levels of autofluorescence in the orange to far red spectrum within the xenografts. In contrast, some autofluorescence in the blue-green spectrum is observed for chicken tissues and some cells in non-immunolabelled sections of non-grafted embryos. Therefore, we reserve the spectral regions with the least amount of autofluorescence (i.e., orange-far red) for subsequent (IF)-labelling for xenograft cells in order to maintain sufficient specificity. In other words, the FITC-staining aids in determining whether a pixel is chicken in identity rather than human, in our tissue preparations.

      1. Dias da Silva, W. & Tambourgi, D. V. IgY: A promising antibody for use in immunodiagnostic and in immunotherapy. Vet. Immunol. Immunopathol. 135, 173-180 (2010).
      2. Ulmer-Franco, A. M. Transfer of Chicken Immunoglobulin Y (IgY) from the Hen to the Chick. Avian Biol. Res. 5, 81-87 (2012).
      3. Carlander, D., Wilhelmson, M. & Larsson, A. Immunoglobulin Y Levels in Egg Yolk From Three Chicken Genotypes. Food Agric. Immunol. 15, 35-40 (2003).

      3. The latter is actually confirmed by the authors in Fig. 7 in the form of the following sentence: "Highly autofluorescent (nucleated) embryonic erythrocytes can be observed throughout the tissue (arrowheads)." However, even here, the authors do not provide any evidence that these are erythrocytes and not some other cells.

      Response: We agree with the reviewer that the green emission spectrum should be used with caution in fluorescence microscopy, especially when evaluating tissue samples. In fact, this is the reason why we reserved the higher-wavelength channels for anti-human fluorescent detection or the (click-based) detection of nucleosides.

      Indeed, we did not present full identification of these cells, but their avian origin is undisputed, as these cells are also observed in the same quantity in tissues of non-grafted embryos. However, we have now adapted the wording in the manuscript to the more general term 'chicken blood cells'. Throughout the manuscript and discussion section, we now elaborate on the possibility that nucleated cells observed within blood vessels of the CAM can be erythrocytes, leukocytes or thrombocytes (which are also nucleated in avian species).

      4. Tissue-specific and species-specific monoclonal antibodies to avian red cell nuclear proteins. Proceedings of the National Academy of Sciences of the United States of America, 79(20), 6265-6269. https://doi.org/10.1073/pnas.79.20.6265], and I recommend the authors use these instead of the fluorescein-labeled donkey anti-chicken IgY antibodies, which were misused. On the same matter, the article doesn't clarify if the antibodies for Ki67 and cyclin B1 can differentiate between human and chicken antigens.

      Response: Other groups have also applied this anti-Ki-67 primary antibody in CAM-model studies without clear evidence of cross-reactivity to chicken embryonic nuclei.4,5 We ourselves have not observed any substantial species cross-reactivity for either the applied primary anti-Ki-67 antibody or the primary anti-cyclin B1 antibody, which is also illustrated by the images provided within the manuscript.

      4. Jarrosson, L. et al. An avian embryo patient-derived xenograft model for preclinical studies of human breast cancers. iScience 24, 103423 (2021).
      5. Javed, S., Soukhtehzari, S., Fernandes, N. & Williams, K. C. Longitudinal bioluminescence imaging to monitor breast tumor growth and treatment response using the chick chorioallantoic membrane model. Sci. Rep. 12, 17192 (2022).

      5. Why is there no staining with antibodies to human antigens for tumor cell identification in Figures 1-2?

      Response: The aim of this figure was to demonstrate the highly pleiotropic nuclear morphology of the PANC-1 cell line compared to BxPC-3 cells while growing in vitro monocultures in turn comparing the growth pattern to that observed in.

      6. What specific markers in Fig. 4 should lead the reader to conclude that the nuclei pointed out by the arrows represent embryonic epithelial nuclei (EE), nuclei of chicken embryonic blood cells (EB), and human AsPC-1 tumor xenograft nuclei (T)?

      Response: The identification of different nuclei within these images was performed on the basis of their morphological aspects (i.e. irregular shape, larger size...) in combination with the relative localisation of these nuclei within the tissue section and the typical nucleolar staining pattern. The latter is not observed in any of the human xenografts we routinely perform. Nevertheless, we agree that accurate discrimination between embryonic epithelial nuclei and embryonic blood cells cannot be guaranteed with absolute certainty without additional markers. However, the morphological aspects combined with the anti-Ki-67 staining strongly suggest the human identity of the annotated nuclei. We have modified Figure 4 (now Figure 3) to include anti-Ki-67 staining. Additionally, an annotation that erroneously pointed out a mitotic figure was omitted. For the remaining mitotic nuclei, clear perichromosomal localisation of Ki-67 is observed, which supports the claim that these are mitotic nuclei.

      7. The nuclei shown in this same figure, which supposedly display mitotic figures of dividing tumor nuclei and can also be clearly distinguished (M), actually more closely resemble giant multinucleated cells.

      Response:

      We have added the immunofluorescent staining for Ki-67 of these images (new Figure 3); these nuclei present with different patterns for the marker, which would be highly unlikely in a single multinucleated cell. Further, we have altered the description of embryonic epithelial cells to the more general term embryonic cells (E). Additionally, annotation of embryonic cells was omitted from the lower panels in order to draw the focus on the polymorphic nature of the human tumour nuclei rather than the embryonic surrounding cells.

      8. The statement "After immunolabeling for human Ki67, we confirmed that all EdU+ xenograft nuclei were also Ki67+, confirming the specificity and compatibility of both labeling strategies in CAM xenograft tumor cells (Figure 5)" is not supported by the image, which shows that the majority of Ki67-positive nuclei are EdU-negative.

      Response:

      During cell cycle progression, nuclear levels of Ki-67 gradually increase during S-phase and peak during mitosis. As a consequence, all S-phase cells (EdU+ nuclei) will present detectable nuclear levels of Ki-67. The reverse is not necessarily true: the nuclei of cells that are in G2 or mitosis during the nucleoside labelling will present with high levels of nuclear Ki-67 but will not show nuclear incorporation of EdU, as these cells have already completed replication during S-phase. Figure 4 now shows quantification of both markers across 7 different BxPC-3 tumour xenografts. With the applied classification strategy, 94% of the detected EdU+ nuclei were also Ki-67+. Irrespective of Ki-67 positivity, 44% of BxPC-3 nuclei were calculated to be EdU+ for the labelling duration of 1 hour.
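      The asymmetry described above (EdU+ implies Ki-67+, but not the reverse) can be illustrated with a minimal sketch. It assumes that every detected nucleus has already been classified as positive or negative for both markers; the `Nucleus` record, the function name and the toy counts are hypothetical and are not the data behind Figure 4.

```python
from dataclasses import dataclass
from math import nan


@dataclass(frozen=True)
class Nucleus:
    """Hypothetical per-nucleus classification result."""
    edu_pos: bool
    ki67_pos: bool


def marker_fractions(nuclei):
    """Return the EdU+ fraction of all nuclei and both conditional fractions."""
    total = len(nuclei)
    n_edu = sum(n.edu_pos for n in nuclei)
    n_ki67 = sum(n.ki67_pos for n in nuclei)
    n_both = sum(n.edu_pos and n.ki67_pos for n in nuclei)
    return {
        # share of all nuclei that incorporated EdU during the pulse
        "edu_of_all": n_edu / total if total else nan,
        # share of EdU+ nuclei that are also Ki-67+ (expected to be high)
        "ki67_given_edu": n_both / n_edu if n_edu else nan,
        # share of Ki-67+ nuclei that are also EdU+ (G2/M cells lower this)
        "edu_given_ki67": n_both / n_ki67 if n_ki67 else nan,
    }


# Toy population (illustrative only): 40 double-positive, 2 EdU-only,
# 30 Ki-67-only (e.g. G2/M cells) and 28 double-negative nuclei.
toy = (
    [Nucleus(True, True)] * 40
    + [Nucleus(True, False)] * 2
    + [Nucleus(False, True)] * 30
    + [Nucleus(False, False)] * 28
)
fractions = marker_fractions(toy)
```

      In this toy population the two conditional fractions differ markedly, which is why a Ki-67+ majority that is EdU- does not contradict a high Ki-67+ share among EdU+ nuclei.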

      9. The caption for Figure 6 needs to be revised, particularly the statement "Combined, these markers allow in ovo segregation of proliferating tumor cells into early S-phase (ES, EdU+), late S or early G2 (LS, EdU+CB1+), and G2 (EdU-CB1+)".

      Response: We have altered the phrasing to stress the cytoplasmic presence of cyclin B1 as indicative for late S-phase and G2 phase.
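      The gating logic quoted in the reviewer's comment (EdU+ only, EdU+CB1+, EdU-CB1+) can be sketched as follows. This is an illustration only: it assumes each tumour nucleus carries a boolean EdU call and a boolean call for cytoplasmic cyclin B1, the function name and labels are hypothetical, and how those calls are thresholded is outside its scope.

```python
def cell_cycle_gate(edu_pos: bool, cytoplasmic_cb1_pos: bool) -> str:
    """Assign a coarse cell-cycle label from two binary marker calls."""
    if edu_pos and cytoplasmic_cb1_pos:
        # replicating DNA while cyclin B1 is already accumulating
        return "late S / early G2"
    if edu_pos:
        # replicating DNA, no detectable cytoplasmic cyclin B1 yet
        return "early S"
    if cytoplasmic_cb1_pos:
        # replication finished, cyclin B1 present
        return "G2"
    # neither marker: cannot be placed with these two markers alone
    return "unassigned"


labels = [cell_cycle_gate(e, c) for e, c in [(True, False), (True, True), (False, True), (False, False)]]
```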

      10. What do the pink cell nuclei in Figures 8 and 9 represent?

      Response: The orange immunofluorescent staining in these images demonstrates anti-Ki-67 labelling. Due to overlap with the nuclear (blue) staining the appearance may have a pink undertone but this does not alter the interpretability of the images.

      11. In Fig. 10, it is impossible to see the yellow line, which, according to the authors' statement "... yellow lines indicate the area classified as tumor", should indicate the tumor origin of the cells!

      Response: In the submitted PDF version of the manuscript, a yellow line is clearly visible in panels C and F. Its presence may have become more difficult to see because of compression-related quality loss. We have now submitted a higher-resolution image.

      12. There is a distinct lack of evidence in Figures 11A-C suggesting that the S-phase nuclei are attributed to both embryonic liver epithelial cells (ES) and BxPC-3 cells (BS). Furthermore, there is no evidence that the apparent cytoplasmic EdU inclusions (arrowheads) belong specifically to chicken embryonic cells.

      Response:

      Concerning the identity of the cells in Figure 11A-B (now Figure 7), we would like to clarify that these images are taken from embryos which had not been subjected to tumour grafting. Therefore, the presence of any human tumour cells within these liver sections can be excluded with certainty. This has now also been stressed in the text and the image caption. With respect to Figure 11C (now 7C), we have noticed that this image contained a tissue processing artefact which may lead some readers to question its authenticity. Therefore, we have replaced this panel (11C, now 7C) with another image taken from the same tissue section and included the anti-IgY staining channel to allow identification of CAM tissue.

      We have addressed the confusion regarding the apparent different magnifications in the figure legend: all images presented in Figure 7 were acquired using a 40x magnification objective. In order to focus on some select regions, different digital zoom levels are present for each panel. To account for this, each panel is now annotated with its own scale bar.

      13. As no macroscopic images depicting tumor nodules from the implantation of AsPC-1 and PANC-1 tumor cells into the CAM were provided.

      Response: A new figure (now Figure 1) has been added with macroscopic images of PANC-1 and BxPC-3 tumours in situ at ED14. Successful tumour grafting of AsPC-1 cells was demonstrated via the histological images provided throughout the manuscript.

      14. "A newly identified biological phenomenon: non-nuclear EdU accumulation in chicken embryonic cells," is not well supported unless compelling evidence is presented to establish that these cells indeed belong to chicken embryos.

      Response: We have clarified in the body of the text as well as in various figure legends that the apparent nucleoside presence within the cytoplasm of cells has been consistently observed also in liver sections taken from embryos that have not received tumour grafts.

      15. Lacking any demonstration of concurrent EdU accumulation alongside cytoplasmic and/or membrane staining within the same cells. The latter is quite feasible by staining the cells with a suitable agent (or fluorophore), such as...

      Response: We agree with the reviewer that we have not performed dedicated staining for cytoplasmic or membrane components in order to demonstrate the colocalisation of the EdU signal with these components. However, we believe that the simultaneously acquired brightfield images are sufficiently convincing that these signals localise to the cytoplasm rather than to the extracellular space. Additional experiments were performed with F-ara-EdU and BrdU/IdU labelling in non-grafted embryos, which demonstrated the robustness of these findings and further point towards a cytoplasmic rather than an extracellular signal. Multiple new figures and paragraphs that support our claim were added to the revised manuscript.

      16. Third, regarding "Extranuclear EdU staining": Extranuclear EdU staining is not a standard or typical...

      Response: We agree that this is not the intended use of nucleoside incorporation assays (including halogenated analogues). With this manuscript we aimed to illustrate the unexpected findings that could lead to misinterpretation of experimental data as a consequence of a new biological phenomenon.

      16.1 Cellular damage or death: Cells that are dying or have been damaged may release their DNA, causing it to be detected outside the nucleus.

      Response: The cells with apparent extranuclear nucleoside signal ((F-ara-)EdU, BrdU and IdU) show no signs of nuclear pyknosis or DNA fragmentation. In addition, we also describe this phenomenon in healthy developing embryos at various stages of development, in livers that show no macroscopic signs of tissue damage. Moreover, if the extranuclear EdU signal did represent degraded DNA (with the incorporated analogue), the remaining nuclear DNA would also show signs of (replication-dependent) EdU incorporation.

      16.2 Apoptosis: During apoptosis, cells undergo DNA degradation, and some DNA fragments may be found in the extranuclear space

      Response: We would like to refer the reviewer to the reply above. It seems unlikely that such massive apoptotic events would be taking place in healthy embryos from ED14 onwards. Moreover, nucleoside incorporation and (caspase-dependent) DNA fragment generation would need to take place within one hour (= a typical nucleoside labelling period).

      16.3 Technical artifacts: Errors during cutting of formalin-fixed tissues, cell fixation, permeabilization, or the staining procedure itself can lead to mislocalization of the fluorescent signal.

      Response: We agree that several technical factors can lead to altered subcellular localisation during the detection of some (protein) markers.6,7 In our experiments, tissues were fixed in ice-cold 4% formalin for at least 12 hours. Therefore, it is highly unlikely that insufficient fixation would have caused this phenomenon.

      Regarding the possibility that the staining procedure itself can lead to mislocalisation, it is important to note that the copper-catalysed click detection of EdU is not susceptible to many of the possible causes for altered epitope detection through standard immuno-labelling. We can also not conceive how the click-reaction, performed on formalin-fixed tissues, could induce a complete shift in EdU from the nuclear to the extranuclear compartment of these select cells only without any other alterations in the nuclear or cellular morphology. We have performed several control staining procedures, such as performing the click reaction on tissue sections of embryos that had not received any alkyne-containing nucleoside labelling. This did not result in the detection of any signal. These results are referred to in the manuscript.

      Lastly, the simultaneous presence of correctly localised (nuclear) (F-ara-)EdU and 'incorrectly' localised (F-ara-)EdU within the same tissue section, and across different tissue types (liver versus CAM xenograft) demonstrates that it is unlikely that the observed phenomenon is the result of a technical artifact.

      6. Yoshida, S. R., Maity, B. K. & Chong, S. Visualizing Protein Localizations in Fixed Cells: Caveats and the Underlying Mechanisms. J. Phys. Chem. B 127, 4165-4173 (2023).
      7. Stadler, C., Skogs, M., Brismar, H., Uhlén, M. & Lundberg, E. A single fixation protocol for proteome-wide immunofluorescence localization studies. J. Proteomics 73, 1067-1078 (2010).

      16.4 Unique cell types: In some specific cell types, such as megakaryocytes, DNA or other nuclear components may be localized extranuclearly, leading to extranuclear staining with techniques like EdU.

      Response: Referring to the paper of Lan et al. (2019; doi: 10.1111/acel.12901), to which the reviewer alluded by citing the statement "Nicked DNA was strongly visible in old cells, prominently in the cytosol, but undetectable in young cells, and was more intense in old cells upon", we would like to stress that these authors investigated the presence of dsDNA in various human cell lines in vitro through assays OTHER than nucleoside labelling. Additionally, the cited paper also did not include specific staining for cytoplasm or cell membranes to accurately segment the nucleus versus the cytoplasm. Furthermore, the intense focal-like nucleoside signal we observe within our tissue sections does not resemble the ambiguous and apparently random signal localisation within the cytoplasm of the cells presented in the cited paper. Lastly, the reviewer seems not to be aware of the fact that megakaryocytes do NOT occur in avian species.

      16.5 Extranuclear accumulation of histones and nucleosomes is an early event of apoptosis in human lymphoblasts (https://doi.org/10.1136/ard.2003.011452)

      Response: The referred paper investigated cellular processes during cell death (apoptosis). Unfortunately, it does not present microscopic images to compare our data to; we therefore cannot assess its relevance to our findings.

      16.6 In some specific cell types, such as megakaryocytes, DNA or other nuclear components may be localized extranuclearly, leading to extranuclear staining with techniques like EdU [Frydman, G.H., Tessier, S.N., Wong, K.H.K. et al. Megakaryocytes

      Response: As already mentioned above, to our knowledge, megakaryocytes have NOT been described or identified in avian species so far. Therefore, we disagree that the cited phenomenon/reference would be highly relevant for nucleoside accumulation in the cytoplasm of non-megakaryocyte cells in in ovo model systems. Furthermore, the paper by Frydman et al., does not describe the use of nucleoside labelling in these cells.

      16.7 However, since the authors do not present any absolute markers proving that EdU-cyto+ cells are EdU+ chicken granulocytes, the authors' statement that "Given the high number of EdU+ granulocytes observed, it is more likely that these are neutrophils rather than eosinophils" appears highly speculative.

      Response: We agree with the reviewer that our claim that EdU-cyto+ cells represent granulocytes is still speculative. However, concerning the exact wording: brightfield images did show colocalisation of the nucleoside signal within the cytoplasm of granule-containing chicken cells. We have emphasized more clearly within the discussion section that the extranuclear nucleoside signal is highly unlikely to be nucleoside-containing DNA. Additional experiments were conducted to investigate whether these cells represent chicken thrombocytes, which are known to possess phagocytic functionality. IF staining for CD41/CD61 as a marker for thrombocytes revealed that CD41/CD61-positive cells do not exhibit the phenomenon in question. A new figure (Figure 10) illustrates this finding.

      Methodological ambiguities

      1. In the section "Nucleoside labeling of dividing cells in the CAM model": It is not clear when BrdU and EdU were given after tumor cell implantation, or how to standardize the distance from the tumor as suggested by "...as far away from the visible tumor as possible...". I question whether such a "precise" description of the application site would contribute "...to ensure low variability in labeling duration...", especially when it was performed by two independent researchers "...in tandem, ensuring that an experimental group of 16 embryos could be labeled in less than five minutes."

      Response:

      In this manuscript, we have described nucleoside labelling in grafted embryos as well as in non-grafted embryos. Alongside depictions of nucleoside labelling, the embryonic day at which the assay was performed is mentioned in the figure legend of all figures demonstrating nucleoside labelling. For example, BrdU labelling of the BxPC-3 xenograft presented in Figure 2 (revised manuscript) was performed at ED13 (which is 5 days following the grafting procedure at ED7). In the revised manuscript, we have ensured that for each labelling depicted, its timing in the developmental period is explicitly mentioned. Nucleoside labelling in non-grafted embryos was performed throughout various stages of development; the timing of the assay is now also mentioned in the corresponding figure legends.

      We fully agree that there is still room for improvement concerning the standardisation of the application of the labelling solution onto the CAM. The main aim of the present manuscript, however, is to demonstrate the feasibility of this labelling protocol. In addition, we report the unexpected but important finding of cytoplasmic accumulation due to a possible biological cause. In this respect, we rephrased the text and pointed to a potential pitfall of the application in the CAM model so that future studies can anticipate misinterpretations.

      1. In the section "Immunofluorescence labeling of cryosections: "...a blocking and permeabilizing solution in a humidified atmosphere."" ... : Absence of detergent is used during incubation with primary and secondary antibodies (even the widely recommended Tween20!). I wonder whether it might be the source of non-specific tissue labelling by donkey anti-chicken IgY that was considered as an evidence of chicken origin.

      Response: We understand this comment to mean that the reviewer concluded that we applied our primary and secondary antibody staining procedures in the presence of Triton X-100. We can assure the reviewer that, following the initial blocking incubation step and three rinses with PBS, all subsequent antibody labelling procedures (primary or secondary) are performed in the absence of detergent within the staining buffer. This was mentioned in the second paragraph under the heading "Immunofluorescence labelling of cryosections" in the Materials & Methods section.

      Referee #3

      Major comments

      1. The current manuscript lacked figure panels to show the quantification results of the labeling to convincedly demonstrate the positive correlation between the nucleoside labeling versus Ki67. Bar charts with statistical analysis should be included. The study utilized 2 human cancer cell lines as an example. T

      Response: We have added quantification data of EdU and Ki-67 across several BxPC-3 xenografts and added statistical analysis in Figure 4E.
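      As a rough illustration of how such per-xenograft quantification can be summarised before plotting, the sketch below computes the Ki-67+ share of EdU+ nuclei for each xenograft and reports the mean and sample standard deviation. The counts are invented for illustration, the function name is hypothetical, and this is not necessarily the statistical analysis used for Figure 4E.

```python
from math import nan
from statistics import mean, stdev


def summarise_double_positive(per_xenograft_counts):
    """Summarise the Ki-67+ share of EdU+ nuclei across xenografts.

    per_xenograft_counts: iterable of (n_edu_pos, n_edu_and_ki67_pos) tuples,
    one tuple per xenograft. Xenografts without EdU+ nuclei are skipped.
    """
    fractions = [both / edu for edu, both in per_xenograft_counts if edu > 0]
    if not fractions:
        return {"n": 0, "mean": nan, "sd": nan}
    return {
        "n": len(fractions),
        "mean": mean(fractions),
        # sample standard deviation needs at least two xenografts
        "sd": stdev(fractions) if len(fractions) > 1 else 0.0,
    }


# Hypothetical counts for three xenografts (not measured data).
toy_counts = [(100, 95), (200, 180), (50, 48)]
summary = summarise_double_positive(toy_counts)
```

      Treating each xenograft as one observation, rather than pooling all nuclei, keeps the spread between biological replicates visible in the resulting bar chart.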

      1. The study would be improved if the authors could include cancer cell lines that are known to be highly proliferative vs slowly proliferative to demonstrate the robustness of this method.

      Response: We do not expect failure of nucleoside labelling due to tumour cell-line-dependent characteristics, for several reasons. For instance, when performing EdU labelling in vitro, initial dose and duration titrations are recommended to establish satisfactory labelling efficiency for the research hypothesis in question. In our research, we are predominantly focusing on relatively short-lived effects of treatments on S-phase progression and therefore prefer to apply labelling (in vitro and in ovo) for durations of less than 60 minutes. In contrast, extended labelling durations of several hours will allow one to capture replication even in very slowly proliferating (cancer) cells, at the cost of temporal resolution. Importantly, the growth rates of individual cell lines within the CAM model may differ from their behaviour in vitro. Additionally, long-duration nucleoside (EdU) labelling may induce additional toxicity. Alternative analogues such as F-ara-EdU, with improved safety profiles for long labelling durations, should then be considered. In the revised manuscript, labelling of non-grafted chicken tissues with BrdU/IdU as well as F-ara-EdU was added, which demonstrates the robustness of the technique in the CAM model. Longer-term labelling of slowly proliferating cells can be achieved with F-ara-EdU to minimise embryo (and tumour cell) toxicity, which will extend the application potential of the described technique.

      In summary, labelling parameters should be optimised for each specific research question, but nucleoside incorporation remains a gold standard for identifying S-phase progression across diverse cell types. Given that we successfully established its feasibility in the CAM model, there is no biological reason to suggest it would not be equally valid for any proliferating cancer cell line.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This study explored the utilization of nucleoside labelling in human cancer cell xenografts grown in the chick CAM. The authors tried 2 different reagents (BrdU and EdU) which label S-phase proliferative cells in a species non-specific way. The authors concluded that EdU labeling reliably detected Ki67 positive cells compared to BrdU labeling, which would detect non-Ki67 positive cells as well (might be due to dsDNA denaturing). In order to distinguish the human versus the chick cells, the authors further utilized the chicken-specific antibody anti-IgY and the segmentation algorithm in QuPath to distinguish cells of the two species. This allows the authors to develop (supervised) automated or manual annotation for individual cell detection, in this case the proliferating Ki67+ human cancer cells. The study also showed an unexpected finding of cytoplasmic positive EdU cells in the embryonic chicken liver, which the authors speculated to be non-antigen presenting granulocytes.

      Major comments:

      This study provided a good methodology path to analyze and to quantify proliferating human cancer cells inside the CAM xenograft. The current manuscript lacked figure panels to show the quantification results of the labeling to convincingly demonstrate the positive correlation between the nucleoside labeling versus Ki67. Bar charts with statistical analysis should be included. The study utilized 2 human cancer cell lines as an example. The study would be improved if the authors could include cancer cell lines that are known to be highly proliferative vs slowly proliferative to demonstrate the robustness of this method.

      Minor comments:

      The authors used parentheses for the subheadings inside the discussion section. This is not a usual practice unless it is required by the journal's formatting requirements.

      Significance

      General assessment: The strength of this study is that it develops a straightforward solution to detect human-specific proliferative cancer cells inside the chicken CAM xenograft. The data presented in the IF staining and segmentation results were clear. The limitation is that the study only tested 2 human cancer cell lines and it is unknown whether the current method is robust enough in a pan-cancer setting.

      Advance: This study showed good advances in the methodology for the CAM xenograft field.

      Audience: The audience of this study would include researchers in cancer biology, image processing, and 3R.

      My expertise: cancer biology, CAM, systems and spatial biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      An intriguing article, "Application of Nucleoside Analogue Labelling to Study the Cell Cycle of Xenografted PDAC Cell Lines in the Chorioallantoic Membrane Model," explores the overlooked potential of the chorioallantoic membrane (CAM) model as an innovative alternative in the realm of cell cycle research related to tumor xenografts. The authors were resolute in their pursuit of evidence to bolster their fascinating hypothesis, which posits that exposure to nucleosides enhances the labelling of nucleosides with 5-ethynyl-2'-deoxyuridine (EdU). This improvement allows effective multiplexing with cell-cycle markers like cyclin B1 and Ki67, especially when combined with advanced digital image analysis. They claimed that accurately separating human and chicken cells allowed them to identify a specific type of chicken embryonic cell with a high level of extranuclear EdU accumulation. The authors propose that these cells, which lack chicken MHC II, are likely to be non-proliferating granulocytes found in the embryonic liver of both grafted and non-grafted embryos, as well as in xenograft tissues. According to the authors, the CAM xenograft model effectively helps in studying tumor cell cycles in live conditions. Beyond this, the authors claim a novel biological phenomenon, namely extranuclear nucleoside accumulation in certain chicken embryonic cells. If the authors successfully prove their hypothesis, it will significantly confirm CAM as a unique in vivo tool for better immuno-/histological exams and more precise cell cycle assessments compared to standard rodent models. The research encourages greater use of this alternative animal model in cancer studies to help us understand the regulation of the cancer cell cycle. This, in turn, may improve the implementation of existing treatment methods or uncover potential vulnerabilities in the cancer cell cycle.

      Unfortunately, in this article, the authors do not provide sufficiently convincing evidence of the hypothesis postulated by them. The manuscript's clinical relevance is limited, and methodological flaws prevent proper statistical validation of the in vitro findings. My questions and remarks below are designed to find undeniable proof or opposing views that back the authors' key perspective.

      Major - Conceptual:

      Point 1 - In the Results section: The entire article is based on the assertion that "... distinguishing between both species can already be achieved to a large extent in a relatively simple manner through the use of anti-chicken IgY fluorophore-conjugated antibodies, which are routinely used only as secondary antibodies. In doing so, cellular as well as acellular chicken embryonic components are stained (Figure 7)." This assumption seems off, as anti-chicken IgY staining is employed in research to detect chicken antibodies (IgY). This secondary antibody is designed to specifically bind to the IgY immunoglobulin in chickens, targeting both heavy and light chains. A donkey anti-chicken IgY antibody labels chicken tissues to identify cells that express chicken IgY immunoglobulins. It's clear that not every cell in chicken embryo tissues produces IgY immunoglobulins. Hence, cells stained by donkey anti-chicken IgY FITC-conjugated antibody cannot represent all chicken tissue cells. Lastly, the use of the green fluorescence channel is often unfounded, considering that there are always cells with non-specific and considerably high fluorescence levels in this area of the spectrum. The latter is actually confirmed by the authors in Fig. 7 in the form of the following sentence: "Highly autofluorescent (nucleated) embryonic erythrocytes can be observed throughout the tissue (arrowheads)." However, even here, the authors do not provide any evidence that these are erythrocytes and not some other cells.

      Point 2 - In their article, the authors consistently emphasize how their method excels in differentiating human tumor cells from the abundant chicken embryonic cells that encase the tumor. Nonetheless, the authors fail to present direct evidence regarding the identity of the nuclei or cells in question, nor do they substantiate the validity of the algorithm selected for processing digital images of fluorescently labeled cells, which would be essential for discerning their origin, whether chicken or human. Given the importance of the authors' subsequent conclusions, such evidence should be provided, at least for initial validation. Antibodies for chicken nuclear antigens are well established [e.g., Kane, C. M., Cheng, P. F., Burch, J. B., & Weintraub, H. (1982). Tissue-specific and species-specific monoclonal antibodies to avian red cell nuclear proteins. Proceedings of the National Academy of Sciences of the United States of America, 79(20), 6265-6269. https://doi.org/10.1073/pnas.79.20.6265], and I recommend the authors use these instead of the fluorescein-labeled donkey anti-chicken IgY antibodies, which were misused. On the same matter, the article doesn't clarify if the antibodies for Ki67 and cyclin B1 can differentiate between human and chicken antigens. Why is there no staining with antibodies to human antigens for tumor cell identification in Figures 1-2? What specific markers in Fig. 4 should lead the reader to conclude that the nuclei pointed out by the arrows represent embryonic epithelial nuclei (EE), nuclei of chicken embryonic blood cells (EB), and human AsPC-1 tumor xenograft nuclei (T)? The nuclei shown in this same figure, which supposedly display mitotic figures of dividing tumor nuclei and can also be clearly distinguished (M), actually more closely resemble giant multinucleated cells. However, their affiliation with either chicken or human tumor cells is neither obvious nor proven. 
      The statement "After immunolabeling for human Ki67, we confirmed that all EdU+ xenograft nuclei were also Ki67+, confirming the specificity and compatibility of both labeling strategies in CAM xenograft tumor cells (Figure 5)" is not supported by the image, which shows that the majority of Ki67-positive nuclei are EdU-negative. The caption for Figure 6 needs to be revised, particularly the statement "Combined, these markers allow in ovo segregation of proliferating tumor cells into early S-phase (ES, EdU+), late S or early G2 (LS, EdU+CB1+), and G2 (EdU-CB1+)". The authors' method for labeling cells should consider that the location of cyclin B1 is key to determining a cell's stage in the cell cycle:

      o Cytoplasmic: Associated with G2/M arrest.
      o Nuclear: Associated with the transition into mitosis and the G2 to M phase.

      The assertion that "Species-distinction can be further facilitated by combining anti-chicken IgY IF (Figure 8C) with additional anti-human IF, such as anti-human Ki67 (Figure 8D)" lacks clarity, as the images in Figure 8 do not support the authors' claim. What do the pink cell nuclei in Figures 8 and 9 represent? In Fig. 10, it is impossible to see the yellow line, which, according to the authors' statement "... yellow lines indicate the area classified as tumor", should indicate the tumor origin of the cells! There is a distinct lack of evidence in Figures 11A-C suggesting that the S-phase nuclei are attributed to both embryonic liver epithelial cells (ES) and BxPC-3 cells (BS). Furthermore, there is no evidence that the apparent cytoplasmic EdU inclusions (arrowheads) belong specifically to chicken embryonic cells. Furthermore, in Fig. 11 B and C, the equal magnification of 40x is apparently incorrectly written.

      Discussion section:

      The initial assertion made by the authors, "In this paper, we have demonstrated that our previously published protocol concerning the xenografting of the BxPC-3 cell line10 can be applied to the AsPC-1 and PANC-1 cell lines," seems inadequately supported, as no macroscopic images depicting tumor nodules from the implantation of AsPC-1 and PANC-1 tumor cells into the CAM were provided.

      The authors' second assertion, "A newly identified biological phenomenon: non-nuclear EdU accumulation in chicken embryonic cells," is not well supported unless compelling evidence is presented to establish that these cells indeed belong to chicken embryos. Moreover, the assertion regarding non-nuclear EdU accumulation seems to be speculative, lacking any demonstration of concurrent EdU accumulation alongside cytoplasmic and/or membrane staining within the same cells. The latter is quite feasible by staining the cells with a suitable agent (or fluorophore), such as fluorophore-conjugated Phalloidin for the cytoskeleton or PKH26 (https://www.sigmaaldrich.com/RU/en/product/sigma/pkh26gl?srsltid=AfmBOorEbKBTeYSCKZY6qs-pWjCZCg4lhOvNqE0YYByS2A545f-POa24) for membrane structures.

      Third, regarding "Extranuclear EdU staining": Extranuclear EdU staining is not a standard or typical application for the EdU (5-ethynyl-2′-deoxyuridine) assay, which is designed to label and detect newly synthesized DNA in the nucleus during the S-phase of the cell cycle. Extranuclear EdU staining likely refers to a misinterpretation or unusual experimental result where EdU or its detection reaction product is found outside the nucleus. This could be due to cell damage, processing issues, tissue cutting artefacts or a cell type with unique DNA localization, as seen in some contexts of apoptosis or extranuclear DNA accumulation. Potential reasons for extranuclear EdU staining may include, but are not limited to:

      • Cellular damage or death: Cells that are dying or have been damaged may release their DNA, causing it to be detected outside the nucleus.
      • Apoptosis: During apoptosis, cells undergo DNA degradation, and some DNA fragments may be found in the extranuclear space.
      • Technical artifacts: Errors during cutting of formalin-fixed tissues, cell fixation, permeabilization, or the staining procedure itself can lead to mislocalization of the fluorescent signal.
      • Unique cell types: In some specific cell types, such as megakaryocytes, DNA or other nuclear components may be localized extranuclearly, leading to extranuclear staining with techniques like EdU.

      Several examples:
      • Nicked DNA was strongly visible in old cells, prominently in the cytosol, but undetectable in young cells, and was more intense in old cells upon induction of DNA damage by the DNA damaging agent cytarabine/Ara‐C which causes DSBs [Lan YY, Heather JM, Eisenhaure T, Garris CS, Lieb D, Raychowdhury R, Hacohen N. Extranuclear DNA accumulates in aged cells and contributes to senescence and inflammation. Aging Cell. 2019 Apr;18(2):e12901. doi: 10.1111/acel.12901].
      • Extranuclear accumulation of histones and nucleosomes is an early event of apoptosis in human lymphoblasts. [Gabler, C., Blank, N., Hieronymus, T., Schiller, M., Berden, J. H., Kalden, J. R., & Lorenz, H. M. (2004). Extranuclear detection of histones and nucleosomes in activated human lymphoblasts as an early event in apoptosis. Annals of the rheumatic diseases, 63(9), 1135-1144. https://doi.org/10.1136/ard.2003.011452]
      • In some specific cell types, such as megakaryocytes, DNA or other nuclear components may be localized extranuclearly, leading to extranuclear staining with techniques like EdU [Frydman, G.H., Tessier, S.N., Wong, K.H.K. et al. Megakaryocytes contain extranuclear histones and may be a source of platelet-associated histones during sepsis. Sci Rep 10, 4621 (2020). https://doi.org/10.1038/s41598-020-61309-3].

      I would, if I could, strongly concur with the statements made by the authors: "It is also evident that the presence of EdU-cyto+ cells is not confined to the liver, as they were also found within the CAM. These cells likely represent a ubiquitously distributed cell population present in both healthy and xenografted chicken embryos," and "...since EdU-cyto+ cells are consistently present in non-grafted embryos as well, the phenomenon is not triggered by xenografting or the presence of PDAC cells." However, since the authors do not present any absolute markers proving that EdU-cyto+ cells are EdU+ chicken granulocytes, the authors' statement that "Given the high number of EdU+ granulocytes observed, it is more likely that these are neutrophils rather than eosinophils" appears highly speculative.

      Fourth, regarding Ki67 staining: Careful consideration and robust validation are essential when drawing conclusions and interpretations about Ki67's role in cell proliferation and the cell cycle. Research shows that Ki-67, an important cell cycle marker, has two main splice variants, α and β, which are regulated differently in normal and cancer cells at mRNA and protein levels. Moreover, Ki-67 undergoes constant regulation and degradation through the proteasome system in both cell types, indicating a dynamic control mechanism for this protein. A putative extranuclear elimination pathway of Ki-67, in which it is transported to the Golgi apparatus, was also observed. Furthermore, the unforeseen extranuclear removal of Ki-67 strongly indicates the necessity to examine this protein beyond the confines of the "nuclear box," a perspective that has been overlooked until now [see e.g., Chierico L, Rizzello L, Guan L, Joseph AS, Lewis A, Battaglia G (2017) The role of the two splice variants and extranuclear pathway on Ki-67 regulation in non-cancer and cancer cells. PLoS ONE 12(2): e0171815. https://doi.org/10.1371/journal.pone.0171815].

      Last but not least, regarding the authors' conclusion: "We report a novel phenomenon: the apparent cytoplasmic accumulation of EdU in nondividing chicken granulocytes." The phenomenon of DNA replication taking place in the cytoplasm is not a novel observation. In instances like replication stress, the cytosol can initiate a response pathway that encompasses the detection of cytosolic DNA and subsequent signaling processes focused on genome protection. Cytosolic DNA generated after replication stress activates a Ca2+-dependent pathway to protect stalled replication forks [Li, S., Lu, H. T., & You, Z. (2025). Cytosolic DNA and intracellular Ca2+: Maintaining genome stability during replication stress. DNA repair, 152, 103877. https://doi.org/10.1016/j.dnarep.2025.103877]. Damage to the DNA template caused by environmental pollutants, like radiation and genotoxic chemicals, can impede the replication process. Physiological stressors also affect fork dynamics, including metabolic byproducts like reactive oxygen species, conflicts in replication and transcription, repetitive DNA elements such as telomeres, sequences that form secondary structures, DNA-RNA hybrids, misincorporated ribonucleotides, and low availability of DNA precursors. In response to these challenges, cells have developed an intricate network of surveillance and repair mechanisms. They identify replication stress, support stalled forks, fix problems, and allow replication to proceed. These pathways are crucial for maintaining genome stability and ensuring proper cellular function. Several excellent reviews on the topic include [M.R. Higgs BOD1L is required to suppress deleterious resection of stressed replication forks Mol. Cell (2015) R. Kumar et al. RIF1: a novel regulatory factor for DNA replication and DNA damage response signaling DNA Repair (2014) W. Leung ATR protects ongoing and newly assembled DNA replication forks through distinct mechanisms Cell Rep. (2023) M.B. Adolph et al. 
Mechanisms and regulation of replication fork reversal DNA Repair (Amst. )(2024) Z. You et al. The role of single-stranded DNA and polymerase alpha in establishing the ATR, Hus1 DNA replication checkpoint J. Biol. Chem.(2002) A. Kumagai TopBP1 activates the ATR-ATRIP complex Cell(2006) J. Lee et al. The Rad9-Hus1-Rad1 checkpoint clamp regulates interaction of TopBP1 with ATR J. Biol. Chem.(2007)]. To validate the authors' conclusions regarding the presence or absence of the declared phenomenon, I would recommend modulating TREX1 expression. Overexpression of TREX1, a nuclease that degrades cytosolic DNA, suppresses TRPV2-mediated Ca2+ release under replication stress. TREX1 depletion, however, leads to cytosolic DNA accumulation.

      Methodological ambiguities: In the section "Nucleoside labeling of dividing cells in the CAM model": It is not clear when BrdU and EdU were given after tumor cell implantation, or how to standardize the distance from the tumor as suggested by "...as far away from the visible tumor as possible...". I question whether such a "precise" description of the application site would contribute "...to ensure low variability in labeling duration...", especially when it was performed by two independent researchers "...in tandem, ensuring that an experimental group of 16 embryos could be labeled in less than five minutes."

      In the section "Immunofluorescence labeling of cryosections": "...a blocking and permeabilizing solution in a humidified atmosphere. This solution is comprised of final concentrations of 0.1% Triton-X-100, 0.02% sodium azide, 5% horse serum (Sigma-Aldrich, Cat#H1270), 0.01% thimerosal, and 0.3% bovine serum albumin (Sigma-Aldrich, Cat#A7284) in PBS at pH 7.4." No detergent is used during incubation with primary and secondary antibodies (not even the widely recommended Tween20!). I wonder whether this might be the source of the non-specific tissue labelling by donkey anti-chicken IgY that was considered evidence of chicken origin.

      Significance

      General assessment:

      The research has technical significance and encourages greater use of this alternative animal model in cancer studies to help us understand the regulation of the cancer cell cycle. This, in turn, may improve the implementation of existing treatment methods or uncover potential vulnerabilities in the cancer cell cycle. No clinical significance so far. Certain groups conducting cancer biology studies and modelling might be interested in implementing the described method, although it will be challenging without macroscopic evidence of the expected tumor nodules. The data and the methods are presented in such a way that reproducing them might be challenging. The experiments are not adequately replicated and the statistical analysis is inadequate.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      By using the CAM assay, the Authors of this study show that for the BxPC-3 and AsPC-1 cell lines, nucleoside labelling with 5-ethynyl-2'-deoxyuridine (EdU) can be multiplexed successfully with other cell-cycle markers such as cyclin B1 and Ki67, especially when combined with digital image analysis techniques. Starting from ED14, they observe the presence of a chicken embryonic cell type that appears to possess a high level of extranuclear EdU accumulation. Initial assessment of these cells showed that they are likely granulocytes which can be found in the embryonic liver of grafted and non-grafted embryos, as well as in xenograft sections. These cells do not express chicken MHC II, in turn making it less likely that they represent professional antigen presenting cells.

      Significance

      Remarks. This is a well-done study. However, I ask the Authors to also include figures showing the macroscopic in ovo evidence and to discuss the role of the CAM assay in the study of xenografts.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank the reviewers for their positive and constructive feedback.

      We apologise for the delay in coming back. The first author has moved to the LMB, and the Trost lab has been relocating to the University of Manchester, which delayed our ability to respond quickly.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Reviewer Comments

      The manuscript by Chatterjee et al. describes a novel ultra-sensitive isolation and deep proteomics workflow to investigate phagosome dynamics of bacterium-containing phagosomes. The method enables dual proteome coverage of both host and pathogen, and the authors report quantitative changes in the host and bacterial proteomes using Salmonella isogenic mutants defective in intracellular survival. They further leverage these datasets to assess the relevance of selected Salmonella genes in intracellular fitness.

      Overall, the manuscript presents a powerful and technically impressive approach that will be of significant interest to the infection biology community. The study is well conceived and addresses an important gap in the field. However, several clarifications and additions would strengthen the work and improve interpretability of the results.

      Specific Comments

      Line 76: The authors should consider including the following relevant citations: PMID: 30079117 and PMID: 31009521.

      We thank the reviewer for pointing this out. We have now included the suggested references.


      Line 104: Please define the abbreviation BFP clearly upon first use.

      We thank the reviewer; we have defined the abbreviation upon first instance.

      Figure 1A, Step 2: From the schematic, it is unclear whether the pellet or the supernatant is used for the subsequent step in which the CellVue dye is added. Please clarify.

      We thank the reviewer for bringing this to our attention. We have now modified Figure 1A.

      Figure 1B: It would be informative to report the percentage of S. Typhimurium that are double positive, especially in the BFP + Claret condition. A small bar plot for each condition would help visualize and compare the proportion of Claret-labelled bacteria.

      We have now included a figure for the percentage of BFP + Claret for STM in S1H.

      Figure 1C: The distinction between the upper and lower images is unclear. Do they represent different particles or different fields of view of the same sample? Please clarify.

      They both are from different fields of view.

      Line 122: The statement is not entirely accurate. Cells that lyse via pyroptosis will leave behind cellular remnants, including nuclei, that may still co-sediment with intact cells in such preparations.

      We have modified the sentence accordingly.

      Line 128: CellVue and Claret appear to be used interchangeably. Are they the same reagent? Please clarify and use consistent terminology throughout.

      We have rectified this inconsistency in our revised manuscript.

      Line 136: Please explain the basis for the stated estimates. If this is common knowledge within the field, additional explanation would still be helpful for non-experts.

      We have clarified this further in the manuscript. These numbers are estimates, but they give the reader an idea of how little material we are working with.

      Lines 143 & 145: Please define "protein IDs" and indicate how many correspond to host proteins versus Salmonella proteins.

      We have defined this in our revised manuscript. Also, to avoid any confusion, these proteomics methods were optimised using a commercially available HeLa protein digest, and hence no Salmonella proteins are detected here.

      Figure 2D: Please specify the number and type of replicates used. Also indicate the plot type (e.g., violin plot) and the statistical test used to determine significance.

      We have updated the figure legends for 2D and 2E, stating the number of biological replicates, i.e. n=4 and n=3.

      Line 244: Please consider citing PMID: 32514074 and PMID: 23162002.

      We have included these references.

      Line 253: Have the authors considered how their observations regarding MHC relate to prior findings (PMID: 27832589)?

      Thank you for suggesting this paper, which we enjoyed reading. However, since the paper suggested by the reviewer focusses on cell surface MHC molecules and we are looking at the phagolysosomal compartment, we feel it may be difficult to make connections.

      Line 265: Clarify which "cell" is being referred to: the host cell or the bacterial cell.

      We have modified the sentence to reduce confusion.

      Line 278: Have the authors considered how their observations on glycolytic proteins relate to earlier work (PMID: 19380470 and PMID: 37594988)?

      Thank you for pointing out these papers. We have cited both of these and added another sentence stating that intracellular STM utilises host metabolites.

      Line 285: The claim that "PhoP-dependent effectors actively remodel..." requires clarification. If the authors are referring to all PhoP-regulated genes as "effectors," this terminology may cause confusion, as "effectors" in the Salmonella field typically denotes T3SS-secreted proteins. While some T3SS effectors are PhoP-regulated, PhoP controls many additional genes, and the observed phenotypes may reflect broader defects in intracellular survival rather than absence of secreted effectors specifically. Rewording is recommended.

      Thank you for your suggestion; we have modified the text accordingly.

      Line 313: Have the authors examined later time points (e.g., 8 hpi), when the SCV is more established and SPI-2 effector expression is higher?

      We did not test the 8 hpi timepoint because our primary aim was to identify the induction of SPI-2 effectors at earlier stages. Testing later timepoints would be problematic, as PhoP mutants show poor survival at these times, which would confound comparisons between STM WT and PhoP mutants.

      Line 317: Were secreted SPI-2 effectors detectable using PhagoCyt, and if so, how did they behave?

      We detected some of the secreted effectors as well, and they are in accordance with the literature. As expected, most of them were detected only in WT at 4 hpi.

      For example, PipB2, SseL and SctB1 are significantly decreased in the PhoP mutant compared to the STM WT at 4 hpi.

      Line 319: Have the candidate Salmonella mutants been evaluated at later time points (6-8 hpi)? Stronger phenotypic differences may emerge when intracellular replication relies more heavily on SPI-2 function.

      We acknowledge that there may be larger differences at later time points; however, we wanted to be comparable with the data within the manuscript, i.e. within the 4-hour time point that we have kept throughout. Moreover, at later time points we see increased macrophage cell death and therefore refrain from using time points much later than the 4-hour mark.

      Figure 5B: For all mutant strains, please also report in vitro growth to determine whether the phenotypes reflect general growth defects or are specific to the intracellular environment.

      We have performed the growth curve for the PhoP mutant, which is in the supplemental figure 1.

      Line 336: As above, please reconsider the use of the term "effectors." Unless evidence is provided that these are bona fide secreted SPI-2 effectors, an alternative term would avoid confusion.

      We have modified the sentence to reduce confusion.

      Supplementary Figure 5: The volcano plots appear pixelated. Please provide higher-resolution versions.

      Thank you for pointing this out. We have rectified this.

      Reviewer #1 (Significance (Required)):

      General assessment:

      This study introduces a highly sensitive dual host-pathogen proteomics workflow for profiling bacterium-containing phagosomes. Its key strengths are the technical innovation and the mechanistic insight gained using Salmonella mutants. The main areas needing improvement are clarification of methodological details and tighter interpretation of some biological claims.

      Advance:

      To my knowledge, this is the first study to achieve such deep, simultaneous proteomic coverage of both host and intracellular bacteria within purified phagosomes. This represents a notable technical advance and provides new mechanistic insight into intracellular adaptation and immune regulation.

      Audience:

      The work will interest a specialized audience in infection biology, host-pathogen interactions, and proteomics, with broader relevance for researchers studying organelle isolation or intracellular pathogens. The workflow and datasets will be useful as a resource for future studies.

      Reviewer expertise:

      Expertise in host-pathogen interactions, bacterial intracellular survival, macrophage biology, and functional proteomics. Limited expertise in MS instrumentation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this work, Chatterjee, Rubio and colleagues use a novel flow cytometry-based method to isolate phagosomes from Salmonella infected macrophages. This method is applied both to wild-type and to a mutant (deletion of phoP) that does not express virulence genes, prior to the proteome characterization of these phagosomes and the bacteria that they contain. The experiments were done at an early point of infection (30 min) and a later time point (4 h). The authors first identified mitochondrial proteins in their analysis, which had previously been considered contaminants from the preparation of phagosomes. However, some Salmonella effector proteins are known to affect mitochondria, and the authors demonstrate that inhibition of Complex I showed decreased Salmonella intracellular viability. Comparing WT and the phoP mutant also highlighted two Salmonella proteins that enhance intracellular survival. In addition, the authors show that their method recapitulates previously known proteins involved in Salmonella infection. The study is well designed and clearly written.

      I have only some minor comments that I hope will strengthen the work:

      It would be interesting to compare the results with a whole cell proteome analysis, and to other approaches that involve subcellular fractionation (both in the context of Salmonella infection) to: a) highlight proteins that are specifically changing in abundance in the phagosomes (but not necessarily in the cell), and b) to show that this approach is able to capture previously unknown phenomena. To avoid performing additional experiments, the authors can compare their dataset to previous proteomic datasets of Salmonella infection.

      We have compared this with the ultracentrifugation methods STM WT 4h vs STM WT uptake (Figure 6A).

      A color scale for the heatmap in Fig 2C is needed. I assume that this heatmap shows intensity and not fold-changes, and thus suggest that the authors use a single-color gradient for easier visualization.

      This has now been included.

      Best regards,

      André Mateus

      Reviewer #2 (Significance (Required)):

      General assessment: This study provides a novel approach to study intracellular pathogenic bacteria. The method is applied to Salmonella, but can potentially be used for any bacteria, including non-genetically tractable organisms. A strength of the approach is that it captures the bacterial proteome, which is mostly undetectable when studying infected cells. Further, by enriching phagosomes, it allows measuring the spatial distribution of proteins to these organelles. The study could be improved by distinguishing proteome changes that are caused by trafficking of proteins to phagosomes vs general changes in protein abundance.

      Advance: Apart from a new methodology, the authors use the approach to identify novel aspects of Salmonella infection biology, e.g., the importance of mitochondrial proteins in host defense or novel Salmonella proteins that are involved in intracellular survival.

      Audience: The audience for this study is mostly those in the field of infection biology, particularly Salmonella. The dataset generated can be used to identify novel aspects of Salmonella infection, and the described method could be applied to other pathogens.

      My field of expertise: Proteomics, microbiology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In the manuscript "Flow cytometry-based isolation of Salmonella-containing phagosomes combined with ultra-sensitive proteomics reveals novel insights into host-pathogen interactions", the authors describe a new method for analysing the composition of pathogen-containing phagosomes and the pathogens within. The combination of FACS-based single phagosome analysis and sorting with optimised, highly sensitive proteomic analysis of sorted vesicles has potential for identifying host-pathogen interactions that have so far been overlooked. Although this is well described in the manuscript, some controls are missing.

      Major comments:

      1) The sorting of labelled bacteria is a crucial bottleneck in the whole procedure. The gating strategy presented in Fig. 1B suggests that the initial "bacterial phagosome size" gate is limited from the bottom, based on the noise signal, but not from the top. Therefore, any unbroken THP-1 cell remaining in the sample would also be included in the analysis. Given the very high sensitivity of the mass spectrometry procedure and the high abundance of housekeeping proteins in host cells, this contamination could well explain the appearance of the mitochondrial, ribosomal, and nuclear envelope proteins identified in Fig 2B and undermine the subsequent results. Therefore, the gating strategy should be more stringent, and data from this more stringent gating should be compared with the current data sets. Since the authors use BFP+ Salmonella and do not analyse the claret+BFP- events, a BFP vs FSC gating step could help to distinguish free bacteria, bacteria in vesicles, and unbroken or only partially broken host cells.

      We use a series of centrifugations to ensure that we do not have intact cells in the prepared samples. We have also visualised the final samples under the microscope and did not observe any intact cells. Because of the side/forward scatter gating, intact cells are not within the field of sorting. In Figure 1B we show that free bacteria are not within the gating strategy that we used. Finally, we visually inspected >100 pictures of sorted phagosomes by imaging flow cytometry and did not see any intact cells or free bacteria.

      2) Since the authors present data previously well accepted as contaminations from other fractions, these shall be carefully validated by other methods. For example the contact of mitochondria with SCV could be validated using a FRET- or split FP- based assays. Change of abundance of surface proteins on SCV in individual timepoints shall be validated using antibody-based flow cytometry on isolated SCVs. Most relevant antibodies are already described in the manuscript or available commercially (IL4R, IFNgR, integrins, TLRs). Microscopy-based quantification could help with the soluble proteins present within SCVs.

      We agree with the reviewer that this would be very interesting. However, we feel that this is outside of the scope of this paper and will be very laborious and time consuming, practically a whole project in itself.

      3) Since the authors describe an alternative method to methods used previously, they shall discuss the differences in results obtained by the formerly used methods.

      We have now provided a dataset of SCVs isolated using ultracentrifugation as a comparative analysis to our method (Figure S6A and Table S8). The data show that the ultracentrifugation-isolated phagosomes have many more proteins from any organelle (Figure S6B), suggesting that they are less pure than the phagosomes isolated by the PhagoCyt approach.

      4) Only 15 Salmonella proteins downregulated between 0.5 and 4 h timepoints were identified. However, at least genes from SPI-1 and flagella would be expected to be downregulated at 4 h p.i. How do the authors explain this discrepancy? In contrast, are the SPI-2 genes among those identified as upregulated?

      In our supplementary table 6 (comparison between WT 4h vs WT uptake), we see that there are 458 Salmonella proteins that are only present in uptake samples. These were not included in the limma analysis since they are completely absent in the WT 4h samples. We decided to report these as “unique” proteins rather than perform imputation. In Figure 5B, we specifically highlight down-regulated STM proteins, which include flagellar proteins and SPI-1 proteins.

      To answer your second question, yes, several SPI-2 genes (effectors and other regulatory proteins) are upregulated at 4 hpi. 131 Salmonella proteins are significantly upregulated, and 55 proteins are exclusively present in the WT 4hpi samples. Some selected examples are in Figure 5A.

      Minor comments:

      1) Fig 1: the figure caption seems to retain parts of an older version, mentioning blue bars not present in the current version?

      The figure caption appears to be correct for us; the “blue” is in the unstained BFP Salmonella, which is hidden behind the purple, which is the BFP Salmonella + CellVue Claret.

      2) Fig 1A point 1, how were the dead cells removed? Normal centrifugation is not able to discriminate dead and living cells well enough as percoll gradient centrifugation for example would be. Such gradient centrifugation is not mentioned in the Methods section though.

      We have not used Percoll-based centrifugation to remove dead cells; instead, we have washed the adherent macrophages in dishes 3-4 times with ice-cold PBS to remove dead, floating cells, and then washed the pellet several times with PBS to ensure we are not taking any dead cells into the sample preparation.

      3) Fig 1A point 2, did the authors check for the composition of the pellet fraction in each centrifugation step? What are the losses and cross contaminations of the other fraction?

      No, we have not checked the composition of each fraction using mass spec; however, we did run some western blots to correctly identify the major organelle contribution in each fraction.

      4) Suppl. Fig 1, captions for panels F and G are missing. The axis in panel G is misleading - the bacteria obtained in "output" contain proliferating intracellular bacteria that originate only from a fraction of the "input" bacteria. Since the figure clearly shows an increase in the number of intracellular bacteria and all the extracellular bacteria should be killed by gentamicin, all bacteria in the "output" probably proliferate intracellularly and, therefore, originate from the same fraction of the "input" throughout the whole assay. Showing these results as CFU per well/plate/surface area or cell count would be more exact; in this case the "input" data should be shown as a separate data point.

      We thank the reviewer for this observation. We have now modified the figure legends. These are normalised per cell, and we think they provide accurate results.

      5) Fig 1B, could the authors show the percentages in individual quadrants for the green "Sample with BFP Salmonella + claret"?

      Yes, the plot in Supplementary Figure 1H depicts the percentage. This varies between WT and the PhoP mutant, and hence we decided not to show this in one figure.


      6) All proteins identified as significantly up or down represented shall be listed in a supplementary file.

      They are listed in the supplemental tables.

      7) Fig 2C suggests that some mitochondrial proteins are similarly present at the SCV containing WT Salmonella at 4h as ∆phoP mutant at 0.5 h p.i. Could the authors speculate how is that? The scale of blue/orange transition shall be shown in Fig 2C.

      We speculate that WT Salmonella heavily arrests the maturation of its SCVs, so that they resemble the early SCVs of a mutant that is unable to arrest the SCV degradation stages.

      8) In the Fig 2D, the authors show decrease of CFU obtained from THP-1 cells treated with Rotenone. However, rotenone is known to induce host cell apoptosis. Were the presented data normalized to amount of living host cells in the sample? For example measurement of protein concentration in the sample lysate after washing away the dying host cells should enable this.

      Yes, we have normalised the data to account for the percentage of live cells using live/dead staining. However, at the timepoints used, we did not observe significant cell death.

      9) Microscopy-based observation of mitochondria relocation to SCVs in time shall strengthen the claim that mitochondria-derived ROS are involved in anti-Salmonella host defense.

      Multiple published studies (PMID: 38356294, PMID: 41444067, PMID: 15866946, PMID: 41198672) support our data in this regard.

      10) The Salmonella proteins identified in the Fig 5 shall be validated using qPCR.

      We think that qPCR data would not be accurate for validating Salmonella proteins, as it has been shown that Salmonella mRNAs can have sub-minute half-lives (PMID: 38527194). We used rather conservative proteomics analysis settings, which were shown in a recent preprint from our lab to give 0% false discoveries and a 0.4% false quantitative rate (https://doi.org/10.1101/2025.09.22.677725). We acknowledge that another reviewer did not find this experiment to be essential.

      Reviewer #3 (Significance (Required)):

      The manuscript was reviewed mainly from the Salmonella and flow cytometry/FACS expertise point of view. The main interest in the study lies within its methodological advances - combination of single vesicle analysis using flow cytometry/FACS with highly sensitive mass spectrometry analysis. In comparison to other similar studies in the field, this combination significantly expands the possibilities of sorting of distinct subpopulations of vesicles from the same cells. This will make the article of interest to scientists in the broad field of host-pathogen interactions and immunology.

      **Referee cross-commenting**

      Reviewer 3 - @Reviewer #1: I see your point and leave it at the editors to judge how important this comment is. My reasoning was this: Fig 5 serves as a proof of concept that PhagoCyt has the power to make new discoveries in Salmonella biology. While behavior of some of the proteins shown if Fig 5 is well described (e.g. flagella or SPI-1 T3SS components and effectors), some are novel and to prove the functionality of the method, these results should be confirmed by some other well accepted mean. Given the great sensitivity of PhagoCyt, other proteomic approaches are unlikely to help in this case (e.g. flagella or SPI-1 T3SS components and effectors are not detectable by western blot at 4 h p.i.). Therefore, I suggest qPCR (but would accept any other method as well) as a very sensitive and well accepted approach, but leave at the authors to chose what proteins they want to use for the validation.

      Reviewer 1- I agree with comments raised by the other two reviewers, except the following point from Reviewer 3 '10) The Salmonella proteins identified in the Fig 5 shall be validated using qPCR.' It is not clear which proteins are being referred to and it is unclear to this reviewer how this experiment(s) would improve the manuscript in its current form.

      Reviewer 3- I agree with all comments raised.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In the manuscript "Flow cytometry-based isolation of Salmonella-containing phagosomes combined with ultra-sensitive proteomics reveals novel insights into host-pathogen interactions", the authors describe a new method for analysis of composition pathogen-containing phagosomes and the pathogens within. Combination of FACS-based single phagosome analysis and sorting combined with optimised highly sensitive proteomic analysis of sorted vesicles has potential for identification of so far overlooked host-pathogen interactions. Although this is well described in the manuscript, some controls are missing.

      Major comments:

      1. The sorting of labelled bacteria is a crucial bottleneck in the whole procedure. The gating strategy presented in the Fig. 1B suggest that the initial "bacterial phagosome size" is limited from the bottom based on the noise signal but not from top. Therefore any not broken THP-1 cell remaining in the sample would be also included in the analysis. In respect to very high sensitivity of the mass spectrometry procedure and high abundance of housekeeping genes in host cells, this contamination could well explain the appearance of mitochondria, ribosome, and nuclear envelope proteins identified in Fig 2B and undermine the following results. Therefore, the gating strategy should be more stringent and data from this more stringent gating shall be compared with the current data sets. Since the authors use BFP+ Salmonella and do not analyse the claret+BFP- events, a BFP vs FSC gating step could help to distinguish free bacteria, bacteria in vesicles, and not or only partially broken host cells.
      2. Since the authors present data previously well accepted as contaminations from other fractions, these shall be carefully validated by other methods. For example the contact of mitochondria with SCV could be validated using a FRET- or split FP- based assays. Change of abundance of surface proteins on SCV in individual timepoints shall be validated using antibody-based flow cytometry on isolated SCVs. Most relevant antibodies are already described in the manuscript or available commercially (IL4R, IFNgR, integrins, TLRs). Microscopy-based quantification could help with the soluble proteins present within SCVs.
      3. Since the authors describe an alternative method to methods used previously, they shall discuss the differences in results obtained by the formerly used methods.
      4. Only 15 Salmonella proteins downregulated between 0.5 and 4 h timepoints were identified. However, at least genes from SPI-1 and flagella would be expected to be downregulated at 4 h p.i. How do the authors explain this discrepancy? In contrast, are the SPI-2 genes among those identified as upregulated?

      Minor comments:

      1. Fig 1, the figure caption seems to remain parts of an older version, mentioning blue bars not present in the current version?
      2. Fig 1A point 1, how were the dead cells removed? Normal centrifugation is not able to discriminate dead and living cells well enough as percoll gradient centrifugation for example would be. Such gradient centrifugation is not mentioned in the Methods section though.
      3. Fig 1A point 2, did the authors check for the composition of the pellet fraction in each centrifugation step? What are the losses and cross contaminations of the other fraction?
      4. Suppl. Fig 1, caption for panels F and G are missing. The axis in the panel G is misleading - the bacteria obtained in "output" contain proliferating intracellular bacteria that originate only from a fraction of the "input" bacteria. Since the figure clearly show increase in the number of intracellular bacteria and all the extracellular bacteria should be killed by gentamicin, all bacteria in the "output" probably proliferate intracellularly and, therefore, originate from the same fraction of the "input" throughout the whole assay. Showing these results as CFU per well/plate/surface area or cell count would be more exact, in this case the "input" data shall be shown as a separate data point.
      5. Fig 1B, could the authors show the percentages in individual quadrants for the green "Sample with BFP Salmonella + claret"?
      6. All proteins identified as significantly up or down represented shall be listed in a supplementary file.
      7. Fig 2C suggests that some mitochondrial proteins are similarly present at the SCV containing WT Salmonella at 4h as ∆phoP mutant at 0.5 h p.i. Could the authors speculate how is that? The scale of blue/orange transition shall be shown in Fig 2C.
      8. In the Fig 2D, the authors show decrease of CFU obtained from THP-1 cells treated with Rotenone. However, rotenone is known to induce host cell apoptosis. Were the presented data normalized to amount of living host cells in the sample? For example measurement of protein concentration in the sample lysate after washing away the dying host cells should enable this.
      9. Microscopy-based observation of mitochondria relocation to SCVs in time shall strengthen the claim that mitochondria-derived ROS are involved in anti-Salmonella host defense.
      10. The Salmonella proteins identified in the Fig 5 shall be validated using qPCR.

      Referee cross-commenting

      Reviewer 3 - @Reviewer #1: I see your point and leave it at the editors to judge how important this comment is. My reasoning was this: Fig 5 serves as a proof of concept that PhagoCyt has the power to make new discoveries in Salmonella biology. While behavior of some of the proteins shown if Fig 5 is well described (e.g. flagella or SPI-1 T3SS components and effectors), some are novel and to prove the functionality of the method, these results should be confirmed by some other well accepted mean. Given the great sensitivity of PhagoCyt, other proteomic approaches are unlikely to help in this case (e.g. flagella or SPI-1 T3SS components and effectors are not detectable by western blot at 4 h p.i.). Therefore, I suggest qPCR (but would accept any other method as well) as a very sensitive and well accepted approach, but leave at the authors to chose what proteins they want to use for the validation.

      Reviewer 1- I agree with comments raised by the other two reviewers, except the following point from Reviewer 3 '10) The Salmonella proteins identified in the Fig 5 shall be validated using qPCR.' It is not clear which proteins are being referred to and it is unclear to this reviewer how this experiment(s) would improve the manuscript in its current form.

      Reviewer 3- I agree with all comments raised.

      Reviewer 2- I agree with the other reviewer's comments/suggestions.

      Significance

      The manuscript was reviewed mainly from the Salmonella and flow cytometry/FACS expertise point of view. The main interest in the study lies within its methodological advances - combination of single vesicle analysis using flow cytometry/FACS with highly sensitive mass spectrometry analysis. In comparison to other similar studies in the field, this combination significantly expands the possibilities of sorting of distinct subpopulations of vesicles from the same cells. This will make the article of interest to scientists in the broad field of host-pathogen interactions and immunology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this work, Chatterjee, Rubio and colleagues use a novel flow cytometry-based method to isolate phagosomes from Salmonella infected macrophages. This method is applied both to wild-type and to a mutant (deletion of phoP) that does not express virulence genes, prior to the proteome characterization of these phagosomes and the bacteria that they contain. The experiments were done at an early point of infection (30 min) and a later time point (4 h). The authors first identified mitochondrial proteins in their analysis, which had previously been considered contaminants from the preparation of phagosomes. However, some Salmonella effector proteins are known to affect mitochondria, and the authors demonstrate that inhibition of Complex I showed decreased Salmonella intracellular viability. Comparing WT and the phoP mutant also highlighted two Salmonella proteins that enhance intracellular survival. In addition, the authors show that their method recapitulates previously known proteins involved in Salmonella infection. The study is well designed and clearly written.

      I have only some minor comments that I hope will strengthen the work:

      1. It would be interesting to compare the results with a whole cell proteome analysis, and to other approaches that involve subcellular fractionation (both in the context of Salmonella infection) to: a) highlight proteins that are specifically changing in abundance in the phagosomes (but not necessarily in the cell), and b) to show that this approach is able to capture previously unknown phenomena. To avoid the performing additional experiments, the authors can compare their dataset to previous proteomic datasets of Salmonella infection.
      2. A color scale for the heatmap in Fig 2C is needed. I assume that this heatmap shows intensity and not fold-changes, and thus suggest that the authors use a single-color gradient for easier visualization.

      Best regards, André Mateus

      Referee cross-commenting

      Reviewer 3 - @Reviewer #1: I see your point and leave it at the editors to judge how important this comment is. My reasoning was this: Fig 5 serves as a proof of concept that PhagoCyt has the power to make new discoveries in Salmonella biology. While behavior of some of the proteins shown if Fig 5 is well described (e.g. flagella or SPI-1 T3SS components and effectors), some are novel and to prove the functionality of the method, these results should be confirmed by some other well accepted mean. Given the great sensitivity of PhagoCyt, other proteomic approaches are unlikely to help in this case (e.g. flagella or SPI-1 T3SS components and effectors are not detectable by western blot at 4 h p.i.). Therefore, I suggest qPCR (but would accept any other method as well) as a very sensitive and well accepted approach, but leave at the authors to chose what proteins they want to use for the validation.

      Reviewer 1- I agree with comments raised by the other two reviewers, except the following point from Reviewer 3 '10) The Salmonella proteins identified in the Fig 5 shall be validated using qPCR.' It is not clear which proteins are being referred to and it is unclear to this reviewer how this experiment(s) would improve the manuscript in its current form.

      Reviewer 3- I agree with all comments raised.

      Reviewer 2- I agree with the other reviewer's comments/suggestions.

      Significance

      General assessment: This study provides a novel approach to study intracellular pathogenic bacteria. The method is applied to Salmonella, but can potentially be used for any bacteria, including non-genetically tractable organisms. A strength of the approach is that it captures the bacterial proteome, which is mostly undetectable when studying infected cells. Further, by enriching phagosomes, it allows measuring the spatial distribution of proteins to these organelles. The study could be improved by distinguishing proteome changes that are caused by trafficking of proteins to phagosomes vs general changes in protein abundance.

      Advance: Apart from a new methodology, the authors use the approach to identify novel aspects of Salmonella infection biology, e.g., the importance of mitochondrial proteins in host defense or novel Salmonella proteins that are involved in intracellular survival.

      Audience: The audience for this study is mostly those in the field of infection biology, particularly Salmonella. The dataset generated can be used to identify novel aspects of Salmonella infection, and the described method could be applied to other pathogens.

      My field of expertise: Proteomics, microbiology.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Chatterjee et al. describes a novel ultra-sensitive isolation and deep proteomics workflow to investigate phagosome dynamics of bacterium-containing phagosomes. The method enables dual proteome coverage of both host and pathogen, and the authors report quantitative changes in the host and bacterial proteomes using Salmonella isogenic mutants defective in intracellular survival. They further leverage these datasets to assess the relevance of selected Salmonella genes in intracellular fitness. Overall, the manuscript presents a powerful and technically impressive approach that will be of significant interest to the infection biology community. The study is well conceived and addresses an important gap in the field. However, several clarifications and additions would strengthen the work and improve interpretability of the results.

      Specific Comments

      Line 76: The authors should consider including the following relevant citations: PMID: 30079117 and PMID: 31009521.
      Line 104: Please define the abbreviation BFP clearly upon first use.
      Figure 1A, Step 2: From the schematic, it is unclear whether the pellet or the supernatant is used for the subsequent step in which the CellVue dye is added. Please clarify.
      Figure 1B: It would be informative to report the percentage of S. Typhimurium that are double positive, especially in the BFP + Claret condition. A small bar plot for each condition would help visualize and compare the proportion of Claret-labelled bacteria.
      Figure 1C: The distinction between the upper and lower images is unclear. Do they represent different particles or different fields of view of the same sample? Please clarify.
      Line 122: The statement is not entirely accurate. Cells that lyse via pyroptosis will leave behind cellular remnants, including nuclei, that may still co-sediment with intact cells in such preparations.
      Line 128: CellVue and Claret appear to be used interchangeably-are they the same reagent? Please clarify and use consistent terminology throughout.
      Line 136: Please explain the basis for the stated estimates. If this is common knowledge within the field, additional explanation would still be helpful for non-experts.
      Lines 143 & 145: Please define "protein IDs" and indicate how many correspond to host proteins versus Salmonella proteins.
      Figure 2D: Please specify the number and type of replicates used. Also indicate the plot type (e.g., violin plot) and the statistical test used to determine significance.
      Line 244: Please consider citing PMID: 32514074 and PMID: 23162002.
      Line 253: Have the authors considered how their observations regarding MHC relate to prior findings (PMID: 27832589)?
      Line 265: Clarify which "cell" is being referred to-the host cell or the bacterial cell.
      Line 278: Have the authors considered how their observations on glycolytic proteins relate to earlier work (PMID: 19380470 and PMID: 37594988)?
      Line 285: The claim that "PhoP-dependent effectors actively remodel..." requires clarification. If the authors are referring to all PhoP-regulated genes as "effectors," this terminology may cause confusion, as "effectors" in the Salmonella field typically denotes T3SS-secreted proteins. While some T3SS effectors are PhoP-regulated, PhoP controls many additional genes, and the observed phenotypes may reflect broader defects in intracellular survival rather than absence of secreted effectors specifically. Rewording is recommended.
      Line 313: Have the authors examined later time points (e.g., 8 hpi), when the SCV is more established and SPI-2 effector expression is higher?
      Line 317: Were secreted SPI-2 effectors detectable using PhagoCyt, and if so, how did they behave?
      Line 319: Have the candidate Salmonella mutants been evaluated at later time points (6-8 hpi)? Stronger phenotypic differences may emerge when intracellular replication relies more heavily on SPI-2 function.
      Figure 5B: For all mutant strains, please also report in vitro growth to determine whether the phenotypes reflect general growth defects or are specific to the intracellular environment.
      Line 336: As above, please reconsider the use of the term "effectors." Unless evidence is provided that these are bona fide secreted SPI-2 effectors, an alternative term would avoid confusion.
      Supplementary Figure 5: The volcano plots appear pixelated. Please provide higher-resolution versions.

      Referee cross-commenting

      Reviewer 3 - @Reviewer #1: I see your point and leave it at the editors to judge how important this comment is. My reasoning was this: Fig 5 serves as a proof of concept that PhagoCyt has the power to make new discoveries in Salmonella biology. While behavior of some of the proteins shown if Fig 5 is well described (e.g. flagella or SPI-1 T3SS components and effectors), some are novel and to prove the functionality of the method, these results should be confirmed by some other well accepted mean. Given the great sensitivity of PhagoCyt, other proteomic approaches are unlikely to help in this case (e.g. flagella or SPI-1 T3SS components and effectors are not detectable by western blot at 4 h p.i.). Therefore, I suggest qPCR (but would accept any other method as well) as a very sensitive and well accepted approach, but leave at the authors to chose what proteins they want to use for the validation.

      Reviewer 1- I agree with comments raised by the other two reviewers, except the following point from Reviewer 3 '10) The Salmonella proteins identified in the Fig 5 shall be validated using qPCR.' It is not clear which proteins are being referred to and it is unclear to this reviewer how this experiment(s) would improve the manuscript in its current form.

      Reviewer 3- I agree with all comments raised.

      Reviewer 2- I agree with the other reviewer's comments/suggestions.

      Significance

      General assessment:

      This study introduces a highly sensitive dual host-pathogen proteomics workflow for profiling bacterium-containing phagosomes. Its key strengths are the technical innovation and the mechanistic insight gained using Salmonella mutants. The main areas needing improvement are clarification of methodological details and tighter interpretation of some biological claims.

      Advance:

      To my knowledge, this is the first study to achieve such deep, simultaneous proteomic coverage of both host and intracellular bacteria within purified phagosomes. This represents a notable technical advance and provides new mechanistic insight into intracellular adaptation and immune regulation.

      Audience:

      The work will interest a specialized audience in infection biology, host-pathogen interactions, and proteomics, with broader relevance for researchers studying organelle isolation or intracellular pathogens. The workflow and datasets will be useful as a resource for future studies.

      Reviewer expertise:

      Expertise in host-pathogen interactions, bacterial intracellular survival, macrophage biology, and functional proteomics. Limited expertise in MS instrumentation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all three reviewers for their careful and constructive engagement with our manuscript. We are encouraged by their overall positive assessment of the work. Reviewer 1 described this as "an important study" that addresses a significant gap in understanding systemic, inter-organ responses to hypoxia, and noted the potential relevance of our findings to mammalian IL-6 biology. Reviewer 2 highlighted the study as being of "high significance" and described it as "a foundation study that will be the motivation for numerous high-impact papers in the future", noting its broad relevance to understanding hypoxia in both health and disease. In the revised manuscript, we have addressed all of the reviewers' comments and critiques. This includes performing several new experiments, expanding our Discussion, and making a number of clarifications to the text, figures, and methods as detailed below.

      Reviewer #1


      (Evidence, reproducibility and clarity (Required)): The authors describe a role of Unpaired 3 (Upd3) in tissue communication in responses to hypoxia in Drosophila adult flies. Upd3 mRNA is strongly upregulated in hypoxia, along with well-characterized JAK/STAT downstream target genes, in both adult fly males and females, as well as in larvae. Interestingly, adult females but not males require Upd3 for 15 to 24 h survival in hypoxia, as Upd3 mutant females but not males die to a much larger proportion in these conditions. Adult females display strong hypoxic upregulation of Upd3 in the gut, assessed by RT-PCR or through a Gal4 transcriptional reporter, mainly in epithelial enterocytes. Enterocyte-specific RNAi-mediated KD indicated that this enterocyte expression of Upd3 represents about 40% of Upd3 expression in the whole body. Enterocyte-specific KD of Upd3 in adult females significantly reduced survival in hypoxia, suggesting that this expression is critical for hypoxic adaptation. Tissue-specific analysis of the expression of the STAT target genes, SOCS36E, TotA and TotM revealed that stimulation of the JAK/STAT pathway in hypoxia is widespread, although more pronounced in abdominal tissues. Indeed, overexpression of Upd3 in enterocytes provokes upregulation of both target genes TotA and TotM. Consistent with this, RNAi-dependent inhibition of the JAK/STAT pathway in the fat body and oenocytes significantly reduced survival of female flies in hypoxia. Nitric oxide synthase (NOS) is strongly upregulated in adult female abdomens upon hypoxic exposure, and KD of NOS in fat body and oenocytes reduced hypoxic survival. Surprisingly, they found that ubiquitous KD of HIFa/Sima led to mitigation of Upd3 hypoxic induction and, more clearly, to JAK/STAT target gene induction. HIF KD flies displayed increased lethality in hypoxia, and this lethality was slightly mitigated in Upd3 heterozygous flies. The authors conclude that increased lethality of HIF-minus flies in hypoxia stems at least in part from excessive levels of Upd3. The authors then find that HIF/Sima-dependent inhibition of Upd3 expression is non-cell autonomous, since KD of Sima specifically in the gut does not affect expression of Upd3 in this organ. Instead, Sima KD at the fat body led to significant increase of Upd expression in the gut, suggesting that a Sima-born signal communicates these two organs, leading to restriction of Upd3 intestinal expression. ROS does not seem to be the signal that communicates the fat body with the gut, as expression of catalase in the fat body did not affect expression of Upd3 in the gut.

      (Significance (Required)): This is an important study, because most previous studies have focused on cell-autonomous responses to hypoxia, but much less is known about systemic responses to low oxygen conditions, particularly in relation to inter-organ communication during these responses. This work defines the cytokine unpaired 3, homolog of human interleukin 6, as a major regulator of systemic responses to hypoxia. Future studies will determine if interleukin 6 plays similar roles in mammals. This work might be of interest for a broad audience interested in responses to hypoxia, as well as general physiology.

      We thank Reviewer 1 for their careful reading and comments on the manuscript. We are pleased that they found this to be "an important study" that addresses a gap in understanding systemic, inter-organ responses to hypoxia. We have addressed each of their concerns in the revised manuscript as outlined below.

      **MAJOR CONCERNS**

      1) Figure 1 lacks statistical analysis. It is important to determine if the apparent differences in gene expression are statistically significant.

      We have now added the statistical analyses to the revised version of the figures.

      2) Is NOS expression in fat body/oenocytes JAK/STAT-dependent? Block the pathway in hypoxia specifically in these cells and check.

      To address this, we blocked JAK/STAT signaling specifically in fat body/oenocytes under hypoxia and examined the expression of Nos, as well as bnl and Hipk - two additional genes we find are regulated by gut-derived Upd3 and required for hypoxia tolerance.

      Interestingly, fat body/oenocyte-specific knockdown of STAT92E suppressed hypoxia-induced Hipk expression but did not affect Nos or bnl expression in these tissues. These results suggest that gut-derived Upd3 can control fat body/oenocyte expression of hypoxia regulators through both direct and indirect (relay) mechanisms. There is precedent for indirect relay signaling in other Upd3/Upd2-mediated inter-organ responses. For example, in response to CO2, neuronal Upd3 controls blood cell differentiation in the lymph gland; however, this effect is not direct - Upd3 first signals to the fat body to induce Dilp6 expression, and Dilp6 then signals to the lymph gland to regulate hematopoiesis. A second example involves gut-derived Upd2: upon infection, Upd2 controls olfactory behavior, but does so via a relay in which Upd2 signals to glial cells, which in turn alter apolipoprotein expression, and these then modify olfactory neuron function.

      We have incorporated the new tissue-specific data into the manuscript and expanded the Discussion to address both direct and indirect modes of Upd3 action (Fig 5 and lines 427-441).

      3) The authors relate the HIF-dependent limitation of Upd3 induction in hypoxia to regulation of cytokine-dependent immune responses in mammals; specifically they propose a parallel with a cytokine storm. This relationship is unclear to this reviewer, as in the Drosophila response Upd3 fulfils a signalling function (rather than immunological). I suggest they consider modifying this assumption.

      We appreciate this comment. Our intent in drawing a comparison to the mammalian cytokine storm response was to illustrate the concept of fine-tuning cytokine responses, where too little or too much signaling can be deleterious, as we observe when comparing upd3 mutants to upd3-overexpressing animals. We have revised the Discussion to retain this concept while tempering the suggestion that our findings directly mirror cytokine storm pathologies in humans (lines 511-536).

      4) Mitigation of lethality of HIF KD flies in Upd3 heterozygotes is very modest. Thus, the conclusion that one of the mechanisms by which HIF mediates adaptation to hypoxia is through inhibition of Upd3 expression is not sufficiently supported by the data. It seems like an over-interpretation of the results.

      We agree that the rescue is modest, and we would argue this may be expected given HIF-1's role as a master regulator that coordinates many gene expression changes required for hypoxia tolerance. Loss of HIF-1 therefore likely disrupts multiple essential processes simultaneously - including metabolic reprogramming and tracheal remodeling - that may not be restored by reducing upd3 dosage. We take the reviewer's point that this should not be framed as a primary mechanism. The partial reversal of lethality in upd3 heterozygotes nonetheless implicates excessive Upd3 signaling as one small component of what HIF-1 does to promote hypoxia adaptation, and we have revised the manuscript language to reflect this more measured interpretation (lines 529-536).

      5) HIF expression is well-known to reduce ROS levels in hypoxia by controlling mitochondrial activity through a wide array of mechanisms. Thus, this reviewer feels that the experiments utilized to rule out a role of ROS in fat body-to-gut communication are insufficient. Catalase reduces hydrogen peroxide levels, but not necessarily other reactive oxygen species. The authors might try to express other ROS scavengers such as superoxide dismutase. In addition, expression of scavengers should be carried out both at the fat body and gut.

      We thank the reviewer for this important point. We have now addressed it by overexpressing CatA, SOD1, or SOD2 individually in either fat body or enterocytes and measuring hypoxia-induced upd3 expression in each case. In all six conditions, hypoxia-induced upd3 expression was unaffected (Figs. S6B–G). Together, these experiments scavenge both hydrogen peroxide and superoxide in both tissues and collectively argue against a role for ROS in mediating upd3 induction.

      __MINOR CONCERNS__

      6) The authors state that hypoxic upregulation of Upd3 in the gut occurs mostly in "large epithelial enterocytes". In Figure 3B, it is evident that GFP is not expressed in all cells; please use cell-type-specific markers to identify which cells express the cytokine.

      We appreciate this suggestion. Despite multiple requests to different laboratories, we were unable to obtain antibodies suitable for marking enterocyte subtypes in this context. To address the question of cell identity genetically, we used drivers specific for enterocytes (mex-GAL4) or progenitor cells (stem cells and enteroblasts; esg-GAL4) to drive RNAi-mediated knockdown of upd3 and then measured the effect on hypoxia-induced upd3 expression in whole guts. These experiments indicate that hypoxia-induced upd3 expression occurs mostly in enterocytes, with a smaller contribution from progenitor cells. This mirrors previous findings showing that infection-induced upd3 induction occurs in both enterocytes and enteroblasts, and supports our conclusion that enterocytes are the predominant source of hypoxia-induced Upd3. We have incorporated these results into the revised manuscript (Fig 3C and Fig S2C).

      7) The title of Fig 4 caption reads "Gut-derived upd3 controls adipose expression of hypoxia regulators." Only one hypoxia regulator has been analysed: Nitric Oxide Synthase. Please change the title to "Gut-derived upd3 controls adipose expression of Nitric Oxide Synthase."

      In the revised manuscript we now show that gut-derived Upd3 controls the expression of Nos, bnl, and Hipk in fat body and oenocytes, and that all three genes are required for hypoxia tolerance. We have therefore revised the figure title to better reflect the findings presented in this version.

      8) Supplementary Figures 1 A and B lack statistical analysis.

      We have now included the statistical analyses in the revised manuscript figures.

      Reviewer 2


      __(Evidence, reproducibility and clarity (Required)): __This study by Ding and colleagues identifies a novel role for the cytokine Unpaired-3 (upd3) and the JAK/STAT signaling pathway in coordinating a whole-body response to systemic hypoxia in Drosophila. The authors describe how low-oxygen conditions rapidly induce upd3 expression in both larvae and adults. Interestingly, this pathway's importance is sex-specific, as female flies require upd3 for survival in hypoxia, while males do not.

      Intriguingly, the authors identify the intestine as a crucial source of the hypoxia-induced upd3. This gut-derived upd3 then signals to the fat body and oenocytes, promoting the expression of nitric oxide synthase, which is essential for hypoxia tolerance. Furthermore, the study reveals an unexpected role for the transcription factor HIF-1α/sima as a molecular brake. Instead of simply promoting the hypoxia response, sima prevents the overproduction of upd3, demonstrating that a precise dosage of this cytokine is necessary for survival. The findings define a novel gut-to-fat/oenocyte signaling axis that coordinates systemic hypoxia adaptation and highlights the fly as an ideal system for studying interorgan communication during bouts of hypoxia. Overall, I find this manuscript an important step forward in understanding the link between hypoxia signaling and inflammation.

      __(Significance (Required)): __This study is of high significance, as it not only demonstrates a clear role for cytokine signaling in the Drosophila hypoxia response, but also demonstrates that this response requires interorgan communication between adipose tissue and the intestine. Moreover, the study reveals a clear role for Hif1alpha in modulating upd3 expression, suggesting that this highly conserved transcription factor plays a key role in fine-tuning the inflammatory response.

      I think these findings are of broad interest and are potentially relevant to two aspects of public health. First, I believe the findings should be of particular interest to anyone studying hypoxic injuries, such as stroke and ischemia-reperfusion. Secondly, the observations could be relevant to a previous study that revealed an important role for hypoxia signaling in the mosquito larval intestine. Thus, this study could be important for revealing new mechanisms for inhibiting mosquito development, which would be of broad public health interest.

      Finally, I would highlight how this study raises a number of important questions. Why are there sex-specific differences for upd3 in the hypoxia response? What is the signal from the fat body to the intestine? How does sima modulate upd3 signaling? Thus, I think this manuscript represents a foundation study that will be the motivation for numerous high-impact papers in the future.

      We thank Reviewer 2 for their careful reading and comments on the manuscript. We are pleased that they found this to be a study "of high significance" that will be important for our understanding of hypoxia and health. We have addressed each of their concerns in the revised manuscript as outlined below.

      __Major Concerns and Suggestions: __ I have no real concerns about the manuscript as written - the experiments are well designed and controlled, and the results, as presented, support the major conclusions. While there are clearly open questions, including what is the basis of the sex-specific effects, how does sima modulate upd3 expression, and what is the signal communicating fat body sima activity with intestinal upd3 expression, these open questions do NOT diminish the importance of the study.

      My only major concern is that the current draft lacks a discussion of previous studies in the mosquito Aedes aegypti, where hypoxia signaling plays a key role in larval development (https://doi.org/10.1073/pnas.1719063115). This body of literature should be incorporated into the discussion, as it hints at a conserved molecular mechanism.

      We thank the reviewer for pointing us to this important study. Valzania et al. demonstrate that gut hypoxia acts as a systemic signal in Aedes aegypti larvae, activating HIF to coordinate fat body metabolism and whole-body growth. We agree this is relevant context for our findings, as both studies support the idea that the gut can function as a hypoxia sensor that controls whole-body physiology through effects on the fat body. We have incorporated this into our Discussion (lines 488-492).

      Minor comments:

      Please include a list of fly stocks used in the methods with complete genotypes. Whenever possible, include the RRID number for the stock - these can be found on the BDSC page for the stock.

      We have now added the list of fly stocks as well as a supplemental table with full genotypes.

      Line 477-479 - provide citations that sima regulates glycolysis in the fly.

      We have now added these citations.

      Lines 501-505 - please state if gasses were premixed or mixed in lab. Also, were flies contained in standard food vials during the exposure?

      We have now provided more detail on these points – the gases were premixed and flies were on standard food vials during the exposure.

      Lines 507-513 - how long after the hypoxia exposure were the flies assayed?

      We have now provided more detail on this point in the Methods (lines 592-596) – the flies were assessed 24 h after hypoxia exposure.

      In figures that display qRT-PCR data, please note that data were normalized to reference genes listed in Table S2.

      We have now added this methodological point.

      Please reference Flybase in either the acknowledgements or methods and include citations to the latest Flybase papers published in Genetics.

      We have now acknowledged Flybase and referenced the relevant papers.

      Genetics nomenclature is inconsistent throughout the study, a few examples included:

      Figure legend 1 - italicize gene names

      Figure 2 legend - italicize upd3-null

      Line 259 - Capitalize gal4

      Figure 4 legend - NOS is written in all capital, but in line 270, written as Nos. Please be consistent.

      Line 297 - gal4 is lower case, in contrast with elsewhere.

      We have now made these corrections.

      Additional suggestions:

      While not required for publication, it would be interesting to examine intestinal upd3 expression when sima is inappropriately stabilized in the fat body of animals under normoxic conditions. This could be achieved by driving a fatiga-RNAi construct within the fat body.

      We did carry out this experiment but didn’t see any effect of fat body fatiga RNAi on gut upd3 levels.

      Reviewer 3


      __(Evidence, reproducibility and clarity (Required)): __Summary: While local cellular and organ adaptations to hypoxia are well-documented, organism-wide responses to systemic hypoxia are still not well understood. In this paper, the writers were interested in investigating how organisms adapt to systemic hypoxia. From their investigations, they were able to show that gut-derived upd3 is crucial to animals' tolerance to hypoxia. They also show that the master hypoxia regulator Sima is required to keep the upd3 level in check to avoid the deleterious effect of excess upd3. They also showed that the fat body Sima is important in the regulation of gut-upd3 level, showing an inter-organ communication network in the adaptation to systemic hypoxia. One of their findings shows sex dimorphism in hypoxia tolerance; however, they did not show the mechanism behind this. I think the major weakness is not knowing how the animals actually fail to survive. What causes reduced survival should be explored. Generally, the studies show how animals adapt to systemic hypoxia; this knowledge is important in systemic hypoxia pathology.

      __(Significance (Required)): __This paper explores how the organism copes with hypoxia and how Upd from the gut plays a role in mediating this response in the fat body and the oenocytes.

      We thank Reviewer 3 for their careful reading and comments on the manuscript. We have addressed each of their concerns in the revised manuscript as outlined below.

      __Major comment: __

      Figure 1: The authors clearly showed that Upd3 level was up in the hypoxia condition and is important for animal tolerance to hypoxia. Apart from Upd3, are there other members of the unpaired family increasing and involved in hypoxia tolerance?

      We thank the reviewer for this question. We examined expression of all three unpaired family members and found that both upd2 and upd3 are induced by hypoxia, while upd1 is not. We also have preliminary evidence that upd2 mutants show reduced hypoxia survival, and that this effect is not additive with loss of upd3. While these early results are intriguing, this paper is focused on defining the role of upd3 in hypoxia tolerance, and we see the exploration of upd2, both alone and in combination with upd3, across different aspects of hypoxia biology as the basis of future investigations.

      Notably, co-induction of upd2 and upd3 by the same stress is a recurring theme in Drosophila biology, yet their respective contributions to organismal physiology are complex - sometimes overlapping, sometimes distinct - and in many studies only one family member has been characterized in detail. Indeed, our current understanding of how upd2 and upd3 each contribute to responses to infection, high-fat diet, and other stresses has emerged from the collective findings of multiple independent studies rather than from any single paper addressing both cytokines simultaneously. For example, during infection both Upd2 and Upd3 are induced in the gut to promote stem cell-mediated repair, yet only Upd2 has been shown to additionally signal to the brain to control olfactory behavior. Similarly, on a high-fat diet both cytokines are upregulated, but with distinct effects on different aspects of organismal biology: enterocyte-derived Upd3 promotes intestinal stem cell divisions, hemocyte-derived Upd3 controls fat body lipid levels, and fat body-derived Upd2 alters nephrocyte function. We see the current study as a foundation for broader investigations into unpaired cytokine biology in hypoxia. Indeed, Reviewer 2 noted that this manuscript "represents a foundation study that will be the motivation for numerous high-impact papers in the future", and we anticipate that the effects of Upd2 and Upd3 in hypoxia will prove similarly pleiotropic, and that resolving their respective contributions to different aspects of organismal biology in low oxygen will require dedicated future investigation.

      Figure 2: From the method, female and male flies were subjected to different durations of hypoxia, 24-28 hours for females and 16-18 hours for males. What happens when subjecting different sexes to similar periods of hypoxia?

      We thank the reviewer for this question. Males and females show inherently different sensitivities to hypoxia, as they do for other environmental stresses such as starvation. To reliably detect genetic effects on hypoxia tolerance, it is important to use exposure conditions that produce partial lethality in controls (50-80% survival), ensuring experiments are conducted within the appropriate range of hypoxic sensitivity for each sex. Because males and females differ in their sensitivity, no single timepoint satisfies this criterion for both sexes. When males are exposed for the same duration used in female experiments (24-28h), all animals - controls and experimental genotypes alike - die, precluding any meaningful comparison. Conversely, exposing females to the shorter timepoint used for males (16-18h) produces no detectable lethality, making it equally uninformative. The sex-specific exposure durations we use are therefore an experimental design choice that allows us to assess hypoxia tolerance appropriately in each sex.
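
      The selection criterion described above (choose exposure durations at which control survival falls between 50% and 80%) can be sketched in a few lines of Python. This is an illustration only: the function name `informative_durations` and all survival fractions below are hypothetical placeholders, not data from the manuscript. They are chosen merely to mirror the qualitative pattern stated in the reply (no lethality in females at the male timepoint, complete lethality in males at the female timepoint).

```python
from typing import Dict, List


def informative_durations(control_survival: Dict[int, float],
                          low: float = 0.5,
                          high: float = 0.8) -> List[int]:
    """Return exposure durations (hours) at which control survival lies in [low, high]."""
    return sorted(h for h, s in control_survival.items() if low <= s <= high)


# HYPOTHETICAL control survival fractions per exposure duration (hours).
# These numbers are invented for illustration and are not from the study.
female_controls = {16: 1.0, 20: 0.9, 24: 0.75, 28: 0.55}
male_controls = {12: 0.95, 16: 0.7, 18: 0.5, 24: 0.0}

female_window = informative_durations(female_controls)
male_window = informative_durations(male_controls)

print("female window:", female_window)
print("male window:", male_window)
# An empty intersection means no single duration is informative for both sexes.
print("shared durations:", sorted(set(female_window) & set(male_window)))
```

      With placeholder values of this shape, the two windows do not overlap, which is the situation the reply describes as the reason for using sex-specific exposure durations.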

      Upon concluding that gut derived upd affects fat and oenocytes, it is a bit strange that the qPCR is done in the abdomen, which is presumably where the gut is. Should the gut be excluded in these assays?

      We thank the reviewer for raising this point. For abdominal qRT-PCR experiments examining fat body and oenocyte gene expression, we dissected and removed the gut and ovaries prior to RNA extraction, leaving an abdominal sample enriched in fat body and oenocytes. We have clarified this in the Methods and Results sections of the revised manuscript (lines 245-246 and 626-627).

      It is important to establish how the animals die under hypoxia.

      We thank the reviewer for raising this important question. Our results show that gut-derived Upd3 is required for hypoxia tolerance in part through its control of Nos, bnl, and Hipk expression in fat body and oenocytes, and that knockdown of each of these genes individually reduces hypoxia survival. However, precisely why animals die when upd3 or these downstream effectors are lost remains an open question, and we discuss much of what we outline below in the revised manuscript Discussion (lines 443-466).

      All three effectors are signaling molecules, and we speculate that they likely coordinate further downstream processes required for hypoxia tolerance, either within fat body and oenocytes or by acting on other tissues. In particular, both bnl, an FGF ligand, and nitric oxide, produced downstream of Nos, have established roles in tracheal development and remodeling, raising the possibility that Upd3-dependent regulation of tracheal responses to hypoxia contributes to survival. Nitric oxide can also regulate nitrosylation and has been shown to affect the unfolded protein response, a conserved pathway induced by hypoxia. bnl, in addition to its role in tracheal remodeling, has been shown to regulate metabolic changes in target tissues. Hipk is a kinase with likely many downstream targets and has been shown in flies to control metabolism and mitochondrial function. Together, these observations suggest that Upd3 engages a broad downstream signaling network, the full scope of which remains to be defined.

      We think this situation is analogous to other environmental stresses such as starvation, where survival requires the coordinated regulation of a spectrum of physiological processes across multiple tissues, and where even well-characterized regulators are known to engage many downstream targets and pathways. We see the current paper as establishing the gut-to-fat body Upd3 requirement for hypoxia tolerance, and we suggest this lays a foundation for future exploration of the full spectrum of Upd3 targets and investigation of how they coordinate adaptive responses to low oxygen.

      Figure 3-6: Controls for RNAi experiments - is there any reason for not using RNAi-specific control, such as mcherry-RNAi, lacZ-RNAi, etc, rather than a wildtype control in all the RNAi-mediated knockdowns? Please address this. Don't necessarily have to repeat all the experiments using RNAi-specific control, but repeating just a few to show that both wild-type and UAS-RNAi-specific controls show similar results would be important.

      We thank the reviewer for raising this point. To address potential non-specific effects of RNAi expression on hypoxia tolerance, we expressed control GFP RNAi or mCherry RNAi transgenes using the main Gal4 drivers employed in this study: mex-Gal4 (gut) and desat;r4-Gal4 (fat body and oenocytes), and found no effect on hypoxia survival compared to wild-type controls (Fig S2E and S4B). These results indicate that RNAi expression per se does not adversely affect hypoxia tolerance, and that the survival effects we observe reflect specific knockdown of the genes of interest.

      Although gut-derived upd3 contributes largely (40%) to hypoxia tolerance, what other tissues' upd3 is important for hypoxia tolerance?

      We thank the reviewer for this important question. We find that upd3 is induced in multiple tissues during hypoxia, including the head, thorax, and abdomen. However, when we knocked down upd3 using drivers targeting the major cell types in these tissues, including muscle, neurons, and fat body/oenocytes, we observed no significant effect on hypoxia survival, in contrast to the robust effect seen with gut-specific knockdown. These new data, included in the revised manuscript, suggest that gut-derived Upd3 is a primary contributor to hypoxia tolerance (Fig S3).

      That said, we do not conclude that the gut is the only relevant source. Other tissues we have not yet examined, including hemocytes, glia, and tracheal cells, may also contribute, and it is possible that Upd3 produced from multiple tissues acts redundantly, such that knockdown in any single tissue other than the gut is insufficient to cause a survival defect. By analogy with other stress contexts such as nutrient deprivation and infection, where upd cytokines are produced from multiple tissues and exert distinct effects on different aspects of physiology, we anticipate that Upd3 from tissues other than the gut may well contribute to hypoxia tolerance. However, fully defining these contributions will require detailed tissue-specific experiments that are beyond the scope of the current paper and will be the focus of future investigations. We have expanded on this point in the Discussion of the revised manuscript (lines 420-425).

      Can you use a hypoxia readout to experimentally show that the gut is the main sensor of hypoxia compared to other tissues? Looking at the data, the fat body could also be a major sensor of hypoxia. Therefore, investigating hypoxia readout in these and other tissues would further strengthen the direction of communication.

      We thank the reviewer for this suggestion; however, we wish to clarify that we are not claiming the gut is the main or primary sensor of hypoxia. All tissues are likely capable of sensing low oxygen and mounting cell-autonomous responses, and in some cases perhaps also non-autonomous signals to other tissues. Our findings specifically show that one consequence of gut hypoxia sensing is upregulation of Upd3, which then acts as an inter-organ signal to coordinate responses in target tissues such as the fat body and oenocytes. The fat body itself also senses hypoxia and mounts its own responses, as we and others have shown, including HIF-dependent regulation of gut Upd3 expression described in this paper. An analogous situation exists during nutrient starvation, where all cells autonomously sense and respond to nutrient deprivation, but on top of these cell-autonomous responses, specific tissues also mediate inter-organ signaling to coordinate whole-body physiological adaptations. We propose that hypoxia responses are organized similarly, and that the gut-to-fat body Upd3 signaling axis we describe here represents one such inter-organ communication pathway. We have clarified this point in the revised manuscript (lines 468-492).

      __Minor comment: __

      Should check the alignment of the confocal image in Figure 3b, especially the top panel.

      We have now fixed the images to better align them.

      Figure 6: "gut-specific sima knockdown (mex>sima-RNAi) did not significantly alter intestinal upd3 mRNA levels compared to controls (mex>+) under hypoxic conditions (Figure 6C)." This statement refers to Figure 6B, not Figure 6C.

      We have now corrected this.

      Since the fat body Sima non-autonomously controls the gut upd3 level, can you also show that this is functionally important by investigating the animal's survival or other functional studies?

      We thank the reviewer for this suggestion. Ideally, we would manipulate sima and upd3 independently and in parallel, knocking down sima specifically in the fat body while simultaneously reducing upd3 in the gut, to directly test the functional importance of this inter-organ axis for survival. In principle this could be achieved using orthogonal binary expression systems such as the GAL4/UAS and QF/QUAS systems in combination, but this would require the development of new genetic tools. An additional challenge is that, based on our results, such experiments would require fine-tuned reduction of gut upd3, sufficient to suppress the elevated levels caused by fat body sima knockdown, but not so low as to itself compromise survival, as we have shown that loss of upd3 is detrimental. For these reasons, while we agree these would be, in principle, interesting experiments, they would be technically challenging to carry out.

      Strangely, all the statistically significant data/results from both supplementary and main figures had a one-star significance even in graphs with very obvious differences and less sample variation.

      We thank the reviewer for this observation. In all figures, a single asterisk is used to denote statistical significance at p < 0.05, regardless of whether the actual p value is substantially lower. This is a presentation convention we adopted consistently across all figures rather than a reflection of the strength of the underlying differences.
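
      To make the difference between the two annotation styles concrete, the short Python sketch below contrasts the single-asterisk convention described above with a tiered scheme. The function names are hypothetical, and the tiered cut-offs (0.05, 0.01, 0.001) are an assumption based on common practice, not something stated in the manuscript or the review.

```python
def single_star(p: float, alpha: float = 0.05) -> str:
    """Single-asterisk convention: any p below alpha receives one star."""
    return "*" if p < alpha else "ns"


def tiered_stars(p: float) -> str:
    """Tiered convention (ASSUMED cut-offs: 0.05, 0.01, 0.001)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Illustrative p values only; none of these come from the study.
for p in (0.2, 0.04, 0.004, 0.0004):
    print(p, single_star(p), tiered_stars(p))
```

      Under the single-asterisk convention a comparison with a very small p value and one just under 0.05 carry the same label, which is why the number of stars in the figures does not indicate the size of the difference.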

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary: While local cellular and organ adaptations to hypoxia are well-documented, organism-wide responses to systemic hypoxia are still not well understood. In this paper, the writers were interested in investigating how organisms adapt to systemic hypoxia. From their investigations, they were able to show that gut-derived upd3 is crucial to animals' tolerance to hypoxia. They also show that the master hypoxia regulator Sima is required to keep the upd3 level in check to avoid the deleterious effect of excess upd3. They also showed that the fat body Sima is important in the regulation of gut-upd3 level, showing an inter-organ communication network in the adaptation to systemic hypoxia. One of their findings shows sex dimorphism in hypoxia tolerance; however, they did not show the mechanism behind this. I think the major weakness is not knowing how the animals actually fail to survive. What causes reduced survival should be explored. Generally, the studies show how animals adapt to systemic hypoxia; this knowledge is important in systemic hypoxia pathology.

      Major comment:

      • Figure 1: The authors clearly showed that Upd3 level was up in the hypoxia condition and is important for animal tolerance to hypoxia. Apart from Upd3, are there other members of the unpaired family increasing and involved in hypoxia tolerance?
      • Figure 2: From the method, female and male flies were subjected to different durations of hypoxia, 24-28 hours for females and 16-18 hours for males. What happens when subjecting different sexes to similar periods of hypoxia?
      • Upon concluding that gut derived upd affects fat and oenocytes, it is a bit strange that the qPCR is done in the abdomen, which is presumably where the gut is. Should the gut be excluded in these assays?
      • It is important to establish how the animals die under hypoxia.
      • Figure 3-6: Controls for RNAi experiments - is there any reason for not using RNAi-specific control, such as mcherry-RNAi, lacZ-RNAi, etc, rather than a wildtype control in all the RNAi-mediated knockdowns? Please address this. Don't necessarily have to repeat all the experiments using RNAi-specific control, but repeating just a few to show that both wild-type and UAS-RNAi-specific controls show similar results would be important.
      • Although gut-derived upd3 contributes largely (40%) to hypoxia tolerance, what other tissues' upd3 is important for hypoxia tolerance?
      • Can you use a hypoxia readout to experimentally show that the gut is the main sensor of hypoxia compared to other tissues? Looking at the data, the fat body could also be a major sensor of hypoxia. Therefore, investigating hypoxia readout in these and other tissues would further strengthen the direction of communication.

      Minor comment:

      • Should check the alignment of the confocal image in Figure 3b, especially the top panel.
      • Figure 6: "gut-specific sima knockdown (mex>sima-RNAi) did not significantly alter intestinal upd3 mRNA levels compared to controls (mex>+) under hypoxic conditions (Figure 6C)." This statement refers to Figure 6B, not Figure 6C.
      • Since the fat body Sima non-autonomously controls the gut upd3 level, can you also show that this is functionally important by investigating the animal's survival or other functional studies?
      • Strangely, all the statistically significant data/results from both supplementary and main figures had a one-star significance even in graphs with very obvious differences and less sample variation.

      Significance

      This paper explores how the organism copes with hypoxia and how Upd from the gut plays a role in mediating this response in the fat body and the oenocytes.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study by Ding and colleagues identifies a novel role for the cytokine Unpaired-3 (upd3) and the JAK/STAT signaling pathway in coordinating a whole-body response to systemic hypoxia in Drosophila. The authors describe how low-oxygen conditions rapidly induce upd3 expression in both larvae and adults. Interestingly, this pathway's importance is sex-specific, as female flies require upd3 for survival in hypoxia, while males do not.

      Intriguingly, the authors identify the intestine as a crucial source of the hypoxia-induced upd3. This gut-derived upd3 then signals to the fat body and oenocytes, promoting the expression of nitric oxide synthase, which is essential for hypoxia tolerance. Furthermore, the study reveals an unexpected role for the transcription factor HIF-1α/sima as a molecular brake. Instead of simply promoting the hypoxia response, sima prevents the overproduction of upd3, demonstrating that a precise dosage of this cytokine is necessary for survival. The findings define a novel gut-to-fat/oenocyte signaling axis that coordinates systemic hypoxia adaptation and highlights the fly as an ideal system for studying interorgan communication during bouts of hypoxia. Overall, I find this manuscript an important step forward in understanding the link between hypoxia signaling and inflammation.

      Major Concerns and Suggestions:

      I have no real concerns about the manuscript as written - the experiments are well designed and controlled, and the results, as presented, support the major conclusions. While there are clearly open questions, including what is the basis of the sex-specific effects, how does sima modulate upd3 expression, and what is the signal communicating fat body sima activity with intestinal upd3 expression, these open questions do NOT diminish the importance of the study.

      My only major concern is that the current draft lacks a discussion of previous studies in the mosquito Aedes aegypti, where hypoxia signaling plays a key role in larval development (https://doi.org/10.1073/pnas.1719063115). This body of literature should be incorporated into the discussion, as it hints at a conserved molecular mechanism.

      Minor comments:

      Please include a list of fly stocks used in the methods with complete genotypes. Whenever possible, include the RRID number for the stock - these can be found on the BDSC page for the stock.

      Line 477-479 - provide citations that sima regulates glycolysis in the fly.

      Lines 501-505 - please state if gasses were premixed or mixed in lab. Also, were flies contained in standard food vials during the exposure?

      Lines 507-513 - how long after the hypoxia exposure were the flies assayed?

      In figures that display qRT-PCR data, please note that data were normalized to reference genes listed in Table S2.

      Please reference Flybase in either the acknowledgements or methods and include citations to the latest Flybase papers published in Genetics.

      Genetics nomenclature is inconsistent throughout the study, a few examples included:

      Figure legend 1 - italicize gene names

      Figure 2 legend - italicize upd3-null

      Line 259 - Capitalize gal4

      Figure 4 legend - NOS is written in all capital, but in line 270, written as Nos. Please be consistent.

      Line 297 - gal4 is lower case, in contrast with elsewhere.

      Additional suggestions:

      While not required for publication, it would be interesting to examine intestinal upd3 expression when sima is inappropriately stabilized in the fat body of animals under normoxic conditions. This could be achieved by driving a fatiga-RNAi construct within the fat body.

      Significance

      This study is of high significance, as it not only demonstrates a clear role for cytokine signaling in the Drosophila hypoxia response, but also demonstrates that this response requires interorgan communication between adipose tissue and the intestine. Moreover, the study reveals a clear role for Hif1alpha in modulating upd3 expression, suggesting that this highly conserved transcription factor plays a key role in fine tuning the inflammatory response.

      I think these findings are of broad interest and are potentially relevant to two aspects of public health. First, I believe the findings should be of particular interest to anyone studying hypoxic injuries, such as stroke and ischemia-reperfusion. Secondly, the observations could be relevant to a previous study that revealed an important role for hypoxia signaling in the mosquito larval intestine. Thus, this study could be important for revealing new mechanisms for inhibiting mosquito development, which would be of broad public health interest.

      Finally, I would highlight how this study raises a number of important questions. Why are there sex-specific differences for upd3 in the hypoxia response? What is the signal from the fat body to the intestine? How does sima modulate upd3 signaling? Thus, I think this manuscript represents a foundational study that will be the motivation for numerous high-impact papers in the future.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors describe a role of Unpaired 3 (Upd3) in tissue communication in responses to hypoxia in Drosophila adult flies. Upd3 mRNA is strongly upregulated in hypoxia, along with well-characterized JAK/STAT downstream target genes, in both adult fly males and females, as well as in larvae. Interestingly, adult females but not males require Upd3 for 15 to 24 h survival in hypoxia, as Upd3 mutant females but not males die in a much larger proportion in these conditions. Adult females display strong hypoxic upregulation of Upd3 in the gut, assessed by RT-PCR or through a Gal4 transcriptional reporter, mainly in epithelial enterocytes. Enterocyte-specific RNAi-mediated KD indicated that this enterocyte expression of Upd3 represents about 40% of Upd3 expression in the whole body. Enterocyte-specific KD of Upd3 in adult females significantly reduced survival in hypoxia, suggesting that this expression is critical for hypoxic adaptation. Tissue-specific analysis of the expression of the STAT target genes, SOCS36E, TotA and TotM revealed that stimulation of the JAK/STAT pathway in hypoxia is widespread, although more pronounced in abdominal tissues. Indeed, overexpression of Upd3 in enterocytes provokes upregulation of both target genes TotA and TotM. Consistent with this, RNAi-dependent inhibition of the JAK/STAT pathway in the fat body and oenocytes significantly reduced survival of female flies in hypoxia. Nitric oxide synthase (NOS) is strongly upregulated in adult female abdomens upon hypoxic exposure, and KD of NOS in fat body and oenocytes reduced hypoxic survival. Surprisingly, they found that ubiquitous KD of HIFa/Sima led to mitigation of Upd3 hypoxic induction and, more clearly, to JAK/STAT target gene induction. HIF KD flies displayed increased lethality in hypoxia, and this lethality was slightly mitigated in Upd3 heterozygous flies. 
The authors conclude that increased lethality of HIF-minus flies in hypoxia stems at least in part from excessive levels of Upd3. The authors then find that HIF/Sima-dependent inhibition of Upd3 expression is non-cell autonomous, since KD of Sima specifically in the gut does not affect expression of Upd3 in this organ. Instead, Sima KD in the fat body led to a significant increase of Upd3 expression in the gut, suggesting that a Sima-dependent signal communicates between these two organs, leading to restriction of Upd3 intestinal expression. ROS does not seem to be the signal that communicates between the fat body and the gut, as expression of catalase in the fat body did not affect expression of Upd3 in the gut.

      Major concerns

      1) Figure 1 lacks statistical analysis. It is important to determine if the apparent differences in gene expression are statistically significant.

      2) Is NOS expression in fat body/oenocytes JAK/STAT-dependent? Block the pathway in hypoxia specifically in these cells and check.

      3) The authors relate the HIF-dependent limitation of Upd3 induction in hypoxia to regulation of cytokine-dependent immune responses in mammals; specifically they propose a parallel with a cytokine storm. This relationship is unclear to this reviewer, as in the Drosophila response Upd3 fulfils a signalling function (rather than immunological). I suggest they consider modifying this assumption.

      4) Mitigation of lethality of HIF KD flies in Upd3 heterozygotes is very modest. Thus, the conclusion that one of the mechanisms by which HIF mediates adaptation to hypoxia is through inhibition of Upd3 expression is not sufficiently supported by the data. It seems like an over-interpretation of the results.

      5) HIF expression is well-known to reduce ROS levels in hypoxia by controlling mitochondrial activity through a wide array of mechanisms. Thus, this reviewer feels that the experiments utilized to rule out a role of ROS in fat body-to-gut communication are insufficient. Catalase reduces hydrogen peroxide levels, but not necessarily other reactive oxygen species. The authors might try to express other ROS scavengers such as superoxide dismutase. In addition, expression of scavengers should be carried out both at the fat body and gut.

      Minor concerns

      6) The authors state that hypoxic upregulation of Upd3 in the gut occurs mostly in "large epithelial enterocytes". In Figure 3B, it is evident that GFP does not express in all cells; please utilize cell-type specific markers to identify which cells do express the cytokine.

      7) The title of Fig 4 caption reads "Gut-derived upd3 controls adipose expression of hypoxia regulators." Only one hypoxia regulator has been analysed: Nitric Oxide Synthase. Please change the title to "Gut-derived upd3 controls adipose expression of Nitric Oxide Synthase."

      8) Supplementary Figures 1 A and B lack statistical analysis.

      Significance

      This is an important study, because most previous studies have focused on cell-autonomous responses to hypoxia, but much less is known about systemic responses to low oxygen conditions, particularly in relation to inter-organ communication during these responses. This work defines the cytokine unpaired 3, homolog of human interleukin 6, as a major regulator of systemic responses to hypoxia. Future studies will determine if interleukin 6 plays similar roles in mammals. This work might be of interest for a broad audience interested in responses to hypoxia, as well as general physiology.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Goldman et al., Response to Review April 16, 2026

      Referee 1 (R1): The revised manuscript by Goldman et al has changed significantly since its first submission. Upon further analysis of a putatively truncated mutant, the authors have removed the interpretation that G2019S LRRK2 alters iron handling through a kinase-independent mechanism. They should be commended for the transparency and rigor applied in this revision and the resulting data are conceptually easier to digest.

      Heidi McBride (HM): We thank the reviewer for their understanding. We would point out that our phenotypes remain resistant to the commonly used LRRK2 kinase inhibitor MLi2, yet are sensitive to the type II kinase inhibitors which we believe is novel.

      R1: However, now that the proposed pathway involves over-active LRRK2 kinase activity and a putative role for Rab8a phosphorylation, the manuscript is rather descriptive in nature. The mechanisms underlying dramatic changes in NCOA4, FTL, and FTH protein levels upon iron exposure are not explored. In addition, the reliance on these effects, or the precise involvement of LRRK2 substrates is not established.

      HM: We agree there remains much to be uncovered mechanistically. However, the fact that phosphorylated Rab8 is active at the plasma membrane upon iron overload is novel, suggesting broader roles beyond the Golgi/lysosomal localization of LRRK2 substrates. We would also argue that the majority of studies on LRRK2 follow the phosphorylation of Rab substrates by western blot, but the functional roles of these Rabs, their GTPase cycles and effectors, in membrane traffic are not at all clear within the field. It is difficult for us to determine the mechanisms by which NCOA4, FTL and FTH are altered, but we did expand our work in the revision to include full proteomics analysis of control and iron treated cells with wild type or LRRK2G2019S to get a better picture of the changes that occur. This included major changes in lysosomal hydrolases, induction of oxidative stress pathways linked to ferroptotic cell death, and cytoskeletal changes that may link to the plasma membrane blebbing we observe. Our data indicate that the hyperactivity of LRRK2 during iron overload is having a significant impact on the capacity of the lysosome to handle the iron, which we observe also at the morphological level in the loss of microautophagy of NCOA4. Taken together, the impact of our work lies in setting a new framework by which to examine LRRK2 function in iron homeostasis.

      R1: The limitation of a single LRRK2 mutant (in the context of a more canonical increased kinase activity mechanism) is also more evident. G2019S is a relatively weak Rab kinase when compared to Y1699C or R1441C - is there a possibility for G2019S In addition, the impact of this iron defect is limited to oxidative cell injury and implications for ferroptosis in macrophages (not neurons, where these outcomes might be of greater relevance) whereas how these phenotypes affect the immune-like functions of the chosen cell type (e.g. cytokine release, chemotaxis, phagocytosis) is not addressed.

      HM: We agree that the examination of additional LRRK2 mutations in our model would be important, but the G2019S is the most common patient mutation and has a very strong impact on the cellular response to iron. As for the impact on cytokine release, chemotaxis and phagocytosis, this was not requested in the first round of review and we believe it is beyond the scope of our current study.

      R1: While the proteomics and other analyses provide strong rigor for the data that are presented, without more mechanistic interrogation or consideration of cell type specific behaviors, the manuscript does not advance very far beyond confirming that iron related proteins are altered in parallel with iron itself and are highly dependent on iron loading to evoke phenotypes.

      HM: We thank the reviewer for recognizing the rigour of our revisions. We did interrogate the function of the oxidative stress signatures and performed experiments to demonstrate the oxidation of lipids in the lysosome along with assays for ferroptotic cell death. We highlighted key signatures that were changed, including the selective loss of lysosomal enzymes, and changes in cellular cytoskeletal elements. We agree that there is more to do, but our data has provided many important new observations on what exactly is changed in LRRK2G2019S cells in response to iron, from lysosomal morphology changes, to critical analysis of NCOA4 and ferritin dynamics, the novel observation of phosphorylated Rab8 at the cell surface, and a very unexpected insensitivity to MLi2 yet selective inhibition by new type II LRRK2 kinase inhibitors. With this we believe our work does provide a great deal of novelty and impact on LRRK2 functions in iron homeostasis.

      A final point on the concept that the phenotypes are only induced upon iron overload. We would point out that we observed changes in ferritin and NCOA4 levels in LRRK2G2019S cells even without iron overload, so there is indeed an impact on lysosomal and iron homeostasis in steady state. It is also perhaps important to note that LRRK2 is established to be activated in response to cellular/lysosomal stressors, so it is not unusual to observe the impact of the mutations only in certain conditions. This would be consistent with the mutations being linked to an age-related disease with relatively low penetrance among heterozygous carriers. We would argue that it is important to understand exactly which types of cellular stressors may lead to pathology, and that iron dysregulation has been a long-standing candidate stressor in PD.

      **************************************************************************

      Referee 2 (R2): The authors have done all I have requested before.

      HM: We thank the reviewer for acknowledging our efforts in addressing each of the reviewers' concerns.

      R2: I am concerned about the substantial changes to the original manuscript including the realisation that some clones were not the knock-outs expected. However, removing these, is the right thing.

      HM: We agree that the removal of the truncated clone, along with our identification of a class of kinase inhibitors that reversed the p-Rab8 upon iron overload led to changes in a key conclusion of the first submission; namely that the effects were kinase-independent. In our opinion, this does not change the fundamental observations we have made on the impact of LRRK2 on iron homeostasis, many of which are made here for the first time.

      R2: I still have major concerns that the phenotype could not be reproduced in BMDMs. The RAW cells are a good tool, but far from physiological. The authors' arguments that others have also seen a phenotype in iron homeostasis is however reassuring.

      HM: We tried to make it very clear throughout our manuscript that there is a cell specificity to the iron phenotypes observed in LRRK2G2019S backgrounds. As stated in our response to review, we did observe changes in NCOA4 within BMDM but not in the levels of FTL and FTH. We will continue to explore the impact of LRRK2 mutations in primary lines in future work. However, this should not negate the phenotypes we observe in the RAW cells that are consistent with other observations from ssRNAseq, and the work of others looking at microglia upon LPS treatment in vivo, as an example.

      R2: I leave it to the editor if they feel that a somewhat artificial cell line is the right model.

      HM: The RAW macrophages are not “artificial”. They are widely used in Parkinson’s research, cancer and immunity, and the ATCC site describes them as follows: “RAW 264.7 are an adherent cell line isolated from a mouse tumor that was induced by Abelson murine leukemia virus. This cell line, with macrophage differentiation, can be used in oxidative stress, inflammatory, and antibacterial activity studies.” We understand that the standard of cell culture is rapidly moving entirely to human iPSC derived lines and organoids. We argue that there is still a role for fundamental cell biological studies to be done in established cultured cell lines where we can more rapidly generate critical hypotheses to be tested in other model systems in future studies. There has been no published work linking LRRK2 to ferritinophagy and NCOA4 microautophagy before this, and we humbly submit that we have made some critical new observations in our study that will direct future work in a highly targeted manner.

      *********************************************************************

      Referee 3 (R3): We thank the authors for addressing our concerns. As a result, the work looks very different, and the main message has changed.

      HM: We thank the reviewer for acknowledging our efforts in the revision. We agree that one point of our study, that the effects of the LRRK2G2019S were kinase independent, has changed. Our initial observation that the effects were resistant to MLi2 remains; it was the use of a second class of kinase inhibitors that now confirmed a dependence on kinase activity. This is a very important point to be made about the different activities of these two classes of kinase inhibitors. In addition, the rest of our work stands and was extended in the revision with many new experiments highlighting the impact of LRRK2G2019S on iron homeostasis. Therefore the fact that LRRK2 linked iron phenotypes are not blocked by the common kinase inhibitor remains unchanged.

      R3: The main issue with this new manuscript relative to the previous one is that a key part of the novelty has been lost. In the original draft the authors claimed to have found a LRRK2 kinase-independent phosphorylation of Rab8. This was a major new finding potentially hinting that LRRK2 is not the only kinase for Rab8 which would have had major ramifications for the field. All data from this truncation mutant has been withdrawn from the manuscript and the message is now significantly different. It is unfortunate that the armadillo domain data from the original manuscript has been withdrawn due to the results of the proteomics experiment revealing 7 peptides C-terminal to their stop codon. I suggest that, independently of the fate of the current manuscript, the authors update the biorxiv preprint as in the current manuscript this is not addressed (as this part has been removed) and the community needs to be aware of the changes that are potentially misleading.

      HM: While the new data show that p-Rab8 can be inhibited by type II kinase inhibitors upon iron overload, we present a great deal of novelty showing that this activated Rab8 is, unexpectedly, at the plasma membrane upon iron loading in G2019S cells. We also performed experiments examining ferritinophagy and NCOA4 microautophagy in these conditions, which we argue are the first instance of these pathways being impacted by LRRK2. Our full proteomics datasets in Figure 4 provide an unbiased view of all changes driven by LRRK2G2019S that led us to additional functional analysis of lipid oxidation and ferroptosis pathways being impacted. Again, we agree that our phenotypes are no longer kinase independent; however, this should not negate the many important observations we have made in our study.

      We apologize for not updating the manuscript on bioRxiv. I had linked our response to reviews to the bioRxiv during submission to EMBO J and thought the manuscript would automatically be updated. So it was an oversight on my part, which has now been remedied, including all links to raw data files and protocols linked to Zenodo.

      R3: Despite the changes and modifications, the main concern remains. The authors are claiming in the title to have discovered that the LRRK2 G2019S mutation leads to more ferroptotic cell death. However, the only cell death data that is shown is a Sytox experiment and a C11-Bodipy stain. Although this is informative about whether cell death is occurring, the authors need more evidence that this is being caused by ferroptosis. For example, they could have probed for GPX4 levels in their lysates as the levels would decrease in ferroptosis. Including inducers and inhibitors of ferroptosis (erastin and ferrostatin-1) would confirm if the FAS-induced LRRK2 G2019S cell death phenotype is mediated by ferroptosis.

      HM: We appreciate that the impact of LRRK2G2019S on lipid oxidation and cell death was new to the revision and could have been explored further. However, as we were under a time constraint for the revisions we didn’t employ erastin and ferrostatin-1 in our study. The decision to look at ferroptosis came from the new proteomics analysis we did on wild type and G2019S cells in the absence or presence of iron overload. This led us to uncover a major oxidation signature consistent with a sensitivity to ferroptosis. Therefore we used two common assays to test this, which were included in our new Figure 4. Ferroptosis is driven by oxidized lipids, which we observe in lysosomes using the C-11 Bodipy stains, and a rupture of the plasma membrane that can be monitored by the uptake of SYTOX green. SYTOX green uptake also occurs in pyroptotic and necrotic cell death, but the combination of oxidized lipids with SYTOX green is very commonly used to conclude the death is ferroptotic. In our case, we are also adding very high levels of iron to the cells (instead of using erastin), which would be further suggestive of ferroptosis.

      Our proteomics analysis does not show any changes in GPX4, but again, we are in conditions of high iron overload, which is very different from experiments with erastin that inhibit this channel to drive ferroptosis. The accumulation of iron in the late endosome, where we see the accumulation of oxidized C-11 Bodipy, is consistent with several recent studies all demonstrating critical roles for lysosomal iron in the regulation of ferroptosis (see new review in Trends in Cell Biology, March 30 2026 DOI: 10.1016/j.tcb.2026.03.007). The bioRxiv paper looking at LRRK2G2019S and other mutants also showed increased oxidized lipids and suggested a susceptibility to ferroptosis (the paper R3 wanted us to cite during first review). Therefore, while we agree that there will be much more to learn about how exactly LRRK2 may govern the lysosomal/cellular oxidation pathways, we consider our use of proteomics, lipid probes and cell death assays to be sufficient to conclude that LRRK2G2019S renders our macrophages sensitive to ferroptotic cell death upon iron overload.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this paper, the authors report an interesting phenotype of the LRRK2 G2019S mutation on iron homeostasis in RAW264.7 macrophages. The phenotype is well characterised through proteomic and western blot approaches investigating transferrin and ferritin trafficking. The study is well conducted and data of high quality. The authors also appear to have discovered a cellular context where Rab8 is phosphorylated independently of LRRK2. This is a major finding which can potentially have an important impact in the LRRK2 field. What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells. Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      Significance

      Major comments

      In Figure 1:

      • There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?
      • The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.
      • The transferrin phenotype validated through proteomics and western blot is solid.
      • Quantification in figure 1F-G is problematic; it is not clear what they mean by "diffuse and lysosomal". Puncta are either colocalising with lysosomes or not colocalising. This needs to be clarified and re-analysed.
      • Text in the first results part called "LRRK2G2019S RAW macrophages have altered iron homeostasis" is very long. It could be divided into more sections to improve readability.

      In Figure 2:

      • If the effect is armadillo-dependent, where is LRRK2 G2019S implicated, since there is no kinase domain in these cells?
      • The authors do not show any controls (PCR, sequencing) confirming knockout or truncation.
      • The data is interesting and the image quality with the insets is very high.

      In Figure 3:

      • Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear.
      • The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains.
      • In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown.

      In Figure 4:

      • In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.
      • The vinculin bands in figure 4A are misaligned with the rest of the bands.
      • The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed.
      • The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration. Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full and a report showing kinase independent cell death in macrophages https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript the authors describe an interesting connection between the Parkinson's kinase LRRK2 and iron trafficking in RAW macrophages. Expression of the LRRK2 G2019S mutation affects the abundance of ferritin heavy and light chains and therefore the uptake and storage of iron. Interestingly, the loss of the kinase domain still had a strong phenotype, suggesting that this is independent of the kinase function.

      The paper is well written and excellently cited. The data is convincing and of good quality.

      I have only one request and else very minor comments:

      Major: Please confirm that the observed phenotype is conserved within bone marrow-derived macrophages of LRRK2 G2019S mice. These mice are widely available within the community and frozen bone marrow could be sent to the labs.

      The main reason for this experiment is that CRISPR macrophage cell lines do sometimes acquire weird phenotypes (at least in our lab they sometimes do!) and it would strengthen the validity of the observations.

      Minor comments:

      Supplementary Fig 1: I don't think one should normalize all controls to 1 and then do a statistical test as obviously the standard deviation of control is 0. I would normalize to the average of the control, which will provide an error for the control.
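      The normalization point above can be illustrated with a minimal sketch. The replicate values below are hypothetical and are not taken from the manuscript; the function names are likewise illustrative only. Dividing each replicate by its own control forces every control value to exactly 1 (standard deviation 0), whereas dividing by the mean of the controls keeps the control mean at 1 while preserving its spread.

```python
from statistics import mean, stdev

def normalize_per_replicate(control, treated):
    """Divide each replicate by its own control value.

    Every control becomes exactly 1, so the control group has no variance.
    """
    norm_control = [c / c for c in control]
    norm_treated = [t / c for c, t in zip(control, treated)]
    return norm_control, norm_treated

def normalize_to_control_mean(control, treated):
    """Divide all values by the mean of the control group.

    The control mean becomes 1 but its replicate-to-replicate spread is kept.
    """
    m = mean(control)
    return [c / m for c in control], [t / m for t in treated]

# Hypothetical band intensities from three independent replicates.
control = [80.0, 100.0, 120.0]
treated = [150.0, 160.0, 210.0]

ctrl_a, trt_a = normalize_per_replicate(control, treated)
ctrl_b, trt_b = normalize_to_control_mean(control, treated)

print("per-replicate control SD:", stdev(ctrl_a))
print("mean-normalized control SD:", stdev(ctrl_b))
```

      With the second approach the control group retains an error estimate, so a two-sample test compares two distributions rather than one distribution against a constant.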

      The raw data needs to be submitted to PRIDE or similar. This has not happened yet.

      Some of the western blots could be improved. If these are the best shown, I am a little concerned about the reproducibility. How often have they been done?

      Significance

      Considering the importance of LRRK2 biology in Parkinson's and the new biology shown, this paper will be of great interest to the community and wider research fields.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Goldman et al describe some novel findings with respect to LRRK2 and iron handling in a series of RAW macrophage cell lines. This cell background is chosen for its recognized high levels of endogenous LRRK2 protein expression, its somewhat broad use in the field, and the investigators add its relevance due to phagocytosis of red blood cells, thus requiring robust iron metabolic processes. Proteomic analyses of WT and G2019S RAW cells revealed multiple iron-related proteins affected by LRRK2 mutation. A deeper candidate-based analysis revealed complex changes in ferritin heavy and light chain and changes in ferric and ferrous iron. Notably, reliable changes in the levels and/or solubility of NCOA4 result from this pathogenic LRRK2 mutation. Unexpectedly, however, these changes were not sensitive to LRRK2 kinase inhibitor treatment. The investigators suggest a dominant effect rather than loss-of-function as subsequent experiments revealed that these effects could be replicated with a LRRK2 variant lacking the kinase domain (LRRK2-ARM) and were not replicated by LRRK2 KO. The data are internally consistent throughout and could certainly shed new important light onto unique and unexpected effects of this LRRK2 mutation.

      There are two major concerns with the data in their present form. In brief, first, the G2019S cells express much less LRRK2 and more Rab8 than the WT cells and this severely affects interpretability. Second, the investigators used CRISPR to truncate the endogenous LRRK2 locus to produce a hypothetical truncated LRRK2-ARM polypeptide. This appears to have robust effects on NCOA4, in particular, which drives the overall interpretation of the data. However, the expression of this novel LRRK2 species is not confirmed nor compared to WT or G2019S in these cells (although admittedly the investigators did seek to address this with subsequent KO in the ARM cells). It would be premature to account for the changes reported without evidence of protein expression. This latter issue may be more easily addressed and could provide very strong support for a novel function/finding. See more detailed comments below, most seeking clarifications beyond the above.

      • Need to make clear in the results whether the G2019S CRISPR mutant is heterozygous or homozygous (presumably homozygous, same for ARM)
      • The text of the results implies that MLi2 was used in both WT and G2019S RAW cells, but it's only shown for G2019S. Given the premise for the use of RAW cells, it's important to show that there is basal LRRK2 kinase activity in WT cells to go along with its high protein expression. This is particularly important as the G2019S blot suggests minor LRRK2-independent phosphorylation of Rab8a (and other detected pRabs). One would imagine that pRab8 levels in both WT and G2019S would reduce to the same base line or ratio of total Rab in the presence of MLi2, but WT untreated is similar to G2019S with MLi2. This suggests no basal LRRK2 activity in the RAW cells, but I don't think that is the case.
      • Also, in terms of these cells, the levels of LRRK2 are surprisingly unmatched (Fig 1A, 1D, 1H, S1D, etc.) as are total levels of Rab8 (but in opposite directions) between the WT and G2019S. This is not mentioned in the Results text and is clearly reproducible and significant. Why do the investigators think this is? If Rab8 plays a role in iron, how do these differences affect the interpretation of the G2019S cells (especially given that MLi2 does not rescue)? Are other LRRK2-related Rabs affected at the protein (not phosphorylation) level? Could reduced levels of LRRK2 or increased Rab8 alone or together account for some of these differences? Substantial further characterization is required as this seriously affects the interpretability of the data. Since pRab8 is not normalized to total Rab8, this G2019S model may not reflect a total increase in LRRK2 kinase activity, and could in fact have both less LRRK2 protein and less cellular kinase activity than WT (in this case).
      • Presumably, the blots in 1H are whole cell lysates and account for the pooled soluble and insoluble NCOA4 (increased in G2019S), as there is no difference in soluble NCOA4 (Fig 2H). I suspect the prior difference is nicely reflected in the insoluble fraction (Fig 2H). This should be better explained in the Results text. This is a very interesting finding and I wonder what the investigators believe is driving this phenotype? Is the NCOA4 partitioning into a detergent-inaccessible compartment? Does this replicate with other detergents, those perhaps better at solubilizing lipid rafts? Is this a phenotype reversible with MLi2? Very interesting data.
      • Figure 2 describes the increased NCOA4-positive iron structures after iron load, but does not emphasize that the G2019S cells begin preloaded with more NCOA4. How do the investigators account for differential NCOA4 in this interpretation? Is this simply a reflection of more NCOA4 available in G2019S cells? This seems reasonable.
      • These are very long exposures to iron, some as high as 48 hr which will then take into account novel transcriptomic and protein changes. Did the investigators evaluate cell death? Iron uptake would be trackable much quicker.
      • The legend for 2F is awkward (BSADQRED)
      • Why are WT cells not included in Fig 2G?
      • The biochemical characterization of NCOA4 in the LRRK2-arm cells is a great experiment and strength of the paper. The field would benefit by a bit further interrogation, other detergents, etc.
      • Have the investigators looked for aberrant Rab trafficking to lysosomes in the LRRK2-arm cells? Is pRab8 mislocalized compared to WT? Other pRabs?
      • The expression levels and therefore stability of the ARM fragment are not shown. This is necessary for interpretation. While very intriguing, the data in Aim 3 rely on the assumption that the ARM fragment is expressed, and at comparable levels to G2019S to account for phenotypes. The generation of a second clone is admirable, but the expression of the protein must be characterized. This is especially true because of the different LRRK2 levels between WT and G2019S. One could easily conceive of exogenous expression of a tagged-ARM fragment into LRRK2 KO cells, for example, as another proof-of-concept experiment. If it is truly dominant, does this effect require or benefit from some FL LRRK2? It seems easy enough to express the LRRK2-ARM in at least WT and KO RAW cells.
      • Does iron overload induce Rab8a phosphorylation in a LRRK2 KO cell? This would be a solid extension on the ARM data and support the important finding that an additional kinase(s) can phosphorylate Rab8a under these conditions, and while not unexpected, this may not have been demonstrated by others as clearly. It also addresses whether the ARM domain is important to this other putative kinase(s), which may add value to the authors' model.
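
The normalization concern raised in the comments above (pRab8 reported without reference to total Rab8) can be illustrated with a small calculation. The sketch below assumes simple background-subtracted densitometry values; the numbers and the helper name `normalized_phospho` are hypothetical and are not taken from the manuscript. It only shows how a higher raw phospho signal can correspond to a lower phospho-to-total ratio when total protein also differs between lines.

```python
def normalized_phospho(phospho, total):
    """Return the phospho signal expressed as a fraction of total protein signal."""
    if total <= 0:
        raise ValueError("total signal must be positive")
    return phospho / total


# Hypothetical densitometry values (arbitrary units), NOT data from the manuscript.
bands = {
    "WT": {"pRab8": 1.0, "Rab8": 1.0},
    "G2019S": {"pRab8": 1.5, "Rab8": 2.0},
}

# Ratio of phosphorylated to total Rab8 for each cell line.
ratios = {name: normalized_phospho(b["pRab8"], b["Rab8"]) for name, b in bands.items()}

for name, ratio in ratios.items():
    print(f"{name}: pRab8/Rab8 = {ratio:.2f}")
```

With these illustrative inputs the raw pRab8 band is stronger in G2019S, yet the normalized ratio is lower than in WT, which is the scenario the comment asks the investigators to rule out.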

      Minor concern - the abstract but not the introduction emphasizes a hypothesis that loss of neuromelanin may promote cell loss in PD (through loss of iron chelation), while post mortem studies are by definition only correlative, early works suggested that the higher melanized DA neurons were preferentially lost when compared to poorly melanized neurons in PD. This speculation in the abstract is not necessary to the novel findings of the paper.

      Significance

      This study could shed light on a novel and unexpected behavior of the LRRK2 protein, and open new insights into how pathogenic mutations may affect the cell. While studied in one cell line known for unusually high LRRK2 expression levels, data in this cell type have been broadly applicable elsewhere. Given the link to Parkinson's disease, Rab-dependent trafficking, and iron homeostasis, the findings could have import and relevance to a rather broad audience.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 Major comments: 1. Data demonstrated the statistical differences in MuSC behaviors between CD90+ve and CD90-ve cells. However, the difference is small. For example, it is unclear whether the minimal difference in CALCR expression level between CD90+ve and CD90-ve cells gives rise to any biological difference.

      We thank the reviewer for this thoughtful comment and agree that it is important to distinguish statistical significance from biological relevance. However, describing the differences as “small” is somewhat subjective, as it is not clear whether this refers to per-cell expression differences or the magnitude of downstream functional consequences. In the case of CALCR, even modest shifts in receptor abundance can be biologically meaningful in threshold-dependent ligand–receptor systems, and in our study, the CALCR difference is supported by concordant orthogonal readouts (flow cytometry/immunofluorescence and transcript levels, see revised Fig. 6 F-I) together with functional evidence showing differential sensitivity of CD90+ve versus CD90−ve MuSCs to Col6-mediated restraint (see Fig. 6 D-E). In addition, to directly address the Reviewer’s question, we now include in this revised version of the manuscript further experimental data showing a differential response between CD90+ve and CD90−ve MuSCs to pharmacological inhibition of the CALCR signaling pathway (see revised Fig. S9G, H, I). Taken together, these complementary molecular and functional findings indicate that, although the quantitative differences may not appear large in absolute terms, they are sufficient to generate consistent, measurable, and biologically meaningful outcomes.

      1. Negative controls of FACS analyses are required because different sizes of cells might exert different background intensities. (Figure 2I, 2L, and 6F).

      We have now added negative controls to Fig. 2L and 6F, demonstrating the specificity and size independence of the measured signal. Panel 2I, which highlighted size differences between CD90+ve and CD90−ve MuSCs in the previous version, is now relocated to Fig. S3J.

      1. If CD90+ve MuSCs express Col6 higher than CD90-ve MuSCs, they should also highly express the primary target of Notch target genes, Hes1, Hey1, and HeyL. The authors should examine the expression levels of these genes.

      We have now expanded Suppl Table 1 to include all genes upregulated in CD90+ve MuSCs with p-values

      1. As described above, the quantifications of many results, including MyoD, were based on the fluorescent intensity. I know the difficulty of preparing enough cells for experiments, but the authors need to present data supporting these results.

      The evaluation of fluorescent intensity as a reliable and sensitive readout of protein content has been used in multiple publications in the myogenesis field by independent authors and by us (de Morree et al. 2019; Florio et al. 2023; Vetter and Lawlor 2026; Zanotti et al. 2022). The availability of sufficient amounts of materials is an important limitation for proteomic studies, and we thank the Reviewer for acknowledging this. To demonstrate the reliability of our pixel quantification-based assay, we have confirmed selected datasets using an alternative quantification approach based on visual discrimination of MyoD+ve and MyoD-ve cells (see Fig. S3E and associated Fig. S3D in the revised version). The results we obtained confirm those from pixel-quantification. Moreover, to further corroborate the trustworthiness of our analytical approach based on pixel quantification, we performed in parallel western blot analysis and the evaluation of fluorescent intensity on proliferating and differentiating myoblasts using antibodies recognizing MyoD, one of the markers we used throughout the manuscript. The data derived from this parallel evaluation (see Additional Fig. 1 below) clearly demonstrate robust parallelism between the two quantification methods.

      1. Figure 7G-H; More quantitative analyses should be included. In addition, the sample number was different between Fig7E and H. There is no significant difference in the CD90 expression in Fig7G. The authors need to confirm the reproducibility.

      We thank the reviewer for the opportunity to clarify these points. We apologize for not having clearly explained the design of this experiment. Figure 7E refers to a different time-point than Fig. 7F-I (i.e., 1.5 vs. 4.5 days after injury; see Fig. 7D for a visualization of the experimental design). The analysis at 1.5 days post injury was performed in 3 independent biological replicates. The difference in sample numbers between Fig. 7F and Fig. 7H is due to the fact that the absolute cell count was not performed in one biological replicate of CD90-depleted muscle and one biological replicate of control muscle. As a result, those samples were included in the qualitative and morphometric analyses but could not be incorporated into the cell number quantification. Importantly, this discrepancy does not affect the overall outcome or statistical interpretation of the results. We have clarified this in the revised figure legends. Regarding Fig. 7G, we would like to specify that CD90 staining is not shown in this panel. We really apologize for the confusion. The images display laminin and embryonic MyHC (eMyHC) staining to highlight the size and regeneration status of newly formed myofibers. Therefore, CD90 expression in this panel is not relevant to the analysis presented. We have revised the legend to explicitly clarify this point and avoid potential confusion. Overall, we confirm the reproducibility of the findings and have improved the clarity of presentation in the revised manuscript.

      Minor comments: 1. Figure S4. The authors need to show evidence that these cells are proliferating. Without the evidence, CD90 expression may just be retained in non-dividing cells. If it is difficult, the results should be removed.

      We thank the reviewer for highlighting this possibility, and we have now removed panels D and E from Figure S4.

      1. Heterogeneity in cell cycle progression in MuSCs is well documented as fast and slow dividing cells. This reviewer recommends discussing the relevance of CD90 expression to these reports. PMID: 22349695 PMID: 8608871

      In the revised version of the manuscript, we have included the indicated reports in a paragraph of the discussion centered on a possible “division of labor” between CD90+ve and CD90-ve subsets of MuSCs during muscle regeneration. See also the response to the first major point raised by Reviewer #2.

      Reviewer #2

      Major Comments: 1. It is perplexing that the CD90+ fraction is implicated in activation, proliferation, and differentiation (Mgn+ data) while simultaneously contributing to the CD90-ve population (Fig. 3E). However, the reverse does not seem to occur, with CD90-ve cells not replenishing the CD90+ fraction. If the CD90+ subpopulation indeed accounts for the majority of myogenesis, this provokes the question: what is the functional role of the CD90− fraction? Notably, CD90-ve MuSCs appear to divide effectively during regeneration (Fig. 2E-G), further emphasizing the need to clarify their contribution to the overall regenerative process. The presence of a substantial number of CD90-ve MuSCs across conditions suggests they cannot simply be dismissed as irrelevant and understanding their role will help clearly establish the +/- subpopulations as functionally different.

      We thank the reviewer for raising this important point. We would like to clarify that we do not suggest at any stage that the CD90−ve MuSC population is insignificant or dispensable for regeneration. On the contrary, our data consistently show that CD90−ve MuSCs are numerically substantial across homeostatic and regenerative conditions and retain clear proliferative capacity during muscle repair (Fig. 2E-G). We fully agree that their persistence strongly argues for a functional role. Our study was designed primarily to identify and characterize functional differences between CD90+ve and CD90−ve MuSCs in terms of activation dynamics and quiescence control, rather than to comprehensively define the lineage hierarchy or long-term fate of each subpopulation. In this context, the observation that CD90+ve MuSCs can give rise to CD90−ve cells in vitro (Fig. 3E) suggests a degree of plasticity and may indicate that CD90 expression marks an activation-prone state rather than a rigid lineage boundary. The lack of reciprocal conversion under the tested conditions does not imply that CD90−ve cells are functionally inert, but rather that the two fractions may occupy distinct positions along a continuum of activation states. Importantly, our in vivo data demonstrate that CD90−ve MuSCs do enter the cell cycle during regeneration, albeit with slower kinetics compared to CD90+ve cells. This supports a model in which CD90+ve cells are primed for rapid early activation, while CD90−ve cells may represent a more dormant or reserve-like fraction that contributes to regeneration with delayed kinetics or plays a stabilizing role during regeneration. Such division of labor would be consistent with emerging concepts of functional heterogeneity within stem cell compartments. 
Similar “division-of-labor” models have already been proposed in other stem cell systems, including muscle, where subsets differ in proliferation kinetics, differentiation, or self-renewal behavior, as well as in other tissues. We have included a dedicated paragraph in the discussion to highlight this aspect in light of classical and recent literature. A detailed dissection of the long-term lineage contribution and self-renewal capacity of the CD90−ve fraction would be highly informative; however, addressing this question would require dedicated clonal tracing and transplantation experiments beyond the scope of the present study. We have now clarified this point in the revised Discussion, explicitly stating that our goal is to highlight differential activation modalities between the two subpopulations rather than to assign exclusive regenerative responsibility to one fraction. Taken together, our findings support the view that CD90+ve and CD90−ve MuSCs represent functionally distinguishable, yet complementary, subpopulations within the muscle stem cell pool, rather than hierarchically “major” versus “minor” or “relevant” versus “irrelevant” fractions.

      1. The depletion of CD90+ cells (Fig. 7D-I) is the correct experimental approach to assess the function of these cells in vivo. However, the method employed, using IP injections of a CD90 antibody, can lack specificity. Even with optimal specificity, CD90 is expressed on numerous cell types across the body. This raises the possibility that observed effects may result from targeting other CD90+ cells in skeletal muscle or other tissues, both locally and systemically. To mitigate these confounding factors, the authors should attempt strategies to reduce off-target effects. While the technical challenges are acknowledged by this reviewer and may be prohibitive, addressing these limitations would substantially enhance the impact of this work. Additionally, the embryonic myosin heavy chain (eMHC) images (Fig. 7G, H) should be more representative of the quantification data to ensure consistency.

      We thank the reviewer for this constructive comment and agree that antibody-mediated depletion strategies may raise concerns regarding specificity. As correctly pointed out, CD90 is expressed in additional cellular compartments beyond MuSCs, a limitation that we have explicitly acknowledged in the revised manuscript. Importantly, the anti-CD90 antibody used in this study is highly specific, as validated by flow cytometry and immunofluorescence analyses (see Fig. S1 and Fig. 1H). Moreover, the same clone (30-H12) has been previously employed by other groups for in vivo depletion approaches with comparable experimental aims, supporting its reliability for targeting CD90+ve cells (Powell et al. 2012; Zhou et al. 2022). While we cannot completely exclude effects on other CD90-expressing cells, our depletion strategy was performed in the context of acute muscle damage, with local intramuscular administration at the time of injury following systemic priming, which may partially limit potential broader systemic confounding effects. The timing of the phenotype - restricted to early regenerative stages - argues in favor of a local MuSCs-related contribution. We agree that genetic or lineage-restricted strategies would provide a more selective approach; however, such models are currently unavailable for selectively targeting CD90+ve MuSCs without affecting other CD90-expressing populations. Finally, as requested, we have replaced the representative eMHC images in Fig. 7G and 7H to better reflect the quantification data and ensure improved consistency between images and measured outcomes.

      1. Similar concerns about off-target effects noted in point #2, apply to the use of the Col6 KO mouse model, which appears to be a full body KO, meaning Col6 is absent not only in MuSCs but also in other cell types that typically express Col6. This deficiency would have been present throughout development, complicating the interpretation of the observed effects. The authors do acknowledge Col6 expression by non-MuSC cell types, but the in vivo impact remains challenging to interpret, particularly due to the potential developmental and systemic effects of removing Col6. Also, the observation that the CD90-ve subpopulation still expresses Calcr raises further questions about Col6 acting only on the CD90+ fraction and expression by MuSCs being consequential in vivo. The trend observed in Fig. 6M for CD90-ve cells suggests that this mechanism might not be exclusive to CD90+ cells, warranting further investigation or explanation since an outlier in the Col6KO CD90-ve group may have influenced interpretation.

      We thank the reviewer for this thoughtful comment and agree that the use of a constitutive Col6a1−/− model introduces interpretative limitations. As correctly noted, Col6 is expressed by multiple cell types, and its absence throughout development may potentially result in niche-level and systemic effects that complicate attribution of the phenotype exclusively to MuSC-derived Col6. We have clarified this point in the revised Discussion and tempered our conclusions accordingly. Importantly, we do not propose that the Collagen VI - CALCR axis operates exclusively in the CD90+ve fraction. CALCR is also expressed, although at lower levels, in CD90−ve MuSCs, and the in vivo data in Col6a1−/− mice indicate that both subpopulations are affected. Indeed, parameters such as MyoD modulation shift in the same direction in both fractions (see Fig. 6L), supporting the idea that this signaling axis is functionally active across the MuSC pool. However, several observations indicate that components of the Col6–CALCR axis are more pronounced in CD90+ve MuSCs and that certain responses are more robust in this fraction. These include higher Col6 and CALCR expression levels (Fig. 6A-C and 6F-I), a more pronounced increase in cell size upon Col6 ablation (Fig. 6K), and a clearer modulation of activation-associated readouts (levels of pAMPK and MyoD, and EdU incorporation) upon inhibition of the Col6-CALCR axis in vitro (see the newly introduced experiments with PKA inhibitor in Fig. S9 G-H of the revised version). Thus, rather than invoking strict differential sensitivity, our data support a quantitative model in which the pathway operates in both subpopulations but with greater amplitude or prominence in CD90+ve MuSCs. Regarding the trend observed in Fig. 6M for CD90−ve cells, we acknowledge that variability might have played a role and have revised the text to avoid overstatement. 
As mentioned above, we expanded characterization of the Col6-CALCR axis in the two subpopulations by performing additional ex vivo experiments to investigate the activation of the signaling pathway downstream of CALCR and the impact of its pharmacological inhibition on the two subpopulations of MuSCs. Overall, we conclude that the Collagen VI–CALCR axis is not exclusive to CD90+ve MuSCs, but that its components and functional consequences appear particularly evident in this fraction.

      1. The siCD90 experiment in Fig. 5 demonstrates effective KD at both the transcript and protein levels, but the observed impact on the proliferation of CD90+ cells (Fig. 5G), while statistically significant, appears to be less than expected. This result is also confusing given the substantial reduction in pAMPK levels observed in Fig. 5L, leading to the expectation of a more pronounced effect on proliferation if the proposed CD90-pAMPK mechanism is a driving pathway. Additionally, Fig. 5N suggests that pAMPK supports proliferation in both CD90+ and CD90− subpopulations. While the AICAR treatment in CD90− cells does not achieve significance, the data exhibit a bimodal distribution among replicates, with an apparent outlier in the control group potentially skewing the analysis. This variability necessitates further clarification for the relationship between CD90, pAMPK, and MuSC proliferation.

      We thank the reviewer for this careful evaluation of the siRNA and AICAR experiments. First, to improve clarity and better reflect the relationship between CD90 knockdown efficiency and biological outcome, we have now re-expressed the proliferation data in Fig. 5 F, G, L as percentage reduction relative to the CD90+ve scrambled control condition. Presenting the data in this normalized manner highlights a strong correlation between the extent of CD90 ablation (Fig. 5E) and the magnitude of the physiological effect. While the absolute change in proliferation may appear moderate, it scales consistently with the degree of CD90 protein reduction, supporting a dose-effect relationship rather than an all-or-none response. Our data support the interpretation that the CD90-AMPK axis contributes to, but does not solely determine, proliferative behavior. MuSC proliferation is likely governed by multiple converging pathways, and CD90-dependent modulation of AMPK represents an important component of this regulatory network. To further strengthen the mechanistic link, we have complemented the primary MuSCs data with gain-of-function experiments in C2C12 reserve cells (see Fig. S6 in this revised version). Overexpression of CD90 in this model enhances activation-associated features, including increased Myod upregulation, pAMPK levels, and augmented entry into the cell cycle upon stimulation, compared to control untransduced cells. These findings provide independent evidence that CD90 expression is sufficient to potentiate AMPK signaling and bias cells toward a more activation-prone state, supporting the causal nature of the CD90–AMPK axis beyond observations in primary MuSCs. Concerning the AICAR experiments (Fig. 5N), we acknowledge that in vivo pharmacological activation can be influenced by biological variability. It is indeed possible that AICAR exerts effects on both CD90+ve and CD90−ve populations, as AMPK is present in both fractions (Additional Fig. 2).
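
The re-expression described above (proliferation shown as percentage reduction relative to the CD90+ve scrambled control) can be sketched as follows. This is a minimal illustration that assumes the conventional definition of percent reduction; the reply does not state the exact formula used, and the function name and input values are hypothetical rather than data from the study.

```python
def percent_reduction(value, control):
    """Percent reduction of `value` relative to `control` (assumed definition)."""
    if control == 0:
        raise ValueError("control must be non-zero")
    return 100.0 * (control - value) / control


# Hypothetical EdU-positive fractions (%), NOT data from the manuscript.
scrambled_control = 40.0  # CD90+ve cells treated with scrambled siRNA
si_cd90 = 30.0            # CD90+ve cells treated with siCD90

reduction = percent_reduction(si_cd90, scrambled_control)
print(f"Reduction relative to scrambled control: {reduction:.1f}%")
```

Expressed this way, each condition is read against the same reference, so the size of the effect can be compared directly with the degree of knockdown.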

      1. The CD90 related findings in human samples appear less robust compared to those in mice. While the sorting successfully identifies sizable CD90+ and CD90-ve populations (Fig. 4A), the sequencing data show only small regions of high CD90 expression, as highlighted in red by the authors (Fig. 4C, D). Have the authors considered replicating the sequencing experiments within their own laboratory? While it is acknowledged that sourcing human tissue may be a limitation, it may strengthen the translational impact if possible.

      We thank the reviewer for this thoughtful comment. We agree that the CD90-related signal in the human scRNA-seq dataset appears less striking than in the murine Cy-TOF data; however, we believe that the most parsimonious explanation lies in the well-documented technical limitations of single-cell RNA sequencing, particularly its limited sensitivity for low-abundance transcripts. It is widely recognized that in scRNAseq experiments, the number of genes detected depends on the sequencing depth, and therefore scRNA-seq suffers from “dropout” effects and reduced detection efficiency, especially for transcripts expressed at moderate-to-low levels, such as those that do not encode abundant structural proteins (Kharchenko et al. 2014; Svensson et al. 2017). CD90 falls within this category, as it is not an abundant structural protein and may therefore be underrepresented at the mRNA level despite robust protein detection with FACS. Indeed, discrepancies between protein-level heterogeneity and scRNA-seq signal intensity are commonly reported, particularly for surface markers (Linderman et al. 2022; Stuart and Satija 2019). Importantly, in our study, the presence of substantial CD90+ve and CD90−ve human MuSC populations is robustly demonstrated by flow cytometry-based sorting (Fig. 4A-B), which directly measures protein abundance and shows a clear bimodal distribution. The scRNA-seq data were used as supportive, orthogonal evidence and are consistent with enrichment of CD90-expressing clusters, even if the signal is spatially restricted. Furthermore, functional assays (ex vivo EdU incorporation and activation parameters following injury) independently validate that CD90 marks a functionally distinct fraction in human muscle. While we agree that performing scRNA-seq in-house would be valuable, access to freshly isolated human MuSCs in sufficient numbers for high-depth single-cell sequencing remains technically and ethically challenging. 
Given that our conclusions rely primarily on protein-level stratification and functional validation, we believe that the translational relevance of the findings is adequately supported, and we hope the Reviewer will agree.

      Minor Comments: 1. Fig. 1D - the MuSC population has an uncharacteristically low representation amongst cells of uninjured muscle. Can the authors comment on this in text?

      We thank the reviewer for raising this point. We have re-examined our calculations on a per-sample basis, and the proportion of MuSCs among total mononuclear cells isolated from uninjured muscle ranges between approximately 0.8% and 3%. This frequency is within the lower end of the range typically reported for MuSCs isolated by FACS from adult uninjured murine muscle, which is commonly described to fall around ~1-4% depending on digestion protocol, gating strategy, and muscle type (Liu et al. 2015; Machado et al. 2017; Montarras et al. 2005). Importantly, we intentionally applied a conservative gating strategy to minimize contamination from non-myogenic populations and to ensure that CD90 detection was strictly restricted to bona fide MuSCs. While this approach may reduce the apparent overall frequency of MuSCs, it increases confidence in the purity of the analyzed population and in the interpretation of CD90-based subfractionation. To improve clarity and transparency, we have now included an additional figure (Additional Fig. 3) below detailing the full gating scheme, along with absolute numbers and percentages of MuSCs across samples. We have also added explanatory text in the legend of Figure 1D of the revised manuscript to explicitly address this point.

      1. Fig. 2 - the g-alert phenotype corresponding with CD90 expression is interesting. Can the authors add a molecular marker to confirm this phenotype?

      We have now added to our size, rosa locus activity, and mitochondrial content analysis the quantification of phosphorylated S6, a marker of cells in the alert state according to preexisting literature (Rodgers et al. 2014) (see Fig. 2J of the revised version). The more pronounced presence of phospho-S6 in CD90+ve MuSCs under “alerting” conditions supports our conclusion that CD90+ve MuSCs present a more rapid tendency to enter the G0alert state compared to their CD90-ve counterpart.

      1. The authors mention "significantly higher fraction of CD90+ve MuSCs incorporated EdU in vivo at 1.5, 2.75, and 6 days after injury..." in 2G, but it only seems 1.5 and 2.75 are different. The text should be corrected.

      We apologize for the mistake and thank the Reviewer for noticing this typo, which we have corrected in the current version of the manuscript.

      1. 3B, C - 7dpi seems late for the analyses of Myogenin at a single-cell level since most differentiating MuSCs are fused at this point. Can the authors comment in the text why 7 dpi was chosen?

      In adult murine skeletal muscle, most of the events associated with the regenerative process following acute injury typically unfold over approximately 14 days, with early activation and proliferation occurring within the first few days, followed by differentiation and fusion, and progressive maturation of newly formed fibers until ~30 days post-injury. Thus, 7 days post-injury represents an intermediate stage of the regenerative timeline, during which differentiation is still ongoing and newly formed fibers are not yet fully matured. At this time point, Myogenin-positive cells can still be detected, reflecting ongoing differentiation of MuSC progeny before complete fusion and maturation. That said, the major goal of this experiment was to begin exploring potential mechanisms restoring the initial 1:1 ratio between CD90+ve and CD90-ve MuSCs after injury resolution. Our observations reported in Fig. 3A suggest that ~6 days post-injury, the reequilibration process has started. The choice of 7 days post-injury for the myogenin analysis is a consequence of this observation. Moreover, around this time-point, CD90+ve and CD90-ve MuSCs showed similar EdU incorporation rates (see Fig. 2G). We have clarified this rationale in the revised manuscript to better contextualize the timing of the analysis.

      1. Note - 7L is very interesting, but emphasizes the major Q#1 above. Mdx mice are known to lose MuSC capacity due to continuous rounds of proliferation. So, the CD90+ loss is supportive, but there are still many CD90-ve cells present. In other figures, the authors demonstrated the negative fractions still harbor decent proliferative potential, so why is there no rescue?

      We thank the reviewer for highlighting this important aspect of our findings and for recognizing the relevance of the observation in Fig. 7L. We agree that in mdx mice, where MuSCs undergo repeated cycles of activation and proliferation, the preferential loss of the CD90+ve subset is consistent with the idea that this subpopulation may be particularly vulnerable to chronic regenerative stress. At the same time, we acknowledge that a substantial number of CD90−ve MuSCs remain present. As the Reviewer correctly notes, our data indicate that CD90−ve cells retain proliferative potential under acute regenerative conditions. However, our study was designed primarily to define differences in activation dynamics and quiescence control between CD90+ve and CD90−ve MuSCs, rather than to establish their relative capacity to sustain long-term regeneration in pathological contexts. Understanding why the remaining CD90−ve population does not compensate for the loss of CD90+ve cells in mdx muscle would require dedicated lineage-tracing, transplantation, or long-term functional assays, which go beyond the scope of the present work. One possible interpretation, consistent with our data, is that CD90−ve MuSCs exhibit slower activation kinetics and may not efficiently support the rapid or repeated regenerative demands characteristic of dystrophic muscle. Thus, they may be insufficient to fully rescue regeneration under chronic pathological stress. Future studies specifically addressing the regenerative potential of purified CD90−ve MuSCs in transplantation or chronic injury models will be required to resolve this question in detail. We have clarified this point in the revised Discussion and tempered our interpretation accordingly.

      1. The authors should clearly state the number of cells and number of replicates for their single cell distribution graphs in all legends.

      We have added this information to all relevant legends, where it was not already present.

      1. In their stats section of methods, ns = p ≥ 0.15, please clarify.

      We thank the reviewer for pointing this out and apologize for the lack of clarity. The threshold “ns ≥ 0.15” was introduced as an arbitrary and conservative criterion to avoid labeling as “non-significant” comparisons with p values only marginally above the conventional 0.05 threshold. Our intention was to distinguish likely truly non-significant results (p ≥ 0.15) from those showing a statistical trend (0.05

      Reviewer #3

      Major comments

      1. While the manuscript provides valuable insights into the functional heterogeneity of MuSCs, there are some critical aspects that remain unclear. Specifically, the mechanism by which the quiescence of CD90+ MuSCs, maintained through the COL6-CALCR pathway, confers an advantage for their rapid activation is not sufficiently addressed. Understanding why this pathway enables such responsiveness would significantly strengthen the authors' conclusions. Additionally, the manuscript does not elucidate how the quiescence of CD90-negative MuSCs is maintained, leaving a gap in the characterization of MuSC heterogeneity. Without this information, the functional significance of this heterogeneity, particularly in the context of muscle regeneration, remains incomplete. It would be interesting to explore why CD90-negative cells appear less responsive to injury or why CD90+ cells are more readily activated. Addressing these questions would provide a more comprehensive understanding of the biological implications of MuSC heterogeneity and enhance the impact of the study.

      We thank the reviewer for this thoughtful and conceptually important comment. We agree that clarifying the mechanistic basis underlying the differential activation propensity of CD90+ve and CD90−ve MuSCs strengthens the interpretation of functional heterogeneity. To address this point, we have expanded the mechanistic component of the study in two directions. First, to better understand why CD90+ve MuSCs display a more pronounced activation profile, we performed gain-of-function experiments in C2C12 reserve cells by overexpressing CD90. These experiments, reported in Figure S6 of this revised version, demonstrate that CD90 overexpression enhances activation-associated features, supporting a causal link between CD90 expression and activation propensity. This complements the loss-of-function data in primary MuSCs and reinforces the concept of a CD90-AMPK axis contributing to a primed metabolic state. Second, regarding quiescence, we have further dissected the COL6-CALCR pathway by analyzing downstream signaling components and comparing its mechanistic features with the previously described COL5-CALCR axis (Baghdadi et al. 2018). Our new data show that COL6 engagement leads to modulation of downstream effectors (e.g., YAP localization and PKA-dependent signaling), consistent with a CALCR-mediated quiescence program that shares similarities with COL5-driven regulation (see Fig.S9 of the revised version). We have expanded the Discussion to more clearly articulate this mechanistic convergence. Importantly, we do not propose that COL6-mediated quiescence directly “confers” activation capacity in a deterministic sense. Rather, our model suggests that CD90+ve MuSCs exist in a poised state: they exhibit an intrinsically primed activation program (via CD90-AMPK), while concurrently maintaining quiescence through a COL6-CALCR-dependent restraint. 
This dual regulatory architecture may allow rapid transition upon injury without premature exhaustion, thereby providing a kinetic advantage. In this context, the destruction of muscle extracellular architecture associated with injury would release the “brake” imposed by Collagen VI on the activation of CD90+ve cells. Regarding CD90−ve MuSCs, we acknowledge that the precise mechanisms maintaining their quiescence remain incompletely defined in this study. However, our data suggest that they are less reliant on the COL6–CALCR axis. A full dissection of these pathways would require dedicated transcriptional and signaling analyses beyond the scope of the present work. We have clarified this aspect in the revised Discussion. Finally, we have further elaborated in the Discussion that CD90+ve and CD90−ve MuSCs may represent functionally complementary subpopulations: CD90+ve cells being primed for rapid early activation, and CD90−ve cells potentially contributing under different temporal or regenerative contexts. We believe that these additions provide a more comprehensive framework for understanding the biological implications of MuSC heterogeneity while maintaining appropriate caution regarding unresolved mechanistic aspects.

      2. I am particularly concerned about Figure 7. As the authors mentioned, CD90 is not specific to MuSCs. Therefore, the conclusion that CD90+ MuSCs are important for muscle regeneration based on the current experiment is not fully convincing. I suggest incorporating additional approaches to confirm this point. For example, transplantation of CD90+ or CD90- MuSCs into injured muscles would provide stronger support for their findings.

      We thank the reviewer for raising this important concern, which was also emphasized by Reviewer #2. We fully agree that the lack of absolute specificity of CD90 for MuSCs represents a limitation when interpreting in vivo depletion experiments. A number of observations somewhat mitigate the concern (see answer to point #2 of Reviewer 2). At present, however, there are no available genetic tools that would allow selective targeting of the CD90+ve MuSC subpopulation. Addressing this question definitively would likely require the generation of compound mouse models combining at least two independent genetic modifications (e.g., a MuSC-specific driver together with a CD90-dependent conditional ablation system), which are currently not available. We have explicitly clarified this limitation in the revised manuscript. Regarding the suggestion of transplantation experiments, we agree that this approach is often used to assess regenerative potential. However, in the specific context of our study, transplantation may not directly resolve the key mechanistic question we are addressing. Indeed, isolation and transplantation procedures inevitably activate MuSCs due to enzymatic digestion and removal from their niche, thereby erasing differences related to activation kinetics and quiescence maintenance. Since the central focus of our work is the differential propensity for activation and the regulation of quiescence between CD90+ve and CD90−ve subpopulations, transplantation of already activated cells may obscure precisely the phenotypic differences we aim to characterize. Importantly, our conclusions do not rely solely on the depletion experiment in Fig. 7. The functional relevance of CD90+ve MuSCs is supported by multiple complementary lines of evidence, including differences in activation kinetics, AMPK signaling, response to Collagen VI, and their preferential depletion in dystrophic muscle. 
The in vivo antibody-mediated depletion, therefore, serves as supportive, rather than exclusive, functional validation. We have revised the Discussion to explicitly acknowledge these technical constraints (and, therefore, temper our conclusions), while emphasizing that the available evidence supports a functional contribution of the CD90+ve fraction to early regenerative dynamics.

      3. A previous study has demonstrated that the COL5-CALCR pathway is essential for maintaining MuSC quiescence. In this manuscript, the authors propose the COL6-CALCR pathway; however, the current study lacks specific experiments to clarify the differences and similarities between these pathways. Additionally, the discussion section does not adequately address these points, leaving the interpretation incomplete. A more thorough discussion comparing these pathways would significantly improve the manuscript.

      We thank the reviewer for highlighting this important conceptual point. We agree that a clearer comparison between the previously described Col5-CALCR pathway and the Col6-CALCR axis proposed in our study strengthens the interpretation of our findings. To directly address this issue, we have now included two additional sets of experiments in the revised manuscript. First, we performed a comparative analysis of Col5, Col6, and Col4 as substrates, evaluating their ability to activate downstream CALCR signaling, using the reduction of nuclear YAP accumulation as a functional readout (Zhang et al. 2019). Although only the effect induced by Col6 was statistically different from that induced by Col4, these experiments suggest that both Col5 and Col6 can activate this pathway (see Fig. S9E-F of the revised manuscript). Second, we assessed activation and proliferation parameters in freshly isolated CD90+ve MuSCs plated on Col5, Col6, or Col4 substrates. This allowed us to directly compare the functional consequences of these different ECM components on activation kinetics and proliferative behavior within the same experimental framework. The results indicate overlapping effects of Col6 and Col5, distinct from those induced by Col4. These observations support the idea that Col6 contributes to quiescence regulation in a manner that is at least partially convergent with the previously described Col5 pathway (see Fig. S9A-B of the revised manuscript). Importantly, the common effect induced by Col6 and Col5 appears to be specific, as Col1 behaves similarly to Col4 under similar testing conditions (see Fig. S9C-D of the revised manuscript). In addition to incorporating these new data, we have substantially expanded the Discussion to more thoroughly compare the COL5-CALCR and COL6-CALCR axes, emphasizing both shared mechanisms (CALCR engagement and quiescence modulation) and potential differences in expression patterns, subpopulation bias, and magnitude of response. 
We believe that these additions significantly clarify the relationship between the two pathways and strengthen the overall mechanistic framework of the manuscript.

      Minor comments

      1. It would be interesting to see the spatial localization of CD90+ and CD90- MuSCs in skeletal muscle tissue.

      We would like to point out that the spatial localization of CD90+ve and CD90−ve MuSCs within skeletal muscle tissue is already shown in Fig. 1H and Fig. 6B, where immunofluorescence analysis of muscle cryosections demonstrates the presence of both subpopulations in their native niche. In these images, CD90 staining is visualized in combination with MuSC markers, allowing identification of CD90+ve and CD90−ve MuSCs in situ. We have added further examples to this response to Reviewers, in which Col6 staining is also highlighted (Additional Fig. 4). To improve clarity, we have revised the legend for Fig. 1H to more explicitly highlight this aspect and guide the reader to the relevant panel.

      2. I suggest conducting a more thorough investigation to characterize the quiescent CD90+ and CD90- MuSCs, particularly focusing on aspects such as protein translation machinery.

      We thank the reviewer for this insightful suggestion. We agree that a deeper characterization of quiescent CD90+ve and CD90−ve MuSCs, including analysis of protein translation machinery and related metabolic features, would provide valuable additional insight into the molecular basis of their functional differences. However, such an in-depth investigation would require dedicated molecular investigations, such as proteomic and/or ribosome profiling approaches, and goes beyond the scope of the present study, which is focused primarily on differential activation dynamics and quiescence regulation between the two subpopulations. We believe this represents an important and promising direction for future work.

      3. The authors should include a discussion of previously identified markers of MuSC heterogeneity, such as CD34, to provide better context for their findings.

      We thank the reviewer for this helpful suggestion. We agree that placing our findings in the context of previously described markers of MuSC heterogeneity is important. In the manuscript, we have explicitly evaluated the relationship between CD90+ve and CD90−ve MuSCs and previously reported heterogeneity markers, including CD34. These analyses are presented in Fig. S2, where we show that CD90-based stratification does not simply recapitulate previously defined subsets. For clarity, we have also added a new summary table (Additional Table 1 below) highlighting the limited overlap between CD90-defined fractions and other reported markers of MuSC heterogeneity. Furthermore, we have expanded the Discussion to note that CD90 does not align with markers such as CD34 and others described in the literature, emphasizing that CD90 identifies a functionally distinct layer of heterogeneity, primarily related to activation kinetics and quiescence regulation, rather than directly overlapping with previously characterized subpopulations.

      4. In Figure 6, the authors used COL4 as a negative control; however, this is insufficient to conclusively demonstrate the importance of COL6 in maintaining CD90+ MuSC quiescence. Including additional substrates beyond collagen, such as fibronectin or laminin, along with COL5, would strengthen the conclusions drawn from these experiments.

      We thank the reviewer for this valuable suggestion. We agree that expanding the range of substrates strengthens the interpretation of the role of COL6 in regulating MuSC quiescence. In the revised manuscript, we have now included additional comparative experiments using Collagen V and Collagen I as alternative substrates. Collagen V, consistent with previous reports implicating the COL5–CALCR axis in MuSC quiescence, produced effects qualitatively similar to those observed with Collagen VI, supporting a partially convergent mechanism at the level of CALCR signaling (see also the response to major comment #3 above) (Fig. S9A-B of the revised version). In contrast, Collagen I was less effective at promoting quiescence-associated features in CD90+ve MuSCs, yielding results similar to Col4 in terms of EdU incorporation and expression of pAMPK (see Fig. S9C-D of the revised version). These findings reinforce the idea that Collagen VI (and Collagen V) are not merely generic ECM components, but exert specific regulatory effects on MuSC activation dynamics. We have incorporated these new data into Figure S9, where we have also created a graphical scheme to summarize and better contextualize similarities and differences among ECM substrates in shaping MuSC behavior (Fig. S9J).

      ADDITIONAL REFERENCES Almada AE, Horwitz N, Price FD, Gonzalez AE, Ko M, Bolukbasi OV, Messemer KA, Chen S, Sinha M, Rubin LL, et al. 2021. FOS licenses early events in stem cell activation driving skeletal muscle regeneration. Cell Reports 34: 108656. Baghdadi MB, Castel D, Machado L, Fukada S-I, Birk DE, Relaix F, Tajbakhsh S, Mourikis P. 2018. Reciprocal signalling by Notch-Collagen V-CALCR retains muscle stem cells in their niche. Nature 557: 714–718. Beauchamp JR, Heslop L, Yu DSW, Tajbakhsh S, Kelly RG, Wernig A, Buckingham ME, Partridge TA, Zammit PS. 2000. Expression of CD34 and Myf5 defines the majority of quiescent adult skeletal muscle satellite cells. Journal of Cell Biology 151: 1221–1233. Cerletti M, Jurga S, Witczak CA, Hirshman MF, Shadrach JL, Goodyear LJ, Wagers AJ. 2008. Highly Efficient, Functional Engraftment of Skeletal Muscle Stem Cells in Dystrophic Muscles. Cell 134: 37–47. Chakkalakal JV, Christensen J, Xiang W, Tierney MT, Boscolo FS, Sacco A, Brack AS. 2014. Early forming label-retaining muscle stem cells require p27kip1 for maintenance of the primitive state. Development (Cambridge, England) 141: 1649–59. Chakkalakal JV, Jones KM, Basson MA, Brack AS. 2012. The aged niche disrupts muscle stem cell quiescence. Nature 490: 355–360. de Morree A, Klein JDD, Gan Q, Farup J, Urtasun A, Kanugovi A, Bilen B, van Velthoven CTJ, Quarta M, Rando TA. 2019. Alternative polyadenylation of Pax3 controls muscle stem cell fate and muscle function. Science 366: 734–738. Der Vartanian A, Quétin M, Michineau S, Auradé F, Hayashi S, Dubois C, Rocancourt D, Drayton-Libotte B, Szegedi A, Buckingham M, et al. 2019. PAX3 Confers Functional Heterogeneity in Skeletal Muscle Stem Cell Responses to Environmental Stress. Cell Stem Cell 24: 958-973.e9. Florio F, Vencato S, Papa FT, Libergoli M, Kheir E, Ghzaiel I, Rando TA, Torrente Y, Biressi S. 2023. 
Combinatorial activation of the WNT-dependent fibrogenic program by distinct complement subunits in dystrophic muscle. EMBO Molecular Medicine 15: 1–20. García-Prat L, Perdiguero E, Alonso-Martín S, Dell’Orso S, Ravichandran S, Brooks SR, Juan AH, Campanario S, Jiang K, Hong X, et al. 2020. FoxO maintains a genuine muscle stem-cell quiescent state until geriatric age. Gayraud-Morel B, Chrétien F, Jory A, Sambasivan R, Negroni E, Flamant P, Soubigou G, Coppée J-Y, Di Santo J, Cumano A, et al. 2012. Myf5 haploinsufficiency reveals distinct cell fate potentials for adult skeletal muscle stem cells. Journal of Cell Science 125: 1738–1749. Guardiola O, Iavarone F, Nicoletti C, Ventre M, Rodríguez C, Pisapia L, Andolfi G, Saccone V, Patriarca EJ, Puri PL, et al. 2023. CRIPTO-based micro-heterogeneity of mouse muscle satellite cells enables adaptive response to regenerative microenvironment. Developmental Cell 58: 2896-2913.e6. Kharchenko PV, Silberstein L, Scadden DT. 2014. Bayesian approach to single-cell differential expression analysis. Nat Methods 11: 740–742. Kuang S, Kuroda K, Le Grand F, Rudnicki MA. 2007. Asymmetric Self-Renewal and Commitment of Satellite Stem Cells in Muscle. Cell 129: 999–1010. Linderman GC, Zhao J, Roulis M, Bielecki P, Flavell RA, Nadler B, Kluger Y. 2022. Zero-preserving imputation of single-cell RNA-seq data. Nature Communications 13: 1–11. Liu L, Cheung TH, Charville GW, Rando TA. 2015. Isolation of skeletal muscle stem cells by fluorescence-activated cell sorting. Nature Protocols 10: 1612–1624. Machado L, Esteves de Lima J, Fabre O, Proux C, Legendre R, Szegedi A, Varet H, Ingerslev LR, Barrès R, Relaix F, et al. 2017. In Situ Fixation Redefines Quiescence and Early Activation of Skeletal Muscle Stem Cells. Cell Reports 21: 1982–1993. Montarras D, Morgan J, Collins C, Relaix F, Zaffran S, Cumano A, Partridge T, Buckingham M. 2005. Direct isolation of satellite cells for skeletal muscle regeneration. 
Science (New York, NY) 309: 2064–2067. Powell N, Walker AW, Stolarczyk E, Canavan JB, Gökmen MR, Marks E, Jackson I, Hashim A, Curtis MA, Jenner RG, et al. 2012. The Transcription Factor T-bet Regulates Intestinal Inflammation Mediated by Interleukin-7 Receptor+ Innate Lymphoid Cells. Immunity 37: 674–684. Relaix F, Montarras D, Zaffran S, Gayraud-Morel B, Rocancourt D, Tajbakhsh S, Mansouri A, Cumano A, Buckingham M. 2006. Pax3 and Pax7 have distinct and overlapping functions in adult muscle progenitor cells. Journal of Cell Biology 172: 91–102. Rocheteau P, Gayraud-Morel B, Siegl-Cachedenier I, Blasco MA, Tajbakhsh S. 2012. A subpopulation of adult skeletal muscle stem cells retains all template DNA strands after cell division. Cell 148: 112–125. Rodgers JT, King KY, Brett JO, Cromie MJ, Charville GW, Maguire KK, Brunson C, Mastey N, Liu L, Tsai CR, et al. 2014. mTORC1 controls the adaptive transition of quiescent stem cells from G0 to GAlert. Nature 510: 393–396. Scaramozza A, Park D, Kollu S, Beerman I, Sun X, Rossi DJ, Lin CP, Scadden DT, Crist C, Brack AS. 2019. Lineage Tracing Reveals a Subset of Reserve Muscle Stem Cells Capable of Clonal Expansion under Stress. Cell Stem Cell 24: 944-957.e5. Stuart T, Satija R. 2019. Integrative single-cell analysis. Nat Rev Genet 20: 257–272. Svensson V, Natarajan KN, Ly L-H, Miragaia RJ, Labalette C, Macaulay IC, Cvejic A, Teichmann SA. 2017. Power analysis of single-cell RNA-sequencing experiments. Nat Methods 14: 381–387. Vetter TA, Lawlor MW. 2026. Automated Quantification of Dystrophin Expression by Immunofluorescence in Humans and Animal Models. Methods Mol Biol 2975: 67–87. Zanotti S, Magri F, Poggetti F, Ripolone M, Velardo D, Fortunato F, Ciscato P, Moggio M, Corti S, Comi GP, et al. 2022. Immunofluorescence signal intensity measurements as a semi-quantitative tool to assess sarcoglycan complex expression in muscle biopsy. Eur J Histochem 66: 3418. 
Zhang L, Noguchi Y-T, Nakayama H, Kaji T, Tsujikawa K, Ikemoto-Uezumi M, Uezumi A, Okada Y, Doi T, Watanabe S, et al. 2019. The CalcR-PKA-Yap1 Axis Is Critical for Maintaining Quiescence in Muscle Stem Cells. Cell Rep 29: 2154-2163.e5. Zhou L, Zhou W, Joseph AM, Chu C, Putzel GG, Fang B, Teng F, Lyu M, Yano H, Andreasson KI, et al. 2022. Group 3 innate lymphoid cells produce the growth factor HB-EGF to protect the intestine from TNF-mediated inflammation. Nature immunology 23: 251–261.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary: In this manuscript, the authors explore the heterogeneity of adult muscle stem cells (MuSCs) in murine and human skeletal muscles. They identify diverse expression levels of CD90 on MuSCs and demonstrate that CD90+ MuSCs are primed for myogenic commitment during muscle regeneration. The authors show that CD90+ MuSCs rapidly enter the cell cycle upon MYOD activation, mediated by rapid AMPK phosphorylation. Furthermore, they investigate the characteristics of both CD90+ and CD90- MuSCs in the quiescent state, revealing that the quiescence of CD90+ MuSCs is maintained through the COL6-CALCR pathway, an original finding of this study. Lastly, using an antibody-mediated depletion model targeting CD90+ MuSCs, they confirm the critical role of this population in muscle regeneration. Antibody-induced depletion or imbalance of CD90+ MuSCs results in impaired or delayed muscle regeneration.

      Major comments

      1. While the manuscript provides valuable insights into the functional heterogeneity of MuSCs, there are some critical aspects that remain unclear. Specifically, the mechanism by which the quiescence of CD90+ MuSCs, maintained through the COL6-CALCR pathway, confers an advantage for their rapid activation is not sufficiently addressed. Understanding why this pathway enables such responsiveness would significantly strengthen the authors' conclusions. Additionally, the manuscript does not elucidate how the quiescence of CD90-negative MuSCs is maintained, leaving a gap in the characterization of MuSC heterogeneity. Without this information, the functional significance of this heterogeneity, particularly in the context of muscle regeneration, remains incomplete. It would be interesting to explore why CD90-negative cells appear less responsive to injury or why CD90+ cells are more readily activated. Addressing these questions would provide a more comprehensive understanding of the biological implications of MuSC heterogeneity and enhance the impact of the study.
      2. I am particularly concerned about Figure 7. As the authors mentioned, CD90 is not specific to MuSCs. Therefore, the conclusion that CD90+ MuSCs are important for muscle regeneration based on the current experiment is not fully convincing. I suggest incorporating additional approaches to confirm this point. For example, transplantation of CD90+ or CD90- MuSCs into injured muscles would provide stronger support for their findings.
      3. A previous study has demonstrated that the COL5-CALCR pathway is essential for maintaining MuSC quiescence. In this manuscript, the authors propose the COL6-CALCR pathway; however, the current study lacks specific experiments to clarify the differences and similarities between these pathways. Additionally, the discussion section does not adequately address these points, leaving the interpretation incomplete. A more thorough discussion comparing these pathways would significantly improve the manuscript.

      Minor comments

      1. It would be interesting to see the spatial localization of CD90+ and CD90- MuSCs in skeletal muscle tissue.
      2. I suggest conducting a more thorough investigation to characterize the quiescent CD90+ and CD90- MuSCs, particularly focusing on aspects such as protein translation machinery.
      3. The authors should include a discussion of previously identified markers of MuSC heterogeneity, such as CD34, to provide better context for their findings.
      4. In Figure 6, the authors used COL4 as a negative control; however, this is insufficient to conclusively demonstrate the importance of COL6 in maintaining CD90+ MuSC quiescence. Including additional substrates beyond collagen, such as fibronectin or laminin, along with COL5, would strengthen the conclusions drawn from these experiments.

      Significance

      Overall, this study provides a unique perspective on the functional heterogeneity of MuSCs by using CD90 as a marker to delineate distinct MuSC subpopulations. This approach sheds light on the specific roles of CD90+ MuSCs in muscle regeneration and offers new insights into the regulatory mechanisms governing MuSC function.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Kheir et al. explore the heterogeneity within the MuSC compartment, identifying a CD90+ subpopulation with enhanced activation and proliferative capacity compared to its CD90-ve counterpart. While functional heterogeneity in MuSCs is a well-recognized and intriguing area of study, prior research has often focused on differences inferred from transcriptional profiles. This current study advances the field by intriguingly linking CD90 expression to distinct functional outcomes, thereby providing more compelling evidence for the existence of this subpopulation. The authors further investigate potential mechanisms, suggesting a connection between CD90 expression and intrinsic pAMPK activity as a driver of proliferation, as well as a role for Col6-Calcr binding in maintaining quiescence. While the majority of experiments are rigorously conducted, additional studies are suggested to determine whether the CD90+ fraction represents a subpopulation with substantial functional relevance. Detailed critiques and recommendations for further studies are outlined below.

      Major Comments:

      1. It is perplexing that the CD90+ fraction is implicated in activation, proliferation, and differentiation (Mgn+ data) while simultaneously contributing to the CD90-ve population (Fig. 3E). However, the reverse does not seem to occur, with CD90-ve cells not replenishing the CD90+ fraction. If the CD90+ subpopulation indeed accounts for the majority of myogenesis, this provokes the question: what is the functional role of the CD90− fraction? Notably, CD90-ve MuSCs appear to divide effectively during regeneration (Fig. 2E-G), further emphasizing the need to clarify their contribution to the overall regenerative process. The presence of a substantial number of CD90-ve MuSCs across conditions suggests they cannot simply be dismissed as irrelevant and understanding their role will help clearly establish the +/- subpopulations as functionally different.
      2. The depletion of CD90+ cells (Fig. 7D-I) is the correct experimental approach to assess the function of these cells in vivo. However, the method employed, using IP injections of a CD90 antibody, can lack specificity. Even with optimal specificity, CD90 is expressed on numerous cell types across the body. This raises the possibility that observed effects may result from targeting other CD90+ cells in skeletal muscle or other tissues, both locally and systemically. To mitigate these confounding factors, the authors should attempt strategies to reduce off-target effects. While the technical challenges are acknowledged by this reviewer and may be prohibitory, addressing these limitations would substantially enhance the impact of this work. Additionally, the embryonic myosin heavy chain (eMHC) images (Fig. 7G, H) should be more representative of the quantification data to ensure consistency.
      3. Similar concerns about off-target effects noted in point #2, apply to the use of the Col6 KO mouse model, which appears to be a full body KO, meaning Col6 is absent not only in MuSCs but also in other cell types that typically express Col6. This deficiency would have been present throughout development, complicating the interpretation of the observed effects. The authors do acknowledge Col6 expression by non-MuSC cell types, but the in vivo impact remains challenging to interpret, particularly due to the potential developmental and systemic effects of removing Col6. Also, the observation that the CD90-ve subpopulation still expresses Calcr raises further questions about Col6 acting only on the CD90+ fraction and expression by MuSCs being consequential in vivo. The trend observed in Fig. 6M for CD90-ve cells suggests that this mechanism might not be exclusive to CD90+ cells, warranting further investigation or explanation since an outlier in the Col6KO CD90-ve group may have influenced interpretation.
      4. The siCD90 experiment in Fig. 5 demonstrates effective KD at both the transcript and protein levels, but the observed impact on the proliferation of CD90+ cells (Fig. 5G), while statistically significant, appears to be less than expected. This result is also confusing given the substantial reduction in pAMPK levels observed in Fig. 5L, leading to the expectation of a more pronounced effect on proliferation if the proposed CD90-pAMPK mechanism is a driving pathway. Additionally, Fig. 5N suggests that pAMPK supports proliferation in both CD90+ and CD90− subpopulations. While the AICAR treatment in CD90− cells does not achieve significance, the data exhibit a bimodal distribution among replicates, with an apparent outlier in the control group potentially skewing the analysis. This variability necessitates further clarification for the relationship between CD90, pAMPK, and MuSC proliferation.
      5. The CD90 related findings in human samples appear less robust compared to those in mice. While the sorting successfully identifies sizable CD90+ and CD90-ve populations (Fig. 4A), the sequencing data show only small regions of high CD90 expression, as highlighted in red by the authors (Fig. 4C, D). Have the authors considered replicating the sequencing experiments within their own laboratory? While it is acknowledged that sourcing human tissue may be a limitation, it may strengthen the translational impact if possible.

      Minor Comments:

      1. Fig. 1D - the MuSC population has an uncharacteristically low representation amongst cells of uninjured muscle. Can the authors comment on this in text?
      2. Fig. 2 - the g-alert phenotype corresponding with CD90 expression is interesting. Can the authors add a molecular marker to confirm this phenotype?
      3. The authors mention "significantly higher fraction of CD90+ve MuSCs incorporated EdU in vivo at 1.5, 2.75, and 6 days after injury..." in 2G, but it only seems 1.5 and 2.75 are different. The text should be corrected.
      4. 3B, C - 7dpi seems late for the analyses of Myogenin at a single-cell level since most differentiating MuSCs are fused at this point. Can the authors comment in the text why 7 dpi was chosen?
      5. Note - 7L is very interesting, but emphasizes the major Q#1 above. Mdx mice are known to lose MuSC capacity due to continuous rounds of proliferation. So, the CD90+ loss is supportive, but there are still many CD90-ve cells present. In other figures, the authors demonstrated the negative fractions still harbor decent proliferative potential, so why is there no rescue?
      6. The authors should clearly state the number of cells and number of replicates for their single cell distribution graphs in all legends.
      7. In their stats section of methods, ns = p ≥ 0.15; please clarify.

      Significance

      General Assessment: The study is well conducted and addresses MuSC functional heterogeneity. There seems to be substantial evidence that CD90 fractionates the MuSC population and is related to proliferative capacity. Functional assessment in vivo needs some clarification with additional experiments, but the study seems promising. The interpretation of some graphs should also be updated, since the distribution of replicates may be affecting statistical significance in ways that could alter the interpretation and outcomes.

      Advance: Again, MuSC heterogeneity has been an area of intense investigation for many years. The advancement would be mechanistic/functional.

      Audience: Specialized in skeletal muscle. There is potential for the CD90 fractionation to extend to other cell and tissue types, but this extent is unknown until this work is expanded.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: Kheir and colleagues found that CD90 expression levels could be used to divide MuSCs into two populations. The authors demonstrated that CD90+ve MuSCs entered an activated state faster than CD90-ve cells. Mechanistically, AMPK is more strongly activated in CD90+ve cells in response to niche loss. To suppress this high responsiveness to activation signals, CD90+ve cells express higher levels of ColVI and CALCR than CD90-ve cells. The authors carefully examined the heterogeneous expression of CD90 in MuSCs. Overall, however, the differences in the characteristics of CD90+ve and CD90-ve cells are small. In addition, most of the data were based on fluorescence intensity. This reviewer does not feel that this study will have a significant impact on our understanding of MuSC biology.

      Major comments:

      1. The data demonstrate statistical differences in MuSC behaviors between CD90+ve and CD90-ve cells. However, the differences are small. For example, it is unclear whether the minimal difference in CALCR expression level between CD90+ve and CD90-ve cells gives rise to any biological difference.
      2. Negative controls of FACS analyses are required because different sizes of cells might exert different background intensities. (Figure 2I, 2L, and 6F)
      3. If CD90+ve MuSCs express higher levels of Col6 than CD90-ve MuSCs, they should also highly express the primary Notch target genes Hes1, Hey1, and HeyL. The authors should examine the expression levels of these genes.
      4. As described above, the quantifications of many results, including MyoD, were based on fluorescence intensity. I know the difficulty of preparing enough cells for experiments, but the authors need to present data supporting these results.
      5. Figure 7G-H: More quantitative analyses should be included. In addition, the sample number differed between Fig. 7E and H. There is no significant difference in CD90 expression in Fig. 7G. The authors need to confirm the reproducibility.

      Minor comments:

      1. Figure S4. The authors need to show evidence that these cells are proliferating. Without the evidence, CD90 expression may just be retained in non-dividing cells. If it is difficult, the results should be removed.
      2. Heterogeneity in cell cycle progression in MuSCs is well documented as fast and slow dividing cells. This reviewer recommends discussing the relevance of CD90 expression to these reports. PMID: 22349695 PMID: 8608871

      Significance

      The heterogeneity of muscle stem cells is of great interest to muscle stem cell biologists, including this reviewer. The orchestrated expression and regulation of activation and quiescence pathways is conceptually new. Several molecules are heterogeneously expressed in muscle stem cells, but the expression pattern of CD90 does not correlate with them. However, as noted above, the difference between CD90-positive and CD90-negative cells is relatively small.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Programmed cell death is prominent in developing nervous systems across evolution, but its function remains obscure. Recent work suggests that it might impact behavior, but its effects on behavior and underlying neuronal circuits in intact organisms have not been examined. In this manuscript we report that programmed cell death sculpts the developing nervous system and shapes innate behavior. Using synaptic labeling, in vivo calcium imaging, targeted rescue of programmed cell death, and automated high-resolution analysis of cell death mutants, we find that loss of programmed cell death alters animal behavior. These findings reveal that neuronal cell death during development provides a reservoir of fates and circuit connections that could be accessed on evolutionary time scales to modify innate behavioral programs. Our manuscript thus answers one of the major outstanding questions in developmental neuroscience, namely why programmed cell death is so prevalent, by identifying consequences for brain function at the subcellular, cellular, circuit, and behavioral level. This study will be of interest to those interested in evolution of the nervous system and behavior, developmental biology, and neural circuit development.

      We thank the reviewers for their careful attention to the manuscript. Both reviewers were enthusiastic about the work. Here we address their suggestions. As noted below, we have already addressed most of their points, and we discuss in detail the remaining point—whether it is possible to perform experiments for a more specific targeting of the undead RIM cell death event to provide additional evidence for its role in altering reversal behavior.

      2. Description of the planned revisions

      Reviewer 1: “1. The argument that differences in reversal behavior are likely attributable to the difference in RIM neuron numbers in the ced-3 rescue studies is very plausible. Nonetheless, there remains the possibility that for some reason in animals with 4 RIMs there may be a more global effect on the fate of cells slated to die, unrelated to the number of RIMs. I think there are two ways to test this. (1) quantify the behavior in 2- vs. 4- RIM neurons in animals also containing a marker for other undead neurons, and see if there is any correlation between 4 RIMs and survival of unrelated neurons (but preferably reasonably closely related by lineage- in case that's the issue). (2) Since the authors are able to distinguish the undead cells, can they perform laser ablations on these cells and assess whether behavior is restored to normal values?”

      We agree that this point is already very plausible. We also appreciate the reviewer’s suggestions on how to extend this conclusion.

      Regarding suggestion (1): Unfortunately, there is no reliable marker for undead neurons (although a current project in the lab is indeed to develop one). However, we note that the undead RIM sister cells adopt a RIM neuron fate in 96% of ced-3 mutants, whereas for the other undead cells investigated, neuron fate adoption ranged from 59% (ASEL) to 77% (ASER). This suggests that the undead RIM fate adoption is not strongly correlated with the fates of other undead cells.

      Regarding suggestion (2): We attempted to perform laser ablation of undead RIM neurons in ced-3 mutants, but we could not overcome the technical hurdles (despite our lab’s expertise in laser axotomy). We found that we could not reliably remove both undead RIMs without damaging the wild-type RIM that is in close proximity, especially in the quantities of animals necessary for behavioral experiments.

      As an alternative, we plan to perform more targeted experiments to manipulate cell death in the undead RIM to address the points raised by both reviewers. Our goal is to generate two strains. In one, programmed cell death is prevented specifically in the RIM neurons in wild-type animals. We hope to achieve this by either transgenic expression of a gain-of-function mutation of ced-9, or else by RIM-specific RNAi against egl-1, ced-3 or ced-4. To do this we will use the tdc-1 promoter, whose expression is confined to RIM and RIC. The second strain will allow cell death to occur only in RIM (and RIC) in animals that otherwise have no cell death. Here, we will drive wild-type ced-3 or ced-4 under the tdc-1 promoter in the corresponding mutant background.

      We note 2 caveats for both of these approaches: 1) RIC also has an undead sister; 2) Most probably, the tdc-1 promoter will not be active in time to block cell death. Caveat #2 is actually the reason why we did not do these experiments initially (instead we used the most specific promoter we could find that is expressed early in the RIM lineage, before RIM is born).

      However, we agree that if successful these experiments would complement the existing experiments, and we will build all these strains.

      Reviewer 2: “Mosaic rescue of RIM via stochastic loss of a rescue array helped demonstrate the contribution RIMu have to the locomotor phenotype. As the authors emphasise, these animals have many other undead cells (outside of the reverse network). A conditional rescue of only the RIMu would greatly improve the strength of the claims made. Would a conditional RIM egl-1 knockdown (via RNAi) be possible to selectively inhibit apoptosis in those neurons? This experiment should be considered OPTIONAL. It may be that such specific promoters do not allow for egl-1 RNAi to function at the right time to rescue death.”

      We appreciate the reviewer’s suggestion. As stated above, we are working to perform an expanded version of these exact experiments, as well as their converse. However, as the reviewer notes, it is very possible that the timing of expression will prevent these approaches from working (Caveat #2 above).

      Reviewer 2: There is a slight issue with interpretation of the data with the mosaic GLR-1::tagRFP Fig 2M which reveals the postsynaptic compartment of one RIM even though there are two present. There seems to be no obvious apposition between pre/post and they somewhat seem to be floating in space. Why is this the case? One would have imagined that the structures in Fig 2L would be tiled composites of both AIB & RIM pre and postsynaptic elements coalescing. Can the authors provide an alternative explanation for this phenotype? Nevertheless, the data in Fig 2L seem solid; that is, animals with extra undead RIM cells have additional cell-type specific synaptic terminals.

      We have selected a different micrograph that is more representative of the RIM post-synapses in ced-3 mutants. In this animal, the array labeling the post-synapses in RIM has been lost from one of the two RIM neurons, making it easier to discern that the post-synapses are apposed to the AIB pre-synaptic marker (Fig 1M).

      Reviewer 2: Clarity should be improved around the use of 'expected number' in figure 1. The description of the metric 'The 'expected number' is defined as the number of neurons of the type present in wild-type animals, plus the number of lineage-proximate undead cells.' suggests that expected (blue) regions of pie charts represent lineages with expected sum total of wt and extra undead cells. However, in reference to panel H 'The wild-type animal has two RIM neurons, and the ced-3(n717) animal has two additional RIMlike cells and is counted as contributing to the orange "more than expected" sector in panel (A)' it is said that the animals with 2 WT accompanied by each undead sister contribute to the more than expected (orange) region. These appear inconsistent. Can you qualify?

      We thank the reviewer for this point and have added a schematic to clarify the quantification of undead cell fates (Fig. 1).

      Reviewer 2: Specific observations shown in supplemental data S1I-L, despite being cited in the text, are not explained or formally referenced. The details of these panels should either be briefly explained/their inclusion qualified in the text or simply removed from the figure.

      We have added a reference to these panels in the main text: “Undead cells are even capable of producing complex morphology, such as the highly branched dendrites of the PVD neurons (Figure S1I-L).” (p. 3)

      Reviewer 2: The dual image photomicrographs could be in green/ magenta or red/cyan to make colourblind friendly.

      We have updated micrograph colors to be colorblind friendly (Fig 1K-M, S1L).

      Reviewer 2: Do the authors have data with the pRIMtagRFP egl-nucGFP? If they do it would be useful to show it.

      We have added a micrograph of egl-1::GFP and RIM labeled using NeuroPAL (Fig. S2A).

      Reviewer 1: 2. The authors speculate, if I understand correctly, that the mechanism by which reversal frequencies are decreased in 4 RIM animals may be that the reversal state is stabilized, resulting in longer reversals and consequently fewer reversal events. This is a nice model that is testable. The authors could, for example, examine the connections of RIM neurons to the AVA neuron, a main command interneuron for reversal initiation, and assess whether there are indeed more such synapses. Furthermore, the authors can assess whether the frequency of AVA firing is decreased. Of course, there are other plausible mechanisms involving connectivity of other neurons onto AVA which could explain the phenomenon. The authors may wish to add a comment regarding this in the discussion.

      We thank the reviewer for this suggestion. There are multiple postsynaptic receptors expressed in AVA for RIM neurotransmitters and the contribution of each to reversal behavior is still being debated, making it challenging to dissect the contribution of each of these to the effects on reversal behavior mediated by the undead RIM. Given this, we believe that addressing this point experimentally is beyond the scope of this paper. We have added a sentence in the discussion commenting on this as a future direction for this work: “The mechanism of the downstream circuit mediating the effects of the undead RIM could be determined through quantification of AVA postsynaptic receptors and examining reversal behavior of cell death mutants with knockouts of AVA receptors.”

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The blockade of programmed cell death (PCD) results in the generation of supernumerary neurons from the rescue of normally dying progeny within a lineage. Many of these are the product of a lineage branch that produces a single surviving neuronal daughter.

      The authors show that for many such divisions the rescued sibling expresses fate markers associated with their normally surviving sibling. The authors use the RIM neurons and their role in reversal behaviour as a model. By generating supernumerary RIM interneurons, they show that these undead neurons correlate with an increase in synaptic terminals, with their normal presynaptic partners (AIBs), and that these same supernumerary neurons have an activity pattern preceding reversal events.

      Using mosaic experiments, based around the stochastic differences in the segregation of an extrachromosomal array, the authors are able to demonstrate that a strong component to alterations in locomotor behaviour is likely due to the supernumerary undead RIMu neurons.

      To address the impact of blockade of PCD more broadly the authors assayed 4 independent PCD blocking alleles (2x ced-3, 2x ced-4) and show that the rescue of many neurons alters normal turning dynamics associated with foraging and free feeding behavioural states.

      Interestingly, subtle differences in the behaviours of ced-3 mutant animals and ced-4 mutant animals may also hint at the broader significance of these genes beyond simply controlling the on/off PCD in these lineages. One possible issue with work published in this paradigm is that by generating undead neurons, by removing caspase activity, the intervention may also impact non-apoptotic caspase function.

      The study presented does not pose any issues relating to reproducibility or associated statistical analysis.

      Major comments:

      • Mosaic rescue of RIM via stochastic loss of a rescue array helped demonstrate the contribution RIMu have to the locomotor phenotype. As the authors emphasise, these animals have many other undead cells (outside of the reverse network). A conditional rescue of only the RIMu would greatly improve the strength of the claims made. Would a conditional RIM egl-1 knockdown (via RNAi) be possible to selectively inhibit apoptosis in those neurons? This experiment should be considered OPTIONAL. It may be that such specific promoters do not allow for egl-1 RNAi to function at the right time to rescue death.
      • The changes in the RIM/AIB synaptic organisation and the correlation of observed RIM/RIMu activity coincident with reverse locomotor bouts strongly support the assertion that these two features are causally linked.
      • One interpretation is that when the additional undead RIM neuron is present the cell-type specific connectivities are recapitulated, i.e. many of the pre and post-synaptic elements of RIM/AIB look apposed (see Fig 2L).
      • There is a slight issue with interpretation of the data with the mosaic GLR-1::tagRFP Fig 2M which reveals the postsynaptic compartment of one RIM even though there are two present. There seems to be no obvious apposition between pre/post and they somewhat seem to be floating in space. Why is this the case? One would have imagined that the structures in Fig 2L would be tiled composites of both AIB & RIM pre and postsynaptic elements coalescing. Can the authors provide an alternative explanation for this phenotype? Nevertheless, the data in Fig 2L seem solid; that is, animals with extra undead RIM cells have additional cell-type specific synaptic terminals.

      Minor comments:

      • Clarity should be improved around the use of 'expected number' in figure 1. The description of the metric 'The 'expected number' is defined as the number of neurons of the type present in wild-type animals, plus the number of lineage-proximate undead cells.' suggests that expected (blue) regions of pie charts represent lineages with expected sum total of wt and extra undead cells. However, in reference to panel H 'The wild-type animal has two RIM neurons, and the ced-3(n717) animal has two additional RIMlike cells and is counted as contributing to the orange "more than expected" sector in panel (A)' it is said that the animals with 2 WT accompanied by each undead sister contribute to the more than expected (orange) region. These appear inconsistent. Can you qualify?
      • Specific observations shown in supplemental data S1I-L, despite being cited in the text, are not explained or formally referenced. The details of these panels should either be briefly explained/their inclusion qualified in the text or simply removed from the figure.
      • The dual image photomicrographs could be in green/magenta or red/cyan to make them colourblind friendly.
      • Do the authors have data with the pRIMtagRFP egl-nucGFP? If they do it would be useful to show it.

      Significance

      The work presented here complements similar studies performed on insects, illustrating that this biological motif holds true outside of that class. Showing this is a more general feature within other taxa broadens the appeal and will be of great interest to those in developmental biology, neural circuit development/function and evolutionary biology.

      Although the work does not illustrate a completely novel finding, it is rigorous and well-conceived, adding support to previous studies in the field and is an important jumping off point for future studies. The authors present compelling independent support that undead developmentally 'doomed' neurons retain the ability to differentiate, show molecular hallmarks of sibling fate, can integrate within networks and function.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This is an interesting paper describing the neural circuit and behavioral consequences of blocking programmed cell death in C. elegans with a mutation in the apoptotic caspase ced-3. The authors survey several reporters for neurons whose sister cells normally die, and observe that undead cells can express markers of the normally living sister cell. They also show that ced-3 mutants exhibit a large variety of behavioral defects. They interrogate the effects of inappropriate survival of two RIM sister neurons in animals carrying a complement of 4 instead of 2 RIMs. The authors demonstrate that the presynaptic neuron AIB makes synapses onto the undead RIM neurons and show GCaMP activity in these neurons that correlates with reversal behavior, a normal function of RIM neurons. Animals with 4 RIM neurons have a reduced number of reversal events compared to wild type animals, suggesting circuit defects. Importantly, restoring ced-3 expression only in the RIM lineage partially restores reversal behavior frequency. The authors conclude that undead neurons interfere with the normal function of the nervous system by making aberrant connections that interfere with circuit activity.

      This paper is beautifully written. The logic is crystal clear, and the experiments are appropriate and rigorously executed. The conclusions are generally appropriate. I have a couple of comments that may be useful for the authors to consider:

      1. The argument that differences in reversal behavior are likely attributable to the difference in RIM neuron numbers in the ced-3 rescue studies is very plausible. Nonetheless, there remains the possibility that for some reason in animals with 4 RIMs there may be a more global effect on the fate of cells slated to die, unrelated to the number of RIMs. I think there are two ways to test this. (1) quantify the behavior in 2- vs. 4- RIM neurons in animals also containing a marker for other undead neurons, and see if there is any correlation between 4 RIMs and survival of unrelated neurons (but preferably reasonably closely related by lineage- in case that's the issue). (2) Since the authors are able to distinguish the undead cells, can they perform laser ablations on these cells and assess whether behavior is restored to normal values?
      2. The authors speculate, if I understand correctly, that the mechanism by which reversal frequencies are decreased in 4 RIM animals may be that the reversal state is stabilized, resulting in longer reversals and consequently fewer reversal events. This is a nice model that is testable. The authors could, for example, examine the connections of RIM neurons to the AVA neuron, a main command interneuron for reversal initiation, and assess whether there are indeed more such synapses. Furthermore, the authors can assess whether the frequency of AVA firing is decreased. Of course, there are other plausible mechanisms involving connectivity of other neurons onto AVA which could explain the phenomenon. The authors may wish to add a comment regarding this in the discussion.

      Significance

      This is an interesting paper describing the neural circuit and behavioral consequences of blocking programmed cell death in C. elegans with a mutation in the apoptotic caspase ced-3. The authors survey several reporters for neurons whose sister cells normally die, and observe that undead cells can express markers of the normally living sister cell. They also show that ced-3 mutants exhibit a large variety of behavioral defects. They interrogate the effects of inappropriate survival of two RIM sister neurons in animals carrying a complement of 4 instead of 2 RIMs. The authors demonstrate that the presynaptic neuron AIB makes synapses onto the undead RIM neurons and show GCaMP activity in these neurons that correlates with reversal behavior, a normal function of RIM neurons. Animals with 4 RIM neurons have a reduced number of reversal events compared to wild type animals, suggesting circuit defects. Importantly, restoring ced-3 expression only in the RIM lineage partially restores reversal behavior frequency. The authors conclude that undead neurons interfere with the normal function of the nervous system by making aberrant connections that interfere with circuit activity.

      The paper should be of interest to researchers studying neural circuit assembly and function, programmed cell death, behavior, and evolution of behavior and the nervous system.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statement

      We thank all three reviewers for their careful and constructive evaluation of our manuscript. We are pleased that the reviewers recognised the importance of the work we describe and found the experimental approach sound.

      This manuscript reports that undesired insertion of the plasmid backbone, including vector sequences not intended to be part of the genome edit, occurs at high frequency during CRISPR/Cas9-mediated HDR in Drosophila. We document this phenomenon across multiple independent genome editing projects, using three different plasmid backbones and targeting distinct genomic loci, demonstrating that it is not an isolated or project-specific artefact. We further introduce pVID, a new donor vector incorporating a ZsGreen negative selection marker that allows straightforward identification and exclusion of lines carrying undesired insertions, providing a practical solution to avoid this genome editing issue.

      In response to the reviewers' comments, we have revised the manuscript to: (i) correct and contextualise prior descriptions of this problem, incorporating the references suggested by Reviewer 2; (ii) add a table summarising gRNA characteristics for all editing projects; (iii) expand the discussion of the underlying DNA repair mechanisms, the potential influence of Cas9 source choice, and the relevance of the findings beyond Drosophila; (iv) confirm the stability of problematic template vector insertions across multiple generations; and (v) improve figure clarity, correct typographical errors, and clarify several passages flagged by the reviewers. All responses are described in detail below.

      2. Point-by-Point Description of the Revisions

        Reviewer 1

        Major Comment 1 — DNA repair pathways underlying backbone capture • I think the authors should discuss potential DNA repair pathways (e.g., NHEJ, MMEJ) underlying plasmid backbone capture in more detail. Did you check for knockouts within your screened transformants? That could provide insight into the underlying mechanisms.

      Response: We screened the humanized TDP-43 line for tbph knockouts, since our aim was to fully knock out the Drosophila gene and insert the human ortholog. However, we did not screen any of the other lines described in the manuscript for indels caused by NHEJ, since the dsRed selection we employed would not enable us to recover lines without insertion events. We hypothesise that a difference in efficiency between the two gRNAs used causes a single homologous recombination event and insertion of the vector template. However, the underlying mechanism is still unclear, and could involve NHEJ, HDR or a combination of these mechanisms, as has previously been observed (44). We have expanded on potential mechanisms inducing HDR template vector insertion events in the discussion of the revised manuscript.

      Major Comment 2 — gRNA characteristics and design parameters • It would be important to describe gRNA characteristics and general design parameters (GC content, distance from cut to intended edit, homology arm length) and analyze whether these correlate with correct HDR vs. plasmid insertion. A table summarizing these details could help reveal potential trends.

      Response: At the reviewer's suggestion, we have added a table (Table 1) describing the characteristics of all the gRNAs to the Materials and Methods section. Unfortunately, no commonality was immediately apparent to us.

      Major Comment 3 — Single versus dual gRNA strategies • Did the authors consider exploring whether using a single gRNA reduces backbone insertion frequency compared to dual-gRNA strategies? I understand that two gRNAs are needed for your strategy, but it would be interesting to know whether these outcomes are linked to the dual-gRNA design.

      Response: As stated in the discussion, we theorize that one of the two gRNAs used in our strategies may cut more efficiently and thereby cause a single homologous recombination event and insertion of the vector template. It is possible that a strategy using only one gRNA from the outset could reduce insertion of the vector template; however, this may come at the cost of gene editing efficiency. Indeed, when Ge et al. (17) compared using one versus two gRNAs to induce HDR, they observed more reliable repair events when two gRNAs were used.

      Major Comment 4 — Stability of backbone insertions across generations • Did you evaluate whether backbone insertions are stable across generations or prone to rearrangement?

      Response: We have kept several of the lines reported in this paper stably across multiple generations, and we have added this observation to the manuscript.

      Major Comment 5 — Broader applicability in non-model organisms and therapeutic settings • A broader discussion of the potential applications of this approach in non-model insects, mammalian cells, or therapeutic settings where HDR is inefficient would be valuable.

      Response: While we only investigated this effect in the creation of CRISPR/Cas9 Drosophila melanogaster models, it is very possible that this could also affect other model organisms or cells. We encourage the use of HDR template negative selection markers in all uses of HDR-mediated CRISPR/Cas9 genome editing.

      Major Comment 6 — Cas9 promoter and expression level • The authors also mentioned using a validated Cas9 line (ref #23). What promoter drives Cas9 expression in this line? Did you consider testing different promoters? Since timing of Cas9 expression can be critical, promoter choice may have influenced the results and should be discussed.

      Response: We used the nos promoter for the expression of Cas9, as this promoter is expressed in germ cells and is known to have better efficiency than other germline promoters such as vasa (Port et al. 2014, Ref #23). However, it is conceivable that the high Cas9 concentration in this line could induce a higher rate of double-strand breaks and thus template vector insertion. We agree it would be interesting to test other Cas9 sources, though this would likely come at the cost of overall editing efficiency. As we describe, the use of pVID now allows negative selection against HDR template vector insertion even with this Cas9 source. We have expanded upon the potential use of other Cas9 sources in the revised discussion.

      Reviewer 2

      Major comments

      None

      Minor Comment 1 — Line 38: prior descriptions of backbone insertion in Drosophila Line 38: "this type of unwanted template vector insertion in the case of Drosophila genome editing has to our knowledge not been previously described." Insertion of vector sequences after CRISPR editing in Drosophila and strategies to mitigate such events have been previously described in multiple studies. The authors need to incorporate these into their manuscript. https://doi.org/10.1242/bio.20147682, https://doi.org/10.1080/19336934.2020.1832416, https://doi.org/10.1534/g3.116.032557.

      Response: We are very grateful to the reviewer for pointing out these prior observations of vector insertion events of which we were not aware. This prior work has now been fully incorporated and referenced in the revised manuscript, and we have removed this erroneous statement. We feel this manuscript validates and quantifies the extent of HDR template insertion across multiple genome editing strategies and templates plus, with pVID, provides a solution to this vexing problem.

      Minor Comment 2 — Line 79: PAM sequence sentence I have difficulties understanding the following sentence: Line 79: "At this location, on both sides of the insertion, the PAM sequence of the target region was edited to match the PAM sequence of the template donor plasmid." I assume what is meant here is that in the donor vector the PAM sequence was mutated to prevent recutting, but that means this sequence is no longer a PAM. Please rephrase for added clarity.

      Response: The PAM sequence was indeed edited in the template donor plasmid to prevent re-cutting, and we are referring to this edited version of the PAM sequence in this sentence. We edited this sentence to clarify that the PAM sequences have been edited.

      Minor Comment 3 — Figure 2: panel D arrangement In Figure 2 panel D is arranged between panels E and F.

      Response: Thank you for pointing this out. We have corrected this error.

      Minor Comment 4 — Primer positions in figures In Figure 2 it would be useful to also indicate the position of the primers used in 2d in the schematic in 2e. The same applies to Fig. 3a and 4a.

      Response: We have added the position of the primers in Figure 2. Since the same primers, which target the plasmid backbone, were used in all projects included in this manuscript, we have chosen to show their position in only one figure (Figure 2).

      Minor Comment 5 — Lines 89–90: duplicated sentence Lines 89, 90: Duplication of the same sentence.

      Response: Thank you, we have corrected this mistake.

      Minor Comment 6 — VGAT editing: consecutive editing and sgRNA placement Editing of the VGAT gene: In this case correct editing and plasmid insertions could be found on the same chromosomes. This might be caused by concatemer formation of repair intermediates (as has been described in multiple systems) or by consecutive editing events. Can you please specify whether the donor vector was designed to prevent consecutive editing? I'm also a bit confused about the locations of the sgRNA target sites according to Fig. 3a. It appears that part of the insertion (i.e. the ALFA tag) was encoded on the homology arm and not between the target sites. While such strategies have been described, they are often avoided as the efficiency of insertion decreases with increasing distance to the cut site. Was it not possible to use a sgRNA better matching the insertion cassette?

      Response: For Vgat genome editing, we followed an existing strategy that has been proven effective, reusing the same gRNAs and overall approach to replace the 9×V5 tag with a 1×ALFA tag (Certel et al. 2022, Ref #28).

      Minor Comment 7 — Line 133: mini-white marker unreliability Line 133: Please describe why the mini-white marker was unreliable.

      Response: In our first design of the pVID vector, we used mini-white as the negative selection marker. However, in a number of white-eyed lines, we could still confirm the undesired insertion of the HDR template vector. We speculate that expression of mini-white (which we confirmed was not mutated) was repressed in these lines by an unknown mechanism. Since Nyberg et al. (2020, Ref #35) also proposed using mini-white as a negative vector selection marker, we wanted to mention this problem with mini-white negative selection, though we remain unsure of the exact cause. In any case, the use of exogenous ZsGreen in pVID as described in the manuscript fully resolved the issue, allowing reliable detection of template vector insertion events.

      Minor Comment 8 — Line 161: "varying frequency" Not sure I understand the sentence in line 161: If 54% of lines had vector insertion, what does the "varying frequency" refer to?

      Response: We have edited this sentence to clarify that 54% of lines had vector insertion.

      Minor Comment 9 — pVID availability in methods Consider highlighting the availability of pVID also in the methods section that described this plasmid.

      Response: This has been added to the methods section.

      Reviewer 3 No edits suggested.

      We thank Reviewer 3 for their positive assessment of the manuscript and for confirming that no revisions are required.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript "Highly frequent undesired insertional mutagenesis during Drosophila genome editing" by Kallstig et al. revolves around Homology-Directed Repair (HDR) and the surprisingly high frequency of plasmid backbone insertions into the genome.

      In brief, the authors describe three independent experiments in which the intended homology regions were inserted together with plasmid backbone sequences into the Drosophila genome. Each experiment was designed with a slightly different setup: the first aimed to generate a humanized version of the TAR DNA-binding protein 43 (hTDP-43), while the second introduced an ALFA tag into the Vesicular GABA transporter (VGAT) gene. In the first experiment, the pCR4 vector served as the backbone, whereas the second experiment relied on the pHSG298 vector. Both experiments resulted in relatively high frequencies of incorrectly edited genomes - 18% and even 66%, respectively. The authors hypothesized that the rate of undesired events could be even higher if the targeted gene is non-essential. To test this, the third experiment focused on mutagenesis of the Glutamate Receptor IIA (GluRIIA) gene, which is homozygous viable even in protein-null mutants. Indeed, the frequency of incorrect edits was approximately 11:1 (more than 90%). These findings suggest that plasmid backbone insertion is a common and important issue in HDR-based genome editing in Drosophila.

      To address this problem, the authors designed a new vector. While the classical eye color marker (e.g., dsRED) serves for positive identification of HDR recombination, a second fluorescent marker (ZsGreen), encoded in the plasmid backbone and also expressed in the compound eye, enables clear detection of undesired plasmid backbone insertions.
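      The two-marker readout described above can be written out as a small decision rule. The following is only an illustrative sketch, not the authors' screening protocol: the names `Outcome`, `classify_line` and `insertion_fraction` are hypothetical, and the sketch assumes that each fly line is scored simply as positive or negative for each fluorescent marker.

```python
from enum import Enum


class Outcome(Enum):
    CANDIDATE_HDR = "positive marker only: candidate correct HDR"
    BACKBONE_INSERTION = "backbone marker present: template vector insertion"
    NO_INTEGRATION = "no marker detected"


def classify_line(dsred: bool, zsgreen: bool) -> Outcome:
    """Classify one fly line from its two eye-marker readouts.

    Assumption of this sketch: ZsGreen sits on the plasmid backbone, so any
    ZsGreen signal flags backbone sequence in the genome, whatever DsRed shows.
    """
    if zsgreen:
        return Outcome.BACKBONE_INSERTION
    if dsred:
        return Outcome.CANDIDATE_HDR
    return Outcome.NO_INTEGRATION


def insertion_fraction(lines):
    """Fraction of marker-positive lines that carry the backbone marker.

    `lines` is an iterable of (dsred, zsgreen) boolean pairs.
    """
    outcomes = [classify_line(dsred, zsgreen) for dsred, zsgreen in lines]
    positive = [o for o in outcomes if o is not Outcome.NO_INTEGRATION]
    if not positive:
        return 0.0
    flagged = sum(o is Outcome.BACKBONE_INSERTION for o in positive)
    return flagged / len(positive)
```

      As a check on the arithmetic quoted above, a ratio of 11 flagged lines to 1 unflagged line gives 11/12, or about 0.92, which is consistent with "more than 90%". Lines that pass this screen would still be candidates only, to be confirmed by PCR or sequencing.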

      The study is clearly written, and the plasmids are sufficiently well described in the figures. The reproducibility is somewhat limited by the use of different plasmids in combination with different target genes. Nevertheless, the number of analyzed insertions was high enough to convincingly illustrate the issue.

      Significance

      I find this manuscript to be a valuable description of an existing problem, together with a potentially efficient method for detecting undesired plasmid insertions. From an experimental perspective, I consider the comparison of three different vector backbones combined with different target genes to be rather difficult. On the other hand, as an experimental biologist, I completely understand the logic and the history of the problem-solving process. Undesired insertions were identified by different approaches (PCR and sequencing), and the authors clearly kept this issue in mind. When the problem persisted in the second experiment, and was even more pronounced in the third experiment (involving a non-lethal gene), they developed a vector that makes the screening process more efficient. Altogether this is a valuable technical study worthy of reporting.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary In this manuscript Källstig, Ruchti, McCabe and colleagues report frequent undesired editing outcomes after CRISPR gene knock-ins in Drosophila. Using Cas9 for the targeted induction of DNA double strand breaks and plasmids with long homology arms as donor molecules, they find that the whole plasmid inserts with high frequency at multiple loci. To detect such events they generate a plasmid with a dominant marker encoded on the plasmid backbone, which can be used to enrich for correct insertions by negative selection.

      Major comments

      Minor comments

      Line 38: "this type of unwanted template vector insertion in the case of Drosophila genome editing has to our knowledge not been previously described." Insertion of vector sequences after CRISPR editing in Drosophila and strategies to mitigate such events have been previously described in multiple studies: https://doi.org/10.1242/bio.20147682, https://doi.org/10.1080/19336934.2020.1832416, https://doi.org/10.1534/g3.116.032557. The authors need to incorporate these into their manuscript.

      I have difficulties understanding the following sentence: Line 79: "At this location, on both sides of the insertion, the PAM sequence of the target region was edited to match the PAM sequence of the template donor plasmid." I assume what is meant here is that in the donor vector the PAM sequence was mutated to prevent recutting, but that means this sequence is no longer a PAM. Please rephrase for added clarity.

      In Figure 2 panel D is arranged between panels E and F.

      In Figure 2 it would be useful to also indicate the position of the primers used in 2d in the schematic in 2e. The same applies to Fig. 3a and 4a.

      Lines 89, 90: Duplication of the same sentence.

      Editing of the VGAT gene: In this case correct editing and plasmid insertions could be found on the same chromosomes. This might be caused by concatemer formation of repair intermediates (as has been described in multiple systems) or by consecutive editing events. Can you please specify whether the donor vector was designed to prevent consecutive editing? I'm also a bit confused about the locations of the sgRNA target sites according to Fig. 3a. It appears that part of the insertion (i.e. the ALFA tag) was encoded on the homology arm and not between the target sites. While such strategies have been described, they are often avoided as the efficiency of insertion decreases with increasing distance to the cut site. Was it not possible to use a sgRNA better matching the insertion cassette?

      Line 133: Please describe why the mini-white marker was unreliable.

      Not sure I understand the sentence in line 161: If 54% of lines had vector insertion, what does the "varying frequency" refer to?

      Consider highlighting the availability of pVID also in the methods section that described this plasmid.

      Significance

      This manuscript describes vector backbone insertions as a frequent complication of CRISPR knock-in experiments in Drosophila and introduces a cloning vector with a selectable marker on the plasmid backbone that allows counter selection of such undesired events. The manuscript is very well written and the experiments are overall well designed.

      Insertion of vector sequences during homologous recombination (often referred to as "ends-in" recombination events) has been described on multiple occasions in a wide variety of model systems. Also in Drosophila, the system used here, such events have been described by multiple groups (see comments above). Furthermore, plasmids designed to allow to counter select for such events have also been described previously (e.g. Addgene plasmids 157991, 80801).

      In summary, this manuscript highlights once more an important complication in genome engineering experiments, but does not significantly advance the knowledge in the field beyond the existing literature and the described plasmid is largely redundant with preexisting plasmids designed for the same purpose. While this overall severely limits the significance of this work, it does provide important replication of previous work.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      CRISPR/Cas9 genome editing has improved the ability to introduce precise genetic modifications in multiple organisms such as Drosophila melanogaster. By coupling Cas9-induced double-strand breaks with homology-directed repair (HDR), researchers can replace, insert, or delete genomic sequences with high specificity.

      In this work, the authors explore significant concerns about the fidelity and outcomes of HDR-based editing. They identify a recurring issue: unintended insertions of the entire donor template vector into the genome were observed. These undesired events are observed across multiple genes, indicating that the problem is not locus- or construct-specific. These insertions can occur at high frequencies, complicating efforts to establish accurate transgenic lines. They not only mask intended edits but may also introduce unpredictable phenotypes unrelated to the desired genetic modification.

      The authors addressed the problem of frequent donor plasmid insertions during CRISPR/Cas9 HDR in Drosophila by redesigning their HDR template vectors. They incorporated a GFP marker into the plasmid backbone alongside a DsRed cassette. This design allowed them to distinguish correct HDR events, which carried only DsRed, from aberrant plasmid integrations, which carried both DsRed and GFP. By screening flies for marker expression, they could rapidly identify and exclude incorrect insertions.

      Please, see below my comments:

      • I think the authors should discuss potential DNA repair pathways (e.g., NHEJ, MMEJ) underlying plasmid backbone capture in more detail. Did you check for knockouts within your screened transformants? That could provide insight into the underlying mechanisms.
      • It would be important to describe gRNA characteristics and general design parameters (GC content, distance from cut to intended edit, homology arm length) and analyze whether these correlate with correct HDR vs. plasmid insertion. A table summarizing these details could help reveal potential trends.
      • Did the authors consider exploring whether using a single gRNA reduces backbone insertion frequency compared to dual-gRNA strategies? I understand that two gRNAs are needed for your strategy, but it would be interesting to know whether these outcomes are linked to the dual-gRNA design.
      • Did you evaluate whether backbone insertions are stable across generations or prone to rearrangement?
      • A broader discussion of the potential applications of this approach in non-model insects, mammalian cells, or therapeutic settings where HDR is inefficient would be valuable.
      • The authors also mentioned using a validated Cas9 line (ref #23). What promoter drives Cas9 expression in this line? Did you consider testing different promoters? Since timing of Cas9 expression can be critical, promoter choice may have influenced the results and should be discussed.

      Significance

      This paper will appeal primarily to researchers in the fields of functional genomics, insect genetics, and genome engineering, particularly those working with Drosophila or other model organisms where CRISPR/Cas9 is widely used. It is also of interest to scientists engaged in vector biology, agricultural pest control, and translational applications of genome editing, as the findings touch on broader issues of editing accuracy and unintended repair outcomes.

      The main advance of the study is the clear demonstration that unintended donor plasmid backbone insertions are not rare artifacts, but frequent and systematic events during CRISPR/Cas9-mediated HDR in Drosophila. By integrating a GFP marker into the plasmid backbone alongside the intended DsRed marker, the authors provide a straightforward and practical method to identify, separate, and exclude these erroneous events. This approach both highlights the hidden pitfalls of HDR-based editing and offers an effective solution, thereby improving the reliability of CRISPR applications. Beyond Drosophila, the work advances the field by underscoring the need for careful design and validation of donor constructs, with potential implications for genome editing strategies in other organisms where HDR efficiency and fidelity remain key challenges.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer 1 1. The code used for simulations is available on a public repository, but it does not directly ensure that results are reproducible. To do so would require a clear step-by-step guide referring the user to the specific pieces of code which have been used for the results and figures presented in the paper. At the moment, I could not find any such guide and the large number of scripts, executables and jupyter notebooks are not clearly linked to the paper's contents

      We agree that the code should be as accessible as possible for reproducing the results. We have updated the public repository (link given in the 'data and code availability' section of the manuscript, lines 350-352) to include the SLURM job scripts used to run the evolutionary simulations and analyses, together with an overview of which scripts and notebooks were used for creating the figures.

      2. The methods themselves involve a number of arbitrary choices. Though this is understandable given the nature of the work, one aspect in particular that would deserve better clarity is the modeling of gene network dynamics. The stochastic model (l.516 & following) involves a nesting of "Hill-like" terms (those in Eqs. (7) and (11)) which is unusual and given without justification. There should be some explanation of how this approach relates to standard approaches such as those reviewed e.g. in: Bintu et al. Current opinion in genetics & development 15.2 (2005): 116-124.

      We agree that the formulation of the developmental model requires clearer justification and contextualisation. We have added a citation to situate our implementation within existing modelling frameworks, and a brief explanation of the choice for Hill equations in the Methods section (lines 577-579).

      3. It is also unclear at the moment how exactly the GRN dynamics is used; are time-stepping algorithms used until the system reaches a stationary regime? If so, how is stationarity assessed? This needs to be explained both in the main text and in the methods. The table of parameters suggests that there was a cut-off time, but there is no explanation whatsoever about the state of the dynamics at this time.

      We have revised the main text to briefly explain how the developmental dynamics are implemented (lines 88-90) and expanded the Methods section (Gene expression and regulation in the developmental model) to describe the integration procedure in detail (lines 617-620).

      The GRN dynamics are modelled as stochastic differential equations (SDEs), which are numerically integrated for a fixed developmental duration of T_D = 140 hours, regardless of whether a stationary state is reached.

      Instead, stationarity is indirectly favored by the fitness function. Fitness is calculated as the time average of the phenotype (protein states) over a window at the end of development (Equation 23 in the Methods). As a result, GRNs that exhibit large fluctuations or ongoing transient dynamics during this evaluation window tend to have lower fitness (and in turn, reproduction rate) than GRNs that have stabilised their expression patterns. We now mention this in the model introduction of the results section (lines 98-99).

      As a result of this, we observe that the vast majority of evolved GRNs reach a stable gene expression state by the end of development (aside from small fluctuations as expected from the SDEs).
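      The fixed-duration integration and the end-of-development averaging described above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' model: it uses a single gene driven by a constant upstream signal through a standard activating Hill function, a plain Euler-Maruyama scheme with additive noise, and a made-up score function. Only the duration T_D = 140 hours comes from the text; every other name and parameter value (`hill`, `simulate`, `windowed_fitness`, `beta`, `delta`, `sigma`, the 20-hour window) is hypothetical.

```python
import math
import random


def hill(x, K=1.0, n=2.0):
    """Standard activating Hill function, bounded between 0 and 1."""
    xn = x ** n
    return xn / (K ** n + xn)


def simulate(T_D=140.0, dt=0.01, sigma=0.05, signal=1.0, seed=0):
    """Euler-Maruyama integration of a toy one-gene SDE for a fixed duration.

    dp = (beta * hill(signal) - delta * p) dt + sigma dW

    Integration always runs for T_D hours; no stationarity check is made.
    beta, delta and the initial state are illustrative placeholders.
    """
    rng = random.Random(seed)
    beta, delta = 1.0, 0.5
    n_steps = int(round(T_D / dt))
    p = [0.1]
    for _ in range(n_steps):
        x = p[-1]
        drift = beta * hill(signal) - delta * x
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Clamp at zero so the concentration stays non-negative.
        p.append(max(x + drift * dt + noise, 0.0))
    return p


def windowed_fitness(p, target=1.0, dt=0.01, window=20.0):
    """Time average of an instantaneous match score over the final window.

    The score 1 / (1 + (p - target)^2) is an assumption of this sketch.
    """
    n_win = int(round(window / dt))
    tail = p[-n_win:]
    return sum(1.0 / (1.0 + (x - target) ** 2) for x in tail) / len(tail)
```

      In this toy setting, a trajectory that has settled on the target by the start of the window scores close to 1, while one that is still fluctuating (for example `simulate(sigma=0.3)`) scores lower, which is the sense in which a time-averaged fitness can favour stationarity without testing for it explicitly.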

      4. Related to the previous point, the table of parameters (Table S1) is provided without any explanation; through what process (exploratory, literature review, trial and error...) were the values selected? Has there been any type of sensitivity analysis?

      We have clarified in the revised manuscript how each group of parameters was chosen (lines 618-620 and 744-746). In brief:

      Developmental time parameters (e.g., integration time, diffusion coefficient) were set to roughly match the developmental window of H. trionum from stage 0 to stage 2 (~150 hours; Riglet et al. 2024), during which pre-patterning is established. Molecular concentrations are expressed in arbitrary units.

      Evolutionary parameters (e.g., mutation rates) are based on previously published work using this modeling framework and were slightly adjusted during an initial exploratory phase to ensure stable evolutionary dynamics. We have added citations for this.

      We have not performed a full global sensitivity analysis across all parameters. Such an analysis would be computationally expensive given the cost of running evolutionary simulations and the difficulty of assessing parameter effects in this multi-scale system. Importantly, the core GRN parameters (expression rates, interaction topology, and interaction strengths) are evolvable rather than fixed. We have conducted sensitivity analyses at the level of individual evolved GRNs, but a systematic analysis is beyond the scope of this paper.

      Minor Comments

      1. The fitness function used in simulations specifically encodes the desired pattern, with two zones having differential gene expression. This allows the artificial selection to evolve towards such patterns, as expected, but it is not entirely clear how this relates to natural selection itself. At the very start of the paper, the authors briefly review some possible sources of selective pressure for flowers to exhibit patterns such as bullseye, among others. None of the selective factors would likely act on the plants as a direct incentive for two regions, as specified in the cost function. Instead, one may expect a more high level criterion, such as "conspicuousness" for a pollinator, for instance. This is admittedly not naturally represented as a fitness function, but the choice of this function definitely influences the outcomes of a simulation. Some further numerical experiments may allow to demonstrate that the exact cost function is not critical for the findings of the paper, but I understand they would likely be computationally costly, to the point of unfeasibility. This limitation should be mentioned at least.

      We agree that natural selection acts on higher-level criteria such as pollinator attraction or conspicuousness rather than a predefined measure like "two distinct regions." However, our goal in this study is specifically to understand how the bullseye pattern in particular is produced, motivated by comparison to Hibiscus and other angiosperms where this pattern has documented adaptive relevance. The fitness function was therefore designed to ensure this particular pattern evolved, which results in evolving between-level novelty rather than constructive novelty (as defined in Colizzi et al., Essays Biochem 2022: of interest here is the evolved dynamics of development, not the resulting pattern). In this way, the fitness function serves as a proxy for selection on floral patterning. We have clarified this rationale more explicitly in the Results section (lines 97-98).

      The choice of fitness function does influence simulation outcomes. Within the scope of selecting for a bullseye pattern, we previously ran simulations where bullseye size was fixed rather than dynamic, and boundary cell types still evolved in those cases. This suggests our findings are robust across variations of the bullseye fitness function. Of course, selecting for a more abstract ecological criterion such as "conspicuousness" rather than a distinct spatial pattern would affect outcomes more substantially. However, translating such a high-level criterion into a quantitative fitness function is a non-trivial challenge and outside the scope of this study. We have added a note on this point in the Methods section on the fitness function (lines 687-691).

      2. The number of genes used in the simulations is very small in comparison to real organisms. This is clearly justified by the complexity of the work, but one wonders if simulations could be made more efficient by using a much simplified approach for the gene network dynamics. At the time scales of interest, it seems that the use of SDEs and the numerical intricacies they require might be an unnecessary burden. Have the authors considered a much simpler approach, for instance based on Boolean models? Since the study only uses static tissues, all the GRN dynamics could be by-passed, determining steady states very quickly and using them to determine fitness. If this saved significant computational time, this would allow a more comprehensive survey of the "purely genetic" part of the model.

      While the number of genes may indeed be small compared to real organisms, our simulations should be viewed as operating on subnetworks that form part of a much larger developmental GRN. This is a common approach in modelling the evolution of developmental processes, which we now highlight in the methods section. Furthermore, we find that the functional part of the GRN (which we identify by pruning away the redundant genes and interactions) always uses only a subset of the gene types, showing that we provide sufficient degrees of freedom for the evolutionary process to find a solution. We now also make note of this in a new figure (Figure S12) where we explain the pruning algorithm.

      We agree that simplified representations of the GRN, such as Boolean models or direct steady-state mappings, could substantially reduce computational cost. However, the use of stochastic differential equations (SDEs) in the present study is deliberate. Continuous, stochastic GRN dynamics allow us to capture key features that would be difficult or impossible to represent in Boolean or purely steady-state frameworks. In particular, they enable (i) gradual spatial distributions of morphogens, which are central to pattern formation, (ii) explicit treatment of gene expression noise, and (iii) detailed analysis of the developmental dynamics.

      Finally, in response to Reviewer 2's comment 1, we show all evolved networks (Figure S3 & S4) and perform a GRN motif comparison between noisy and deterministic simulations (Figure S15) to provide more information about the genetic part of the model.

      Reviewer 2

      1. There is a major missed opportunity to analyze the evolved networks. Only one of the 30 GRNs is analyzed in figure 4. Please add further analysis of the GRNs from all the populations. Within a population after 30K generations, how much variation is there in the GRNs of individuals? How similar are the optimal fitness evolved GRNs across all 35 populations? Are there common motifs across networks? Is there always an antagonism between proximal and distal proteins somewhere in the network? A lot of previous work on GRNs has established the function of common motifs, and these should be analyzed. Please provide all 30 gene regulatory networks in the supplement.

      We have substantially expanded the analysis of evolved networks across all populations. Specifically, we now (i) provide two supplementary figures showing the final pruned GRNs from all 35 simulations (Figures S3 & S4), and (ii) quantify motif frequencies across all evolved networks and compare motif distributions between GRNs evolved with and without molecular noise (Figure S15). This new analysis is summarised in a dedicated Results paragraph where we identify regulatory asymmetries and condition-dependent differences in feedback architecture, including changes in abundance of mutual inhibition and positive autoregulation (lines 233-239).

      We find that, while the evolved maximum fitnesses are very similar across simulations (Fig. 2Ai), the networks are highly variable. Nevertheless, the motif analysis shows some trends that differ between the noise and no-noise simulations, such as a bias towards mutual inhibition between PROX and DIST in the no-noise compared to the noise simulations.
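      The kind of motif comparison described above can be sketched for two simple motifs. This is an illustration only, not the analysis behind Figure S15: it assumes each pruned network is stored as a signed interaction matrix, and the function names and the sign convention are hypothetical.

```python
def count_motifs(W):
    """Count two simple motifs in a signed interaction matrix.

    Convention assumed here: W[i][j] > 0 means gene j activates gene i,
    W[i][j] < 0 means gene j represses gene i, and 0 means no interaction.
    """
    n = len(W)
    positive_autoregulation = sum(1 for i in range(n) if W[i][i] > 0)
    mutual_inhibition = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if W[i][j] < 0 and W[j][i] < 0
    )
    return {
        "positive_autoregulation": positive_autoregulation,
        "mutual_inhibition": mutual_inhibition,
    }


def motif_frequencies(networks):
    """Fraction of networks in which each motif occurs at least once."""
    totals = {"positive_autoregulation": 0, "mutual_inhibition": 0}
    for W in networks:
        for motif, count in count_motifs(W).items():
            totals[motif] += int(count > 0)
    return {motif: hits / len(networks) for motif, hits in totals.items()}
```

      Calling `motif_frequencies` separately on the networks evolved with and without noise would give two frequency tables that can be compared motif by motif. Restricting the mutual-inhibition test to a specific gene pair, such as PROX and DIST, only requires fixing `i` and `j`.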

      As to the variation within a population: we find that at any timepoint, all individuals are descended from a common ancestor that lived on average ~600 generations back, meaning that they form a single (quasi)species. We therefore analyse a single, highly fit individual at the last timepoint.

      2. The purpose and significance of examining the evolutionary lineage is not clear. Please explain your logic. This is most important for Figure 5 where it becomes clear that the boundary cells are often formed transiently in the evolution of the GRN. If this boundary cell type does not persist, how can it help the petal generate a bullseye? What happens after the boundary cell type is lost? Has the GRN evolved into a more stable place where it no longer needs the boundary? In several instances it looks like they come and go many times. Please explain how these transient boundary cells in the evolutionary lineage can make a difference. This point also comes up in lines 113-115 "For each simulation, we traced back the ancestral lineage of the final fittest individual and sampled 12 of its ancestors at evenly spaced generational intervals, performing this analysis on each sampled ancestor." I could understand if the boundary cell type were developmentally transient, but I have a hard time seeing what its significance is since it is evolutionarily transient.

      The persistence of the boundary cell type over evolutionary time is used as a signal for its functional role in establishing the bullseye pattern. We observe that mostly two extremes occur: boundary cell types can be conserved over long evolutionary periods, or they can be highly transient. In our simulations, boundary cell types that are functionally important tend to persist, whereas the ones that are not involved in producing the bullseye pattern appear only transiently. The fact that both cases can occur suggests that boundary cell types are a "free" or easily accessible feature during the evolution of this patterning system: they can arise repeatedly without being strictly required, but may nonetheless become functionalised under certain evolutionary trajectories (see also our discussion of the Mimulus leaf stripe). We have added more explanation on the logic of examining the evolutionary lineage at the beginning of the results section related to Figure 5 (lines 205-209 and caption of Figure 5).

      To further clarify this point, we have added a supplementary figure (Figure S16) focusing on a deterministic simulation with a highly evolutionarily transient boundary cell type. By identifying the GRN mutations associated with the (re-)appearance of the boundary, we show that the patterning mechanism producing the bullseye slowly mutates while preserving the bullseye, while the mutational neighbourhood of the GRN contains diverse mutations that generate boundary cell types. In this case, boundary cells arise independently through distinct mutations rather than repeated rediscovery of a single change, explaining both their frequent appearance and their lack of long-term evolutionary stability.
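      The lineage analysis discussed in this response (tracing the ancestors of the final fittest individual and sampling 12 of them at evenly spaced intervals) can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes each individual has a single recorded parent, and the names `trace_lineage`, `sample_evenly` and `persistence`, as well as the `parent_of` mapping, are hypothetical.

```python
def trace_lineage(parent_of, final_id):
    """Follow parent pointers from the final individual back to the founder.

    `parent_of` maps an individual's id to its parent's id, or to None for
    the founder. Returns the lineage ordered from oldest to newest.
    """
    lineage = [final_id]
    while parent_of.get(lineage[-1]) is not None:
        lineage.append(parent_of[lineage[-1]])
    lineage.reverse()
    return lineage


def sample_evenly(lineage, k=12):
    """Pick k ancestors at approximately evenly spaced positions."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if k >= len(lineage):
        return list(lineage)
    step = (len(lineage) - 1) / (k - 1)
    return [lineage[round(i * step)] for i in range(k)]


def persistence(has_boundary):
    """Fraction of sampled ancestors in which a boundary cell type is present."""
    return sum(has_boundary) / len(has_boundary)
```

      Under this reading, a `persistence` value near 1 would correspond to a boundary cell type conserved along the lineage, while a low value would correspond to the transient case. How the presence of a boundary cell type is scored for each ancestor is left open here.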

      1. It is worth saying more about how the 9 lineages without a boundary cell type manage to make a robust bull's eye pattern because this is also interesting.

      This is indeed a good idea. We have carried out an analysis similar to that in Figure 4 for a GRN from a lineage without a boundary cell type and included it as a supplementary figure (Figure S11).

      4. How were 12 proteins chosen for the network, as opposed to 6 or 20 for instance? In the network pruning, it seems like fewer proteins are required. How many proteins are required to produce a bulls eye pattern?

      This choice is indeed somewhat arbitrary. We settled on 12 gene types to provide enough degrees of freedom while also keeping the evolutionary simulations computationally feasible. In practice, we find that pruned GRNs typically only use a subset of the 12 gene types, suggesting that the system has enough degrees of freedom to produce the bullseye pattern. For example, the smallest networks that evolved (after pruning) have 5 genes in the deterministic model and 7 in the noisy model.

      To clarify this choice, we have now added a brief mention of these considerations to the relevant methods section (lines 641-643).

      Minor Comments

      1. The title needs to be changed to include computational modeling or simulation because otherwise the current version of the title implies that these boundary cell types are found in plant species evolution.

      We agree and have renamed the paper "Computational Model of Flower Pattern Evolution Predicts Spontaneous Emergence of Boundary Cell Types Across Petal Epidermis."

      1. Line 103 - 106 "We found that over a third of all simulations evolved a bullseye size of approximately 50% of the petal's central height (Figure 2A.ii). This indicates a tendency for simulations to converge toward these proportions, possibly due to the interaction between the patterning signal distribution and the tissue geometry." The phrasing here is confusing. Which proportions does "these proportions" refer to? Presumably, 50% from the preceding sentence. But the second proportion is not clear from the text. Maybe it is the peak at approximately 65% seen in the graph. Please clarify in the text.

      The 50% figure refers to the bin with the highest peak in Figure 2A.ii, reflecting a bias toward certain bullseye proportions rather than a uniform distribution across all possible sizes. We have rewritten the sentence to clarify this (lines 109-112): "This indicates a tendency for simulations to converge towards certain proportions more than others, possibly due to the interaction between the patterning signal distribution and the tissue geometry."

      1. Line 118 "To further explore cell identity in the third cluster, we analysed the gene expression profiles of the three identified cell types." It is not clear what the third cluster refers to. The previous sentence mentions 9 lineages without boundary cell types. So a transition back to lineages with boundary cell types would help here.

      We agree and have improved the phrasing here by referencing back to the lineages with boundary cell types (lines 124-125):

      "Focusing on the majority of lineages in which this third boundary cell type arose, we analysed the gene expression profiles of the three identified cell types."

      1. Figures 3C-D, it would help to label these volcano plots proximal versus boundary and distal versus boundary. Although they do fit your color scheme and legend for the color scheme, it is important to specify it explicitly.

      We have added labels inside the volcano plots in Figure 3C-D to clarify proximal versus boundary and distal versus boundary.

      1. On Figure 4A it would help to label which gene is Prox and Dist. I assume they are the purple and yellow genes, but it would be easier if they were labeled.

      We have added labels in Figure 4A here to clarify.

      6. Line 185-186 "Gene 5 delays and spatially restricts the expression of gene 10, ensuring the symmetric development of the pattern." This statement needs to be supported by showing a time series (simulation movie or timepoints) revealing this timing aspect of Gene 5.

      We agree with the reviewer that this is currently lacking a clear visualisation and thank them for pointing this out. To address this, we have updated Figure 4 to include the temporal expression of genes 5 and 10 in the wild type and mutant for cells along the left-right axis in the proximal bullseye region. We have also included the following extra details in the results text (lines 194-199):

      Decreasing the spatial range of gene 5's regulatory influence by turning it into a TF resulted in a delay in its inhibition of gene 10 and reduced its self-activation range, explaining the smaller bullseye. In this mutant, expression of gene 5 is progressively delayed in cells located further from the origin of the patterning signal, and is ultimately absent on the right side of the proximal region of the bullseye (Figure 4C.ii). As a consequence, gene 10 becomes expressed in the right region, resulting in DIST identity instead of PROX, and leading to an asymmetric bullseye pattern.

      Reviewer 3

      1. How are the cell types defined from the simulations? Are they attractors of the dynamics of the corresponding proteins? And how are they computationally defined? Please provide more details about how the HDBSCAN was used. In Figure S5, simulations #6 and #8 appear to have a 4th cell type (coloured in green), but the authors do not mention this result in the text. If cell types are defined by gene expression profiles, then the number of cell types will be dependent on the kind of clustering performed. Clarifying the definition of cell types will help resolve this issue.

      We thank the reviewer for raising this point and agree that the definition of cell types in our simulation results requires clearer explanation.

      The concept of cell type / cell identity is a complex theme which is still yielding interesting debate and discussion in the literature (see for instance Rafelski and Theriot, 2024). In our simulations, cell types are defined based on gene expression profiles rather than being explicitly identified as mathematical attractors of the underlying dynamical system. Operationally, we perform dimensionality reduction (UMAP) followed by clustering (using HDBSCAN) on the gene expression profiles across cells. This clustering serves as an initial, automated indication of distinct expression states across the petal.

      We recognise that the clustering results depend on the chosen dimensionality reduction and clustering method, as well as their parameterisation. For example, clustering applied to a smooth gradient (e.g., arising from diffusion alone) can artificially partition continuous variation into multiple discrete groups. For this reason, we do not rely solely on the clustering output: we use it as a first-pass classification and then verify the resulting groups by manually inspecting their gene expression profiles across the petal. This additional step ensures that identified "cell types" correspond to distinct expression states rather than arbitrary thresholds along a gradient. We have clarified both the computational procedure (dimensionality reduction + HDBSCAN clustering + manual verification) and the conceptual definition of cell types in the Methods section (lines 748-753).

      Regarding Figure S5, the fourth cell type (shown in green) in simulations #6 and #8 is indeed a distinct gene expression profile. We do occasionally observe the evolution of additional cell types: this second boundary cell type is one example, and a salt-and-pepper cell type (not shown) is another. However, these cell types are usually very transient and infrequent.
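The gradient caveat above can be made concrete with a small check. The sketch below is only an illustrative stand-in for the manual inspection step, not the procedure used in the manuscript: the heuristic `largest_gap_fraction` and both toy expression profiles are hypothetical. It asks whether a single gene's expression values across cells contain a clear gap, as expected for distinct states, or vary continuously, as expected for a diffusion-like gradient.

```python
def largest_gap_fraction(values):
    """Largest jump between sorted neighbouring values, as a fraction of the range.

    Close to 0 for a smooth gradient; close to 1 for two well-separated states.
    """
    ordered = sorted(values)
    span = ordered[-1] - ordered[0]
    if span == 0:
        return 0.0
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return max(gaps) / span


n_cells = 50
# Hypothetical smooth gradient: expression falls linearly across the tissue.
gradient = [1.0 - i / (n_cells - 1) for i in range(n_cells)]
# Hypothetical switch-like profile: two distinct expression states.
two_states = [0.05] * 25 + [0.95] * 25

gradient_score = largest_gap_fraction(gradient)
two_state_score = largest_gap_fraction(two_states)
```

A clustering algorithm may split either profile into groups; only the second profile has a gap that supports calling those groups distinct states.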

      Rafelski, S.M. and Theriot, J.A., 2024. Establishing a conceptual framework for holistic cell states and state transitions. Cell, 187(11), pp.2633-2651.

      2. In relation to the previous question, are the phenotypes used in the evolutionary simulations steady states of the underlying dynamics?

      As clarified in response to Reviewer 1's comment 3, we do not explicitly require or enforce that phenotypes correspond to steady states of the underlying GRN dynamics. The developmental dynamics are always simulated for a fixed duration, and the fitness of a GRN is computed from the gene expression pattern time-averaged over a window at the end of this period (lines 88-90 and Methods, lines 617-620).

      Because fitness is computed from this late-stage average, selection favors GRNs that produce consistent and stable expression patterns during that window. Networks that remain in strong transient or oscillatory regimes during this phase are typically penalised through reduced fitness.

      Therefore, while steady states are not imposed as a constraint, selection strongly favors solutions that are effectively stationary by the end of development. Indeed, inspection of the evolved GRNs shows that they converge to stable expression states.
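One way to see why a late-window average disfavours oscillations is the toy comparison below. It is a sketch under stated assumptions rather than the manuscript's fitness function: the squared-deviation `mismatch`, the target level, the window length, and both trajectories are hypothetical choices for illustration.

```python
import math


def late_window_average(trajectory, window):
    """Mean of the last `window` samples of one gene's expression in one cell."""
    tail = trajectory[-window:]
    return sum(tail) / len(tail)


def mismatch(trajectory, target, window):
    """Squared deviation of the late-window average from the target level."""
    return (late_window_average(trajectory, window) - target) ** 2


steps = 200
# Expression that relaxes towards 1 and stays there.
settled = [1.0 - math.exp(-t / 10.0) for t in range(steps)]
# Expression that keeps swinging between 0 and 1 with period 20.
oscillating = [0.5 + 0.5 * math.sin(2 * math.pi * t / 20.0) for t in range(steps)]

target, window = 1.0, 40  # the window spans two full oscillation periods
settled_error = mismatch(settled, target, window)
oscillating_error = mismatch(oscillating, target, window)
```

The oscillating trajectory averages to an intermediate level and so cannot match a target that requires sustained high (or low) expression, whereas the settled trajectory can.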

      1. In Figure 3A it seems there are probably two cell types in the boundary region, is that right? Or are the elongated purple and elongated white cells basically the same cell type? Please clarify. If there are two, why did the authors choose to do the transcriptome analysis of the boundary region as one region, and not two subregions, to capture the two cell types?

      Correct, there are two different boundary cell types in the mature stage 5 petal: flat, elongated purple cells (lower boundary), and flat, elongated cream cells (upper boundary). However, the transcriptome data come from an earlier stage (stage 2), where the boundary cells have not yet developed their characteristic shape and texture and the petal only comprises visibly pigmented (proximal) and non-pigmented (distal) cells. The morphological differences that distinguish the two boundary cell types at stage 5 are not yet apparent, hence we can only treat the boundary as one region at this stage, defined as the transition zone between pigmented and unpigmented cells.

      We have made this distinction clearer in the figure caption of the Stage 2 petal (Figure 3B).

      1. I appreciate the explanation of the GRN pruning in the methods, but could the authors illustrate the network pruning process with an example and show that it works in this example?

      We have added a supplementary figure (Figure S12) depicting the pruning process for a GRN which keeps its boundary cell type during pruning and one for a GRN which loses its boundary cell type after pruning.

      1. From the methodological perspective, I suggest further clarifying what is new from this study and what is not. For instance, is the GRN pruning idea new or has it been done before? The authors could consider reducing the formalities in the methods of the main text when they are not needed or when they are not new, to facilitate the readability of what is really important and novel in this work, and what is not. E.g., it is not really needed to mathematically define a Voronoi tessellation in the main methods section; this could be simplified or moved to a supplementary methods section.

      We agree that the distinction between methodological novelty and established components of the framework should be made clearer. We have therefore streamlined the description of non-novel methods and added appropriate citations to prior work where relevant, for example in the section on pruning.

      1. I believe the diffusion term used in Eqs. 14 and 17 does not conserve the total number of protein molecules; could the authors verify that? An example of a correct passive transport term for cell i of protein concentration p_i would be the sum of (p_j-p_i) for all j-cell neighbours, normalized by the area of cell i, or the formulation by Sukumar and Bolander (2003). This is especially important when noise is added, as the non-conservation of the number of proteins can lead to unwanted instabilities. Likely, these effects do not invalidate the results of the paper, but the authors should clarify the reason for their choice or double-check the conclusions using a correct, mass-conserved diffusion term.

      Thank you for pointing this out; this is indeed an error in our mathematical description. We double-checked our implementation and confirmed that it correctly normalises by the area of cell i. A unit test for mass conservation (https://gitlab.developers.cam.ac.uk/slcu/teamrv/evo-framework/-/blob/paper-2024-stoch-sims/tests/petal_test.cc?ref_type=tags#L66) also confirms that the implementation is correct, so the error is confined to the mathematical description in the paper. We have updated the equations to reflect the implementation.
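The kind of area-normalised exchange term the reviewer describes, together with a conservation check, can be sketched as below. This is not the framework's C++ implementation or its unit test: the function names, the symmetric coupling weights `w_ij`, the three-cell geometry, and the explicit Euler step are all assumptions made for illustration.

```python
def diffusion_rates(conc, area, neighbours, coupling):
    """dp_i/dt = (1 / A_i) * sum_j w_ij * (p_j - p_i), with symmetric w_ij.

    conc[i]: concentration p_i; area[i]: cell area A_i;
    neighbours[i]: indices of cells adjacent to cell i;
    coupling[(i, j)] with i < j: symmetric exchange weight w_ij (assumed).
    """
    rates = []
    for i, p_i in enumerate(conc):
        net_flux = 0.0
        for j in neighbours[i]:
            w = coupling[(min(i, j), max(i, j))]
            net_flux += w * (conc[j] - p_i)
        rates.append(net_flux / area[i])
    return rates


def total_molecules(conc, area):
    """Total molecule count: concentration times area, summed over cells."""
    return sum(p * a for p, a in zip(conc, area))


# Hypothetical tissue: three cells in a row with unequal areas.
conc = [1.0, 0.2, 0.0]
area = [1.0, 2.0, 0.5]
neighbours = {0: [1], 1: [0, 2], 2: [1]}
coupling = {(0, 1): 0.3, (1, 2): 0.7}

dt = 0.01
before = total_molecules(conc, area)
for _ in range(1000):
    rates = diffusion_rates(conc, area, neighbours, coupling)
    conc = [p + dt * r for p, r in zip(conc, rates)]
after = total_molecules(conc, area)
```

Because each pairwise flux `w_ij * (p_j - p_i)` enters cell i and leaves cell j with equal magnitude, the area-weighted sum of concentrations stays constant up to floating-point rounding, even when cell areas differ.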

      1. It is important to facilitate the reproducibility of the results whenever possible, especially given that the computational framework used in this work has great value. I truly appreciate that the authors uploaded the code to a Gitlab. Please add further information in the readme file to facilitate reproducing the results, beyond the information regarding the code installation, whenever possible.

      We thank the reviewer for emphasising the importance of reproducibility. As noted in our response to Reviewer 1's comment 1, we have improved the structure and documentation of the public repository to facilitate reproduction of the results, including the SLURM scripts used for the evolutionary simulations and documenting code used for analysis and creating figures.

      Minor comments

      1. What is the reasoning behind the choice of the number of protein species? Why 12? Would the same results hold with a smaller number of proteins? As I imagine that the more species one considers, the more chances one has to get the desired phenotypes (or any desired phenotype for that matter). I could imagine that with 12 or more proteins, one could get more than 3 cell types (as defined by the clustering of their expression profiles). Is there something inherent in the creation of a boundary that leads to only 1 additional cell type and not more? Further simulations would be ideal to address this point, but otherwise, please comment on that if possible.

      As noted in our response to Reviewer 2's comment 4, the choice of 12 protein species is to some extent arbitrary. We selected this number as a compromise between providing sufficient degrees of freedom and maintaining computational feasibility of the evolutionary simulations. In a recently published manuscript from our team (van der Jagt et al., 2026), we tested the impact of reducing the number of genes and showed that important evolutionary dynamics are by and large the same.

      Regarding the possibility of obtaining more than three cell types: while rare, we do observe the emergence of additional cell types in simulations #6 and #8 in Figure S9. A larger number of proteins could in principle support more combinations of expression patterns, but the number of stable cell types that emerge is strongly determined by the fitness function and by the spatial structure of the task (i.e., generating two pre-specified domains). That is, the emergence of a single additional boundary cell type is driven primarily by the developmental and selective constraints, rather than being directly limited by the number of proteins in our simulations.

      van der Jagt, Pjotr L., Steven Oud, and Renske MA Vroomans. "System drift in the evolution of plant meristem development." PLOS Genetics 22.4 (2026): e1012089.

      2. What is the fundamental difference between Gene profiles I and II in generating cell types? If a cell type is defined by the specific expression of certain genes, then are not Gene Profiles I and II just different sides of the same coin? For instance, Gene profile I is characterized by the expression of a single gene at the boundary. Why do the simulations not yield patterns where 2 genes are expressed in the boundary? Or 3? Or is there a fundamental difference in how these are generated, like the boundary being a stripe of a Turing pattern, or something similar? This also links with the work of Ding et al. and Lu et al. (which the authors mention in the introduction), where they propose that self-organized (Turing) patterns can explain anthocyanin patterning in petals. Could the authors clarify these points and maybe contextualize these results with previous works on petal patterns?

      The fundamental difference between the two gene profiles lies in how the boundary cell type is generated. In gene profile II, genes expressed in the boundary are also expressed in the proximal region, but some genes expressed proximally are not present in the boundary. The boundary cell type therefore emerges as the intersection of two differently-sized proximal bullseyes (Fig. 2B.ii). In gene profile I, by contrast, genes are more expressed in the boundary than anywhere else, producing a central striped expression pattern. While gene profile I can arise from profile II (Fig. S10), we also find cases where mechanism I appears independently, without mechanism II being present (Fig. S9; Simulation #25). This shows the two mechanisms are genuinely distinct, and we therefore treat them separately.

      Profile I includes infrequent cases where several genes are preferentially expressed at the boundary (see for example simulation #23 in Figure S9). We are not sure why we rarely observe two or more genes uniquely expressed in the boundary; we suspect this may relate to the limited number of distinct gene types available in our model, which constrains how many genes can play a flexible, boundary-specific role.

      Regarding the link to Turing patterns and the work of Ding et al. and Lu et al.: our model addresses the pre-patterning mechanism upstream of anthocyanin patterning, which subdivides the petal into distinct spatial regions. Based on evidence from Hibiscus, this pre-patterning is thought to be initiated by an asymmetric signal. The problem we investigate is therefore how an existing asymmetric signal is converted into a bullseye pattern, which is fundamentally different from Turing-type symmetry breaking from a uniform state. Our work thus complements Ding et al. and Lu et al. by addressing the upstream question of how the spatial regions that constrain these self-organised patterns to specific petal domains are first established. We have added a discussion of this connection in the Discussion section (lines 301-306).

      1. In relation to the previous point regarding the mechanisms underlying boundary formation, the authors could consider whether the theoretical works by the J. Sharpe lab on stripe formation might be relevant to cite (e.g., Cotterell and Sharpe 2010 or Jimenez et al 2015)

      We agree that they are relevant and have added a section about theoretical work on stripe formation as part of the discussion on novel phenotypes (lines 305-310).

      1. If possible, it would be ideal to have at least one video/animation of both the dynamics of each phenotype and the evolution of the phenotypes as their fitness increases, to see the evolutionary trajectories and test whether similar phenotypes can be achieved through different trajectories.

      We thank the reviewer for the suggestion, since the temporal dynamics can indeed be informative. We have added two supplementary videos (Video S1 & S2) illustrating the developmental dynamics of two GRNs: one that generates a boundary cell type via gene profile I, and one via gene profile II. These videos provide a clearer view of the developmental model's dynamics, and how boundary cell types emerge dynamically during development. References to these videos have been added to the main text immediately after introducing the two gene profiles.

      In addition, we have added two supplementary figures containing evolutionary trajectories: one tracing an individual's evolutionary trajectory including detailed changes in fitness and gene expression over time (Figure S8), and one showing the evolution of PROX and DIST expression during the early adaptive phase across the first 10 simulations (Figure S6).

      1. In the Discussion, I believe that the emergence of the novel cell type would benefit from stronger contextualization within known evo-devo frameworks. In particular, the authors describe that a new cell type emerges as a byproduct of the selection of a higher-order developmental process (the bullseye pattern with a clearly defined boundary) rather than through direct selection of the cell type itself. I am confident the authors know these phenomena have been discussed under the term spandrels (Gould & Lewontin, 1979), and have been the subject of extensive study and debate. While identifying traits as spandrels is complicated (largely because in practice we lack reliable frameworks to distinguish them from actual adaptations), the work presented here provides a plausible mechanism of how such features could arise. To me, this fact alone is interesting, as not many works (as far as I know) have addressed this problem explicitly. Maybe the authors want to emphasize this fact as a novelty of their approach. To be clear, I am not suggesting that the authors should adopt a specific terminology; rather, I believe that explicitly invoking the concept of spandrel would resonate with readers familiar with the foundations of evo-devo and would strengthen the main message of the paper.

      We thank the reviewer for this great suggestion. We have added a reference to Gould & Lewontin's seminal paper in our discussion, placing our findings in the context of spandrels (lines 320-323).

        1. Some additional considerations related to figures

      Please change colours in the figures to be colour-blind friendly whenever possible. The stripes in the striped purple cell shown in Fig. 3A are not seen unless one zooms in on it; would it be possible to represent this differently? In Fig. 5 Aii and Bii, it would be easier for the reader to connect with the statements in the main text if the x-axis is x 1000 or x100 instead of x500. Perhaps clarify panel captions of Fig. panels 3C and 3D. Probably I am missing something basic, but I was also wondering how their numbers are connected to the numbers in the panel of Fig. 3F. Why does Fig. 3F have three subpanels? Is it because of different expression levels? Please clarify.

      We thank the reviewer for bringing this up. On revisiting our figures, we noticed some colours that are hard to distinguish for people with the common red-green colour blindness (deuteranopia). We have improved this by shifting the reds closer to magenta, making the figures more accessible. We increased the size of the cartoon cell in Figure 3A and increased the contrast of the colours used to indicate the stripes. We have changed the x-axis in Fig. 5 Aii and Bii to read x1000 to improve clarity. We have added the following text to the caption of Fig 3E, page 6, to clarify the numbers: The number in the intersection indicates genes enriched in the boundary compared to both proximal and distal regions.

      The numbers within each non-overlapping portion of the circles indicate genes enriched in the boundary relative to only one region (proximal or distal), minus those shared in the intersection.

      Yes indeed, they represent different orders of magnitude in expression (high, medium, and low, respectively). We have clarified this in the caption of Figure 3F.

      1. Could the authors clarify the choice of using the Stratonovich approach in the stochastic simulations?

      We decided on the Stratonovich interpretation, as it is the most natural one when comparing with the deterministic model, where we "turned off" the noise. With the Stratonovich interpretation, we obtain a deterministic system by simply dropping the noise terms. Had we chosen the Itô interpretation, the same approach would require changing the dynamics of the deterministic system by including a noise-induced bias in the drift term.
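For readers less familiar with the two conventions, the standard textbook relation between them is written out below for a single scalar variable. The symbols f (drift) and g (state-dependent noise amplitude) are illustrative and are not taken from the manuscript's equations, which may involve several coupled variables.

```latex
% Stratonovich form (illustrative scalar case; f and g are assumed symbols)
\mathrm{d}X_t = f(X_t)\,\mathrm{d}t + g(X_t)\circ\mathrm{d}W_t

% Equivalent Ito form: the drift acquires a noise-induced term
\mathrm{d}X_t = \Big[f(X_t) + \tfrac{1}{2}\,g(X_t)\,g'(X_t)\Big]\mathrm{d}t + g(X_t)\,\mathrm{d}W_t

% Dropping the noise term of the Stratonovich form leaves \dot{X} = f(X)
```

The extra term vanishes when g does not depend on the state (additive noise), in which case the two interpretations coincide.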

      1. Note equations are referred to in the text as Eq. S (...) whereas they are not supplementary equations

      Thanks for pointing this out; we have fixed this in the revised manuscript.

      1. The code is very large (more than 1GB), and I believe much of the space is used by Voronoi tessellations. If the authors have the time and have the scripts generating the Voronoi tessellations, the authors could add them to the repository and ensure that these tessellations are generated during the simulations whenever needed (but I am aware that code organization takes time). I would recommend having the code also in a repository with a DOI (e.g., Zenodo or OSF).

      We have significantly reduced the repository size by removing some Voronoi tessellations that are not used in this work, and have created a DOI for the code (line 352).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Oud et al. explores the evolution of a developmental mechanism generating bullseye patterns in petals using evolutionary simulations of gene regulatory networks and transcriptomics data. The authors provide a plausible mechanism of how a novel cell type can emerge as a byproduct of selecting for a higher-order process, in this case the establishment of a bullseye pattern with two clearly delineated regions. Moreover, the authors show that the emergence of the new cell type persists longer in their evolutionary simulations when the system is noisy, suggesting a functional role of the cell type in buffering developmental variability. The approach is very impressive, bridging in silico-generated GRNs that model a patterning process and evolve over generations, and in turn, combining them with transcriptome analysis experiments. However, precisely due to the complexity of the work done, I would like the authors to clarify and/or address key elements of the methodology, especially those related to the assumptions regarding the modelling approach and their implications for the validity of the results, as well as to the analysis.

      Major comments:

      1. There are some aspects to clarify; some are mentioned here, but others are mentioned in minor points.

      1.1. How are the cell types defined from the simulations? Are they attractors of the dynamics of the corresponding proteins? And how are they computationally defined? Please provide more details about how the HDBSCAN was used. In Figure S5, simulations #6 and #8 appear to have a 4th cell type (coloured in green), but the authors do not mention this result in the text. If cell types are defined by gene expression profiles, then the number of cell types will be dependent on the kind of clustering performed. Clarifying the definition of cell types will help resolve this issue.

      1.2. In relation to the previous question, are the phenotypes used in the evolutionary simulations steady states of the underlying dynamics?

      1.3. In Figure 3A it seems there are probably two cell types in the boundary region, is that right? Or are the elongated purple and elongated white cells basically the same cell type? Please clarify. If there are two, why did the authors choose to do the transcriptome analysis of the boundary region as one region, and not two subregions, to capture the two cell types?

      1.4. I appreciate the explanation of the GRN pruning in the methods, but could the authors illustrate the network pruning process with an example and show that it works in this example?

      1.5. From the methodological perspective, I suggest further clarifying what is new from this study and what is not. For instance, is the GRN pruning idea new or has it been done before? The authors could consider reducing the formalities in the methods of the main text when they are not needed or when they are not new, to facilitate the readability of what is really important and novel in this work, and what is not. E.g., it is not really needed to mathematically define a Voronoi tessellation in the main methods section; this could be simplified or moved to a supplementary methods section.

      2. I believe the diffusion term used in Eqs. 14 and 17 does not conserve the total number of protein molecules; could the authors verify that? An example of a correct passive transport term for cell i of protein concentration p_i would be the sum of (p_j-p_i) for all j-cell neighbours, normalized by the area of cell i, or the formulation by Sukumar and Bolander (2003). This is especially important when noise is added, as the non-conservation of the number of proteins can lead to unwanted instabilities. Likely, these effects do not invalidate the results of the paper, but the authors should clarify the reason for their choice or double-check the conclusions using a correct, mass-conserved diffusion term.

      3. It is important to facilitate the reproducibility of the results whenever possible, especially given that the computational framework used in this work has great value. I truly appreciate that the authors uploaded the code to a Gitlab. Please add further information in the readme file to facilitate reproducing the results, beyond the information regarding the code installation, whenever possible.

      Minor comments:

      1. What is the reasoning behind the choice of the number of protein species? Why 12? Would the same results hold with a smaller number of proteins? As I imagine that the more species one considers, the more chances one has to get the desired phenotypes (or any desired phenotype for that matter). I could imagine that with 12 or more proteins, one could get more than 3 cell types (as defined by the clustering of their expression profiles). Is there something inherent in the creation of a boundary that leads to only 1 additional cell type and not more? Further simulations would be ideal to address this point, but otherwise, please comment on that if possible.
      2. What is the fundamental difference between Gene profiles I and II in generating cell types? If a cell type is defined by the specific expression of certain genes, then are not Gene Profiles I and II just different sides of the same coin? For instance, Gene profile I is characterized by the expression of a single gene at the boundary. Why do the simulations not yield patterns where 2 genes are expressed in the boundary? Or 3? Or is there a fundamental difference in how these are generated, like the boundary being a stripe of a Turing pattern, or something similar? This also links with the work of Ding et al. and Lu et al. (which the authors mention in the introduction), where they propose that self-organized (Turing) patterns can explain anthocyanin patterning in petals. Could the authors clarify these points and maybe contextualize these results with previous works on petal patterns?
      3. In relation to the previous point regarding the mechanisms underlying boundary formation, the authors could consider whether the theoretical works by the J. Sharpe lab on stripe formation might be relevant to cite (e.g., Cotterell and Sharpe 2010 or Jimenez et al 2015)
      4. If possible, it would be ideal to have at least one video/animation of both the dynamics of each phenotype and the evolution of the phenotypes as their fitness increases, to see the evolutionary trajectories and test whether similar phenotypes can be achieved through different trajectories.
      5. In the Discussion, I believe that the emergence of the novel cell type would benefit from stronger contextualization within known evo-devo frameworks. In particular, the authors describe that a new cell type emerges as a byproduct of the selection of a higher-order developmental process (the bullseye pattern with a clearly defined boundary) rather than through direct selection of the cell type itself. I am confident the authors know these phenomena have been discussed under the term spandrels (Gould & Lewontin, 1979), and have been the subject of extensive study and debate. While identifying traits as spandrels is complicated (largely because in practice we lack reliable frameworks to distinguish them from actual adaptations), the work presented here provides a plausible mechanism of how such features could arise. To me, this fact alone is interesting, as not many works (as far as I know) have addressed this problem explicitly. Maybe the authors want to emphasize this fact as a novelty of their approach. To be clear, I am not suggesting that the authors should adopt a specific terminology; rather, I believe that explicitly invoking the concept of spandrel would resonate with readers familiar with the foundations of evo-devo and would strengthen the main message of the paper.
      6. Some additional considerations related to figures:

      9.1. Please change colours in the figures to be colour-blind friendly whenever possible.

      9.2. The stripes in the striped purple cell shown in Fig. 3A are not seen unless one zooms in on it; would it be possible to represent this differently?

      9.3. In Fig. 5 Aii and Bii, it would be easier for the reader to connect with the statements in the main text if the x-axis is ×1000 or ×100 instead of ×500.

      9.4. Perhaps clarify the captions of Fig. 3C and 3D. Probably I am missing something basic, but I was also wondering how their numbers are connected to the numbers in Fig. 3F.

      9.5. Why does Fig. 3F have three subpanels? Is it because of different expression levels? Please clarify.

      10. Could the authors clarify the choice of using the Stratonovich approach in the stochastic simulations?

      11. Note equations are referred to in the text as Eq. S (...) whereas they are not supplementary equations.

      12. The code is very large (more than 1GB), and I believe much of the space is used by Voronoi tessellations. If the authors have the time and have the scripts generating the Voronoi tessellations, the authors could add them to the repository and ensure that these tessellations are generated during the simulations whenever needed (but I am aware that code organization takes time). I would recommend having the code also in a repository with a DOI (e.g., Zenodo or OSF).
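
As background to the question about the Stratonovich approach, the standard relation between the two stochastic calculi for a scalar equation can be written as follows. This is a generic textbook identity with placeholder symbols (drift a, noise amplitude b, Wiener process W), not the authors' model equations:

```latex
% Stratonovich form (the circle denotes the Stratonovich product):
\mathrm{d}X_t = a(X_t)\,\mathrm{d}t + b(X_t)\circ\mathrm{d}W_t
% Equivalent It\^o form, with the noise-induced drift correction:
\mathrm{d}X_t = \Bigl[a(X_t) + \tfrac{1}{2}\,b(X_t)\,b'(X_t)\Bigr]\mathrm{d}t + b(X_t)\,\mathrm{d}W_t
```

The two interpretations coincide when the noise amplitude b does not depend on the state, so the choice matters only if the noise in the gene expression dynamics is multiplicative.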

      Referee cross-commenting

      The comments by other referees are complementary to mine; there are some common aspects with my comments and other important points to look into.

      Significance

      This study provides a plausible explanation of how new cell types can emerge as byproducts of the selection of other processes. This is an important advance in understanding the mechanisms underlying the origin of evolutionary novelties, particularly from the point of view of morphogenesis and patterning, rather than from more traditional, strictly gene-centric views, which focus on changes in specific loci, gene duplications, or neofunctionalization. By highlighting evolutionary novelty as a consequence of higher-order constraints, this work broadens the frameworks through which cellular diversity can be understood.

      I believe most of the limitations of the study are conceptual and concern clarity rather than methodology. For instance, the definition of what a cell type is remains, in my opinion, somewhat vague, especially if the clustering has been performed with only 12 genes. However, I am aware of the conceptual difficulty in defining cell types in general. In addition, the emergence of only a single additional cell type, rather than multiple types, might be a consequence of the limited number of proteins considered. Aside from these issues, the methodology is sound and provides a useful framework for exploring the origin of novel cell types.

      I see this work as being of substantial interest to researchers concerned with the conceptual foundations of evo-devo, particularly those interested in the origins of novelty and in the role of constraints in shaping such novelty. It should also be relevant to studying morphogenesis from a dynamical systems perspective. Finally, this work will be of interest to those investigating the ecological roles of petal patterns, especially in relation to their roles in attracting pollinators or protecting reproductive organs from environmental factors.

      Overall, I think this work represents a very valuable contribution to the evo-devo community, providing conceptual advances in our understanding of the emergence of novelty, as well as a complex computational framework addressing cellular patterning in evolving GRNs.

      Field of expertise: developmental biology, nonlinear dynamics, pattern formation, evo-devo.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the manuscript entitled "Recurrent emergence of boundary cell types during evolution of floral bullseye patterns", Oud et al. use computational modeling to determine how gene regulatory networks can set up the prepattern for a bullseye pigmentation. They use a modeling template that is similar to the hibiscus petal primordium and create a gene regulatory network composed of interacting cell-autonomous transcription factors, transcription factors that can diffuse from one cell to another, and cell-cell communication signals. Each simulation started from a diffusing signal from the base and all other genes with no expression. Such a signal diffusing from the base of the organ has been hypothesized many times in plant morphogenesis, so this is plausible. They started 35 populations with initially random GRNs and let them evolve for 30,000 generations, selecting for simulated petals with higher bullseye fitness in each generation. All 35 generated bullseyes. The authors used a UMAP dimensionality reduction similar to single cell RNA-seq to identify different cell types in the models. I have not seen this analysis applied to modeling before, and I thought this approach was innovative. Interestingly, 26 out of the 35 initiated a boundary cell type to help in the robust establishment of the symmetric bullseye, whereas 9 did not. There are two major ways these boundary cell types are established: (1) boundary-specific gene expression and (2) two nested proximal genes with one extending beyond the other. Then the authors examine real hibiscus petals and identify boundary cells, which express 30 boundary-specific genes. The authors then examine one of the GRNs from one of their populations and find that gene 5 is crucial for setting up the boundary. Finally, they analyze each population over evolutionary time and see that these boundary cells come and go in the lineages, but they have a longer persistence time when there is noise in the modeling, suggesting that they add robustness to the generation of the bullseye.

      Major comments:

      There is a major missed opportunity to analyze the evolved networks. Only one of the 30 GRNs is analyzed in figure 4. Please add further analysis of the GRNs from all the populations. Within a population after 30K generations, how much variation is there in the GRNs of individuals? How similar are the optimal fitness evolved GRNs across all 35 populations? Are there common motifs across networks? Is there always an antagonism between proximal and distal proteins somewhere in the network? A lot of previous work on GRNs has established the function of common motifs, and these should be analyzed. Please provide all 30 gene regulatory networks in the supplement.

      The purpose and significance of examining the evolutionary lineage is not clear. Please explain your logic. This is most important for Figure 5, where it becomes clear that the boundary cells are often formed transiently in the evolution of the GRN. If this boundary cell type does not persist, how can it help the petal generate a bullseye? What happens after the boundary cell type is lost? Has the GRN evolved into a more stable place where it no longer needs the boundary? In several instances it looks like they come and go many times. Please explain how these transient boundary cells in the evolutionary lineage can make a difference. This point also comes up in lines 113-115 "For each simulation, we traced back the ancestral lineage of the final fittest individual and sampled 12 of its ancestors at evenly spaced generational intervals, performing this analysis on each sampled ancestor." I could understand if the boundary cell type were developmentally transient, but I have a hard time seeing what its significance is, since it is evolutionarily transient.

      It is worth saying more about how the 9 lineages without a boundary cell type manage to make a robust bullseye pattern, because this is also interesting.

      How were 12 proteins chosen for the network, as opposed to 6 or 20 for instance? In the network pruning, it seems like fewer proteins are required. How many proteins are required to produce a bullseye pattern?

      Minor comments:

      The title needs to be changed to include computational modeling or simulation because otherwise the current version of the title implies that these boundary cell types are found in plant species evolution.

      Line 103 - 106 "We found that over a third of all simulations evolved a bullseye size of approximately 50% of the petal's central height (Figure 2A.ii). This indicates a tendency for simulations to converge toward these proportions, possibly due to the interaction between the patterning signal distribution and the tissue geometry." The phrasing here is confusing. Which proportions does "these proportions" refer to? Presumably, 50% from the preceding sentence. But the second proportion is not clear from the text. Maybe it is the peak at approximately 65% seen in the graph. Please clarify in the text.

      Line 118 "To further explore cell identity in the third cluster, we analysed the gene expression profiles of the three identified cell types." It is not clear what the third cluster refers to. The previous sentence mentions 9 lineages without boundary cell types, so a transition back to lineages with boundary cell types would help here.

      In Figures 3C-D, it would help to label these volcano plots proximal versus boundary and distal versus boundary. Although they do fit your color scheme and legend for the color scheme, it is important to specify it explicitly.

      On Figure 4A it would help to label which gene is Prox and Dist. I assume they are the purple and yellow genes, but it would be easier if they were labeled.

      Line 185-186 "Gene 5 delays and spatially restricts the expression of gene 10, ensuring the symmetric development of the pattern." This statement needs to be supported by showing a time series simulation (movie or timepoints) revealing this timing aspect of Gene 5.

      Referee cross-commenting

      I agree with all reviews, which are aligned.

      Significance

      How pigment patterns in petals are established is an important and fascinating question that sheds light on broader issues of how tissues are pre-patterned. Previous studies focus on the reaction-diffusion gene regulatory networks that create beautiful petal pigment spot patterns. This paper fills the gap in addressing how a prepattern is established to create a simple proximal-distal bullseye pigment pattern. Overall, the use of modeling in this study raises several novel and exciting hypotheses for how a pre-pattern can be established during development. One limitation of the study, as acknowledged by the authors, is that the actual petal grows, whereas the model does not. Although growth is likely to make an interesting contribution to the pattern, I agree that it is beyond the scope of this manuscript. Modeling papers are always challenging to write clearly, and I point out some areas where clarifications are needed below. The figures illustrate the results well. This paper will be of interest to developmental biologists, gene regulatory network aficionados and computational biologists.

      My expertise is in plant morphogenesis and patterning as well as computational modeling.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript presents the findings of a computational investigation, whereby populations of artificial "genomes" and their products are evolved algorithmically. They are subjected to a fitness constraint defined in terms of a spatial expression pattern on a petal shaped template. The specific focus of this work is the formation of two-pigment patterns on flower petals, which give rise to "bullseye" patterned flowers. A computational survey suggests that besides the two main genetic identities which are strictly required to form such patterns, a third population is likely to emerge, as a marker located at the interface between the two main identities. This prediction is then tested by dissecting petals of Hibiscus trionum and performing an mRNA-seq survey. The resulting data set is consistent with the simulations, with a population of genes specifically expressed at the boundary between the two main regions. The paper then discusses a number of hypotheses on the evolution of underlying gene regulatory networks, testing them computationally. In particular, by comparing simulations with and without stochastic terms in the dynamics of gene regulation/expression, it is suggested that the 3rd identity is contributing to robustness of the pattern in the face of noise. Overall the main text is clear and makes an interesting case.

      Major comments:

      1. The code used for simulations is available on a public repository, but it does not directly ensure that results are reproducible. To do so would require a clear step-by-step guide referring the user to the specific pieces of code which have been used for the results and figures presented in the paper. At the moment, I could not find any such guide, and the large number of scripts, executables and Jupyter notebooks are not clearly linked to the paper's contents.
      2. The methods themselves involve a number of arbitrary choices. Though this is understandable given the nature of the work, one aspect in particular that would deserve better clarity is the modeling of gene network dynamics. The stochastic model (l.516 & following) involves a nesting of "Hill-like" terms (those in Eqs. (7) and (11)) which is unusual and given without justification. There should be some explanation of how this approach relates to standard approaches such as those reviewed e.g. in: Bintu et al. Current opinion in genetics & development 15.2 (2005): 116-124.

      3. It is also unclear at the moment how exactly the GRN dynamics is used; are time-stepping algorithms used until the system reaches a stationary regime? If so, how is stationarity assessed? This needs to be explained both in the main text and in the methods. The table of parameters suggests that there was a cut-off time, but there is no explanation whatsoever about the state of the dynamics at this time.

      4. Related to the previous point, the table of parameters (Table S1) is provided without any explanation; through what process (exploratory, literature review, trial and error...) were the values selected? Has there been any type of sensitivity analysis?
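
To make the stationarity question in point 3 concrete, here is a minimal sketch of one common convergence criterion: integrate until the largest rate of change falls below a tolerance, with a cut-off time as a fallback. The two-gene toy model, the function names and all parameter values are hypothetical illustrations, not taken from the authors' code or from Table S1.

```python
def step(state, dt):
    """One explicit Euler step of a toy two-gene mutual-repression model."""
    a, b = state
    # Hill-type repression with unit decay; parameters are illustrative only.
    da = 2.0 / (1.0 + b ** 2) - a
    db = 1.0 / (1.0 + a ** 2) - b
    return (a + dt * da, b + dt * db)


def run_until_stationary(state, dt=0.01, tol=1e-8, t_max=200.0):
    """Integrate until the largest rate of change drops below tol or t_max is reached.

    Returns (final_state, elapsed_time, converged).
    """
    t = 0.0
    while t < t_max:
        new_state = step(state, dt)
        # Largest absolute change per unit time across all genes.
        rate = max(abs(n - s) for n, s in zip(new_state, state)) / dt
        state = new_state
        t += dt
        if rate < tol:
            return state, t, True
    return state, t, False


final_state, elapsed, converged = run_until_stationary((0.1, 0.1))
```

Reporting the `converged` flag alongside the cut-off time would show whether a simulation was scored at a steady state or simply stopped at the time limit.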

      Minor comment:

      1. The fitness function used in simulations specifically encodes the desired pattern, with two zones having differential gene expression. This allows the artificial selection to evolve towards such patterns, as expected, but it is not entirely clear how this relates to natural selection itself. At the very start of the paper, the authors briefly review some possible sources of selective pressure for flowers to exhibit patterns such as bullseye, among others. None of the selective factors would likely act on the plants as a direct incentive for two regions, as specified in the cost function. Instead, one may expect a higher-level criterion, such as "conspicuousness" for a pollinator, for instance. This is admittedly not naturally represented as a fitness function, but the choice of this function definitely influences the outcomes of a simulation. Some further numerical experiments could demonstrate that the exact cost function is not critical for the findings of the paper, but I understand they would likely be computationally costly, to the point of infeasibility. This limitation should at least be mentioned.
      2. [optional suggestion] The number of genes used in the simulations is very small in comparison to real organisms. This is clearly justified by the complexity of the work, but one wonders if simulations could be made more efficient by using a much simplified approach for the gene network dynamics. At the time scales of interest, it seems that the use of SDEs and the numerical intricacies they require might be an unnecessary burden. Have the authors considered a much simpler approach, for instance based on Boolean models? Since the study only uses static tissues, all the GRN dynamics could be by-passed, determining steady states very quickly and using them to determine fitness. If this saved significant computational time, this would allow a more comprehensive survey of the "purely genetic" part of the model.
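
As an illustration of the Boolean alternative raised in the optional suggestion, the sketch below finds the steady states of a small synchronous Boolean network by exhaustive enumeration. The three-gene rule set (a fixed signal, a proximal gene activated by it, and a distal gene repressed by the proximal one) is a hypothetical example for a single cell, not the authors' network.

```python
from itertools import product


def update(state, rules):
    """Synchronously update every gene from the current state."""
    return tuple(rule(state) for rule in rules)


def steady_states(rules):
    """Enumerate all Boolean states and keep those left unchanged by the update."""
    n = len(rules)
    return [s for s in product((0, 1), repeat=n) if update(s, rules) == s]


# Hypothetical network for one cell: state = (signal, prox, dist).
rules = (
    lambda s: s[0],           # signal: external input, held fixed
    lambda s: s[0],           # prox: activated by the signal
    lambda s: int(not s[1]),  # dist: repressed by prox
)

fixed_points = steady_states(rules)
```

Exhaustive enumeration grows as 2^n, which is still only 4096 states for a 12-gene network, so this kind of steady-state search is cheap at the scale discussed in the review.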

      Referee cross-commenting

      I agree with both other reviewers. As mentioned by them, our reviews bring complementary suggestions, while being overall in good agreement.

      Significance

      Reviewer's expertise: mathematical modeling, mathematical biology.

      This paper is mostly a conceptual study, in which the majority of results are based on computer simulations. The findings are biologically interesting, but it is hard to prove these evolutionary claims through physical experiments. The complexity of the simulations requires a large number of technical assumptions and parameter choices, which overall make it very difficult to assess how plausible these simulations are, compared to the natural processes they are meant to represent. All the findings are well-argued and provide an overall convincing case, but it is by design impossible to fully assess experimentally. As such, this work will be mostly valuable to theoretical biologists, computational modelers, and researchers interested in "artificial life" and gene evolution.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      Point

      Summary

      Response

      1.1

      Overall, the study lacks well-controlled experiments comparing hypoxia induced by DMOG with hypoxia induced by 1% O₂ for assessing ERα occupancy throughout.

      To assess whether DMOG-induced changes in ERα occupancy reflect bona fide hypoxia, we measured ERα binding by ChIP-qPCR under 1% oxygen over 48 hours, compared to normoxic (21% oxygen) cells and input controls in matched cells at the GREB1 and TFF1 loci. Our findings demonstrate that 1% oxygen treatment recapitulates the ERα binding changes observed with DMOG, at the time points of our RNA-seq experiments.

      We have included these results in Figure 1F of the preliminary revision of the manuscript.

      1.2

      Lack of evidence for the impact of other co-transcription factors (HIFs) under hypoxia in Fig. 1.

      We thank the reviewer for this comment. We have clarified that motif enrichment analysis is included to characterise the sequence context of ERα binding sites and to confirm enrichment of known ER-associated motifs (e.g. EREs), rather than to infer functional involvement of additional transcription factors under hypoxia. Corresponding interpretative statements have been removed from the Results and restricted to the Discussion.

      1.3

      Lack of evidence that DMOG induces HIF protein expression in MCF7 cells.

      To confirm that DMOG induces HIF protein expression, we have analysed HIF1α and HIF2α protein levels by western blot. We have included these in Supplementary Figure S1A within the preliminary revision to address this concern.

      1.4

      Figure 1: ATAC-seq was performed under 1% O₂, whereas ChIP-seq was conducted with DMOG treatment, making these conditions not directly comparable.

      We acknowledge that the ERα ChIP-seq (DMOG) and ATAC-seq datasets were generated under different conditions and are therefore not directly comparable. To address this, we have performed ChIP-qPCR under bona fide hypoxia (1% oxygen) at canonical ERα target loci (TFF1 and GREB1), demonstrating that the directionality of ERα binding changes observed with DMOG is recapitulated under physiological hypoxia. These data provide a direct comparison of ERα occupancy across conditions and support the use of DMOG as a proxy for hypoxia in our ChIP-seq experiments.

      If requested, we are willing to perform ATAC-seq at 16 h under 1% oxygen. However, because the original dataset was generated under 0.1% oxygen, and canonical ERα-bound sites show minimal accessibility changes under severe hypoxia, we anticipate limited additional insight from repeating this experiment.

      1.5a

      Figure S1: ERα ChIP lacks estradiol (E2) treatment in MCF7 cells with or without DMOG.

      The statement that the ERα ChIP samples lack estrogen treatment is incorrect. Estradiol was not an experimental variable and cells were intentionally maintained under estrogen-rich conditions to preserve tumour-relevant ERα activity.

      We have now clarified within the preliminary revision by stating that cells were routinely cultured in “estrogen-rich Dulbecco’s Modified Eagle Medium” in the methods section, and clarified the use of estrogen-rich conditions in the Figure S1 legend.

      1.5b

      The single-gene examples of DMOG effects shown in Fig. S1A are not significant.

      The peak illustrated in Figure S1A (now Figure S1D) is intended to provide a visual confirmation of peak calling and enrichment patterns underlying the genome-wide redistribution observed in Figure 1. The peak was called by the MACS2 pipeline (code available from https://doi.org/10.5281/zenodo.17221105) with a log10(q-value) = 268.5, which passes the MACS2 cut-off q

      1.6a

      Fig. S2 lacks 1% O₂ conditions.

      We wish to clarify that Figure S2 (now Figure S4) serves as quality control specifically for the DMOG-treated ChIP-seq dataset presented in Figure 1C. The purpose of the plot is to visualize unfiltered motif enrichment to confirm that the identified peaks represent bona fide ERα binding events within the DMOG condition. Motif enrichment under a 1% oxygen environment would not provide this validation. In all cases the ERE is the most significantly enriched motif.

      With respect to ERα binding under 1% oxygen, we have now assessed this via targeted ChIP-qPCR validation (Figure 1F).

      1.6b

      Fig. S3 lacks DMOG-induced HIF factor assessments.

      The DMOG-induced changes in HIF1α and HIF2α expression are shown in Figure S1 of this revision proposal and have been incorporated into the manuscript as part of the changes described in response 1.3.

      1.7a

      Figure S4: Estradiol (E2) treatment is missing from the controls, and the figure labeling is of poor quality.

      We have substantially improved the labelling of Figure S4, now Figure S6.

      Additionally, we have clarified that all samples were cultured in estrogen-rich media and treated with either vehicle control or 100 nM fulvestrant; thus estrogen is present in all conditions including the controls.

      1.7b

      Hypoxic conditions for assessing ER status and appropriate controls are also lacking.

      We agree that monitoring ERα stability under hypoxic conditions is essential.

      We provided a western blot assessment of ERα protein levels at 0, 8 and 48 hours of treatment with 1% oxygen or DMOG, compared to normoxic controls, included as Supplementary Figures S1B, C in the preliminary revision.

      These demonstrate the cells remain positive for ERα protein expression at 0, 8 and 48h.

      1.8

      Figure S5: The description of fulvestrant treatments under hypoxic conditions is unclear.

      We thank the reviewer for this comment. To clarify the experimental design, we now signpost the reader in the figure legend of Figure S5 (now S7) to the schematic diagram provided in Figure 3B, and provide a summary stating that the experiment employed a factorial design combining a 96-hour fulvestrant treatment with exposure to 1% oxygen for the final 48 hours.

      1.9

      Supplemental legends: These require major revision; they are of poor quality and lack statistical details and references to biological replicates.

      We have extensively revised all supplementary figure legends to ensure clarity and precision.

      1.10

      Overall comparisons throughout the manuscript are weak; the figures appear sloppy and lack sufficient effort in presentation.

      Following this comment, we carefully reviewed the presentation of all figures throughout the manuscript. We improved the organisation and labelling of the Supplementary Figures to facilitate clearer comparison of the data. In particular, full western blots are now clearly annotated and supplementary legends have been expanded to provide sufficient context for each figure to be interpreted independently.

      1.11

      i) In general, the manuscript in its present form does not contribute greatly beyond published work, as the ERα cistrome is well documented and has been studied for its role in regulating gene expression, particularly in ERα-positive breast cancer.

      ii) Additionally, the lack of a thorough comparison between DMOG- and/or 1% oxygen-induced hypoxia in the MCF7 ER+ model diminished initial interest in the manuscript.

      iii) The lack of consideration of estradiol exposure under hypoxic conditions with either 1% oxygen and/or DMOG also limits relevance to patients with ER+ BrCa.

      iv) The ERα epigenomic profile has been extensively studied including work under hypoxic conditions.

      i) We respectfully disagree that the manuscript does not extend prior work. Despite extensive characterisation of ERα, its role in shaping hypoxia-driven transcription in ER+ breast cancer has not been defined. Here, we identify an ERα-dependent hypoxic response (EDHR), demonstrating a reciprocal interaction between hypoxia and ERα activity.

      ii) In revision, we address concerns regarding DMOG by validating ERα binding under 1% oxygen using ChIP-qPCR thereby confirming our result in bona fide hypoxia. Additionally, all RNA-seq and functional assays, including ENaC targeting, were performed under 1% oxygen in the original manuscript.

      iii) All experiments were conducted under estrogen-complete conditions, now explicitly clarified, reflecting tumour-relevant ERα activity.

      iv) Together, these data establish a reciprocal interaction between ERα and hypoxia and uncover a targetable vulnerability in hypoxic ER+ breast cancer, linking transcriptional regulation to therapeutic opportunity.

      Reviewer 2

      No.

      Summary

      Response

      General Comments

      2.1

      ENaC is proposed as a therapeutic vulnerability based on amiloride sensitivity assays. Additional experiments are required, such as western blot validation of ENaC regulation under hypoxia and loss-of-function approaches to assess its contribution to the phenotype.

      We agree that further validation of ENaC involvement would strengthen this observation. We will assess ENaC protein levels under 1% hypoxia ± fulvestrant by western blot and perform siRNA-mediated depletion of ENaC subunits to test their contribution to the hypoxia-specific amiloride-sensitive phenotype by viability assay (see also response 3.3).

      2.2

      Fulvestrant is used to dissect ERα dependency. However, as a SERD, it may alter chromatin and transcription independently of a simple loss of ERα. Additional controls would strengthen interpretation.

      The experimental design already controls for potential fulvestrant-specific transcriptional effects, as all four conditions (± hypoxia, ± fulvestrant) were included. EDHR genes were defined based on induction under hypoxia, loss of this induction following ERα degradation, and absence of residual hypoxic induction in the presence of fulvestrant. Consistent with this, SCNN1B and SCNN1G do not show significant fulvestrant-responsive changes under normoxia (Figure 5C,D).

      We also note that fulvestrant has been shown to induce minimal global chromatin remodelling (Guan et al., 2019), supporting its use to assess ERα dependency without broadly confounding chromatin accessibility; this reference is now included in the manuscript.

      2.3

      The molecular mechanism by which ERα modulates the hypoxic transcriptome, specifically how the ERα and HIF pathways converge at ENaC loci, should be studied further.

      We further examined the potential convergence of ERα and hypoxic signalling at the ENaC loci (included as Figure 5E in the revision proposal), showing genome browser views of the SCNN1G and SCNN1B loci and highlighting hypoxia-induced HIF1α binding and ERα association at these sites.

      To further support this, we will perform RT-qPCR validation of SCNN1G and SCNN1B expression following treatment ± IOX5 and ± fulvestrant. IOX5 is a selective PHD inhibitor that stabilises HIF proteins, enabling us to assess the contribution of HIF signalling independently of other oxygen-dependent effects associated with hypoxia.

      2.4

      In addition, to assess the relevance of this work for luminal breast cancer and ERα expression, specific validation in TNBC should be performed.

      To assess the clinical relevance of SCNN1B and SCNN1G in ER-positive and ER-negative subgroups, we performed Cox proportional hazards analyses in TCGA and METABRIC cohorts individually, including ER status and stratifying by ER-positive and ER-negative cases (Figure 6C). These analyses support the association of SCNN1G with poorer relapse-free survival specifically in ER-positive patients.

      2.5

      The authors should provide RT-qPCR validation of the key EDHR genes, especially since this signature is later used for downstream analyses.

      We agree that independent validation would strengthen these findings. We will perform RT-qPCR validation of key EDHR genes (including SCNN1B and SCNN1G) under ± hypoxia and ± fulvestrant conditions to confirm ERα-dependent hypoxic induction.

      Limitations

      2.6

      Reprogramming of the ERα cistrome under cellular stress is well documented. The study extends these ideas but does not clearly demonstrate a new mechanistic paradigm, particularly because the EDHR is defined primarily through omics approaches without strong mechanistic validation. In addition, we have to keep in mind that the study uses DMOG to model hypoxia-driven chromatin changes, but DMOG inhibits many 2-oxoglutarate-dependent dioxygenases non-selectively.

      This makes it difficult to attribute ERα cistrome reprogramming specifically to hypoxia, rather than to broad off-target effects. The transcriptomic dataset is more convincing but needs the validation suggested previously.

      While ERα cistrome reprogramming has been described, our study demonstrates a reciprocal interaction in which ERα not only responds to hypoxia but actively shapes hypoxia-driven transcription, defining an ERα-dependent hypoxic response (EDHR).

      We acknowledge the limitations of DMOG and have addressed this by validating key ERα binding events under bona fide hypoxia (1% oxygen) using ChIP-qPCR, confirming our findings under physiological conditions (response 1.1).

      To further strengthen mechanistic insight, we will assess the requirement for HIF stabilisation using the selective PHD inhibitor IOX5, combined with RT-qPCR analysis of SCNN1G and SCNN1B ± IOX5 ± fulvestrant (response 2.3 and 2.5). In addition, we will validate the functional relevance of ENaC through protein-level analysis and siRNA-mediated depletion, as described in response 2.1.

      Together, these additions address concerns regarding DMOG specificity and provide further support for a functional interaction between ERα and hypoxic signalling.

      Audience

      2.7

      Given its reliance on omics datasets and preliminary functional assays, the paper will likely appeal to a specialized audience in transcriptional regulation, hypoxia signalling, and ER+ breast cancer biology. However, the limited mechanistic depth and uncertain translational relevance due to the lack of in vivo validation, may reduce its impact for broader oncology or therapeutic-development audiences. Without stronger validation, the findings may be perceived as niche and mainly of interest to researchers focused on ERα chromatin dynamics rather than to the wider cancer research community.

      The study incorporates multiple layers of human relevance, including spatial transcriptomic analyses demonstrating enrichment of EDHR within hypoxic tumour regions and survival analyses linking EDHR and ENaC expression to clinical outcome.

      In revision, we address the reviewer’s concerns through targeted validation (ChIP-qPCR in hypoxia, western blotting, and RT–qPCR). Together, these additions strengthen the mechanistic and translational relevance of the study.

      Reviewer 3

      No.

      Summary

      Response

      Major comments

      3.1

      The DMOG ChIP-seq provides a valuable first look at ERα redistribution. Since DMOG inhibits both HIF hydroxylases and oxygen-dependent demethylases, the driver of the observed changes remains ambiguous. It would help to include either ERα ChIP-seq under bona fide hypoxia or a selective PHD inhibitor condition (for example IOX5, as you discuss) to separate HIF stabilisation from broad demethylase inhibition. If ChIP-seq is not feasible, a brief ATAC validation at a small panel of gained and lost loci would still increase confidence.

      We acknowledge that mimetics of hypoxia can introduce off-target effects. To address this, we have validated our ERα ChIP-seq findings using ChIP-qPCR at representative loci (TFF1 and GREB1), demonstrating consistent changes in ERα binding under bona fide hypoxia (1% oxygen) (now included in Figure 1F).

      As acknowledged by the reviewer, ChIP-seq under these conditions is likely not feasible due to cell number constraints. We are willing to undertake ATAC-seq if required (as stated in response 1.1); however, we do not feel it would directly address ERα occupancy at these loci. We therefore consider our targeted ChIP-qPCR to be the most appropriate approach to validate ERα redistribution under hypoxia.

      3.2a

      The factorial RNA-seq is well designed and the attenuation analyses are clear. The EDHR selection is stringent and reproducible across two ER+ lines.

      To support the claim of ERα dependence mechanistically, a small number of targeted perturbations would go far. For example,

      i) confirm EDHR induction for SCNN1B and SCNN1G in hypoxia with and without fulvestrant by RT-qPCR

      We agree that targeted validation would strengthen the mechanistic support for ERα dependence. We will perform RT-qPCR validation of SCNN1B and SCNN1G under hypoxia ± fulvestrant to confirm ERα-dependent hypoxic induction (see also response 2.5).

      3.2b

      ii) test whether short-term ERα knockdown reproduces the effect.

      ERα dependency is already assessed through fulvestrant-mediated degradation within the factorial design, which provides a well-established and direct approach to evaluate ERα function. As EDHR genes are defined by loss of hypoxic induction following ERα degradation, this constitutes a robust assessment of ERα-dependent effects.

      We will therefore focus on orthogonal validation through RT-qPCR (response 2.5), together with additional mechanistic and functional analyses using IOX5 and ENaC perturbation (responses 2.1 and 2.3), rather than introducing an ERα knockdown approach, although we would consider this if required.
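
      The factorial logic in this response (hypoxic induction that is lost after ERα degradation) can be sketched as an interaction contrast on log2 expression values. This is only an illustration, not the authors' pipeline: the function names, the thresholds `min_induction` and `max_residual_fraction`, and all numbers are hypothetical and synthetic.

```python
from statistics import mean


def hypoxic_induction(normoxia, hypoxia):
    """Mean log2 fold change (hypoxia minus normoxia) across replicates."""
    return mean(hypoxia) - mean(normoxia)


def era_dependence(vehicle_normoxia, vehicle_hypoxia, fulv_normoxia, fulv_hypoxia):
    """Compare hypoxic induction with and without fulvestrant for one gene.

    All inputs are lists of log2 expression values (one per replicate).
    The interaction term is negative when fulvestrant attenuates induction.
    """
    induction_vehicle = hypoxic_induction(vehicle_normoxia, vehicle_hypoxia)
    induction_fulv = hypoxic_induction(fulv_normoxia, fulv_hypoxia)
    return {
        "induction_vehicle": induction_vehicle,
        "induction_fulvestrant": induction_fulv,
        "interaction": induction_fulv - induction_vehicle,
    }


def is_edhr_like(result, min_induction=1.0, max_residual_fraction=0.5):
    """Hypothetical rule: induced in vehicle, and mostly lost with fulvestrant."""
    induced = result["induction_vehicle"] >= min_induction
    attenuated = (
        result["induction_fulvestrant"]
        <= max_residual_fraction * result["induction_vehicle"]
    )
    return induced and attenuated


# Synthetic log2 expression values for one illustrative gene (n = 3 per condition).
result = era_dependence(
    vehicle_normoxia=[5.0, 5.2, 4.8],
    vehicle_hypoxia=[8.0, 8.1, 7.9],
    fulv_normoxia=[4.9, 5.0, 5.1],
    fulv_hypoxia=[5.4, 5.5, 5.6],
)
print(result, is_edhr_like(result))
```

      In a real analysis the same contrast would normally be estimated as the interaction term of a differential-expression model rather than from raw means; the sketch only makes the direction of the comparison explicit.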

      3.2c

      iii) A complementary test with a HIF-1α or HIF-2α knockdown at one time point would help position EDHR relative to HIF.

      This request aligns with point 2.3, which addresses the convergence of ERα and HIF signalling. While HIF knockdown under hypoxia would assess necessity, we will instead assess the contribution of HIF signalling using the selective PHD inhibitor IOX5, as this allows us to isolate HIF stabilisation from broader hypoxia-associated effects and avoids additional perturbation associated with transfection-based approaches. We will perform RT-qPCR analysis of SCNN1B and SCNN1G following treatment ± IOX5 ± fulvestrant to determine whether HIF stabilisation is sufficient to support ERα-dependent induction of EDHR genes.

      3.3

      The amiloride result is intriguing and consistent with a hypoxia-specific dependency. Because amiloride is pleiotropic, it would strengthen the conclusion to add one genetic and one pharmacological specificity control. A brief SCNN1B or SCNN1G knockdown in hypoxia should phenocopy the viability effect if ENaC contributes. In parallel, testing benzamil at sub-micromolar doses would provide a more ENaC-selective pharmacological readout. These can be performed in MCF7 and, resources permitting, in T47D.

      To address the reviewer’s concern regarding pleiotropic effects, we propose (aligning with our response to 2.1) to apply siRNA-mediated knockdown of SCNN1B and SCNN1G under hypoxia to determine whether this reproduces our observed viability effect, thereby providing direct evidence for ENaC involvement.

      We agree that additional pharmacological validation could further support specificity, and would consider inclusion of a more ENaC-selective inhibitor if required.

      3.4

      The RFS associations for SCNN1B and SCNN1G are compelling. It would be helpful to report whether the associations persist in a multivariable model that at least includes ER status, grade and nodal status where available, or to state clearly when this is not possible across merged datasets. Even a sensitivity analysis in TCGA with ER+ cases only would contextualise the hazard ratios.

      We have analysed TCGA and METABRIC cohorts individually using Cox proportional hazards models, as this functionality is not available for merged datasets in KMplot. ER status was included in the models, and analyses were additionally stratified by ER-positive and ER-negative subgroups. The number of relapse events per subgroup is approximately 40; therefore, additional covariates such as grade and nodal status were not included given the limited number of events per model.

      Within ER-positive patients, high SCNN1G expression is associated with poorer relapse-free survival (TCGA HR 1.45, p = 0.0027), while SCNN1B shows a similar trend that does not reach statistical significance. These analyses are presented in Figure 6C and in the results section of the preliminary revision, and support the findings from the Kaplan–Meier analysis.

      3.5

      The spatial association of EDHR with EMT hotspots is a nice piece of the story. A short clarification of how spot-level cell type composition was handled will help readers interpret proximity results. If cell type deconvolution scores are available in the source dataset, adding a sentence on whether EDHR enrichment tracks tumour epithelial content would be useful.

      Spatial cell type composition and spot annotations were used as provided in the SpottedPy dataset, based on Cell2location-derived deconvolution scores and STARCH tumour annotations, without additional re-estimation.

      To address the reviewer’s suggestion, we examined the relationship between EDHR enrichment and epithelial content and observed no significant correlation at the neighbourhood level.

      These points have now been clarified in the manuscript.

      3.6

      Data processing for ChIP-seq and RNA-seq is documented and accessions are provided. The RNA-seq includes n=3 per condition, which is appropriate, and the correlation and LFC analyses are clearly presented. For the amiloride assay, the two-way ANOVA with interaction is appropriate; please add the exact n and whether experiments were independently repeated, and include the underlying values in a source table for transparency. These are small presentational edits rather than new experiments.

      In the preliminary revision we have added a statement to the amiloride assay figure (Figure 6D) clarifying that n = 3 independent biological replicates were performed per condition. In addition, we now provide the underlying numerical values for this assay in Table S11.
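
      For readers less familiar with the test named in comment 3.6, the sketch below shows how a balanced two-way ANOVA separates the interaction term from the two main effects. It is a minimal illustration under stated assumptions, not the authors' analysis: the function `two_way_anova`, the factor levels, and the viability values are hypothetical and synthetic, chosen so that the treatment effect has opposite signs in the two oxygen conditions.

```python
from statistics import mean


def two_way_anova(cells):
    """F statistics for a balanced, fully crossed two-factor design.

    `cells` maps (level_a, level_b) to a list of replicate measurements.
    Every cell must hold the same number of replicates (at least 2).
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if len(cells) != len(a_levels) * len(b_levels) or n < 2:
        raise ValueError("need a fully crossed design with >= 2 replicates")
    if any(len(values) != n for values in cells.values()):
        raise ValueError("need the same number of replicates in every cell")

    grand = mean([x for values in cells.values() for x in values])
    a_mean = {a: mean([x for b in b_levels for x in cells[(a, b)]]) for a in a_levels}
    b_mean = {b: mean([x for a in a_levels for x in cells[(a, b)]]) for b in b_levels}
    cell_mean = {key: mean(values) for key, values in cells.items()}

    # Sums of squares for each main effect, the interaction, and the residual.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_err = sum((x - cell_mean[key]) ** 2 for key, values in cells.items() for x in values)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "F_a": (ss_a / df_a) / ms_err,
        "F_b": (ss_b / df_b) / ms_err,
        "F_interaction": (ss_ab / df_ab) / ms_err,
        "df_error": df_err,
    }


# Synthetic relative viability, keyed by (oxygen condition, hypothetical dose).
viability = {
    ("normoxia", 0): [1.00, 0.98, 1.02],
    ("normoxia", 10): [1.20, 1.18, 1.22],
    ("hypoxia", 0): [1.00, 1.02, 0.98],
    ("hypoxia", 10): [0.80, 0.82, 0.78],
}
table = two_way_anova(viability)
print(table)
```

      With a crossover pattern like this, the dose main effect (`F_b`) is close to zero while the interaction term is large, which is why the interaction is the relevant quantity for a context-dependent drug response. Converting the F statistics to p values requires the F distribution and is omitted here.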

      3.7

      A small, hypothesis-driven mechanistic link from EDHR to ENaC function would substantially elevate impact without becoming a long project. For example, testing whether hypoxia increases amiloride-sensitive Na⁺ current in MCF7 and whether fulvestrant abrogates that increase would directly connect the transcriptional and functional observations. If available, patch-clamp or a simple SBFI-based Na⁺ imaging readout could suffice.

      We agree that directly linking EDHR to ENaC channel activity would further strengthen the mechanistic connection. We will prioritise genetic validation of ENaC function through siRNA-mediated depletion (response 2.1), which directly tests the requirement for ENaC in the hypoxia-specific viability phenotype.

      We are willing to explore the feasibility of measuring the amiloride-sensitive Na+ currents under normoxia and acute hypoxia (via perfusion of cells with bathing solution bubbled with nitrogen during recording) ± fulvestrant to further connect hypoxic regulation to channel activity.

      Minor Comments

      3.8

      Please show representative ERα ChIP-seq browser snapshots for at least one gained, one conserved and one lost locus alongside input for both conditions.

      We have now included representative ERα ChIP-seq browser snapshots for gained, conserved, and lost loci, together with input controls for both conditions, in Figure S3 of the revised manuscript.

      3.9

      In Figure 1D, the ATAC-seq comparison uses 0.1% O₂ for 48 h while the RNA-seq uses 1% O₂. Briefly justify the choice and discuss any expected differences.

      We thank the reviewer for this point. The ATAC-seq dataset was generated under 0.1% oxygen in the original study, whereas RNA-seq experiments in this work were performed at 1% oxygen to reflect tumour-relevant hypoxic conditions. The more severe hypoxia used for ATAC-seq would be expected to maximise detection of chromatin accessibility changes. Despite this, chromatin accessibility changes were limited, with ERα binding occurring predominantly at pre-accessible regions. This has now been clarified in the manuscript.

      3.10

      In the Methods for spatial analyses, specify the thresholds for hotspot calling and how the neighbourhood radius was chosen.

      The neighbourhood parameter was set to 8, corresponding to the immediate neighbouring spots in Visium data, consistent with package guidance. We have clarified this in the manuscript text.

      3.11

      For the EDHR heatmap, consider marking the 14 consensus genes and indicating which belong to the ENaC module to aid readability.

      We have marked the 14 EDHR consensus genes and indicated the ENaC module in the revised heatmap to aid readability.

      3.12

      Please report exact sample sizes and replicate numbers in all figure legends and provide a single table with all statistical tests, n, and p values.

      We have reported exact sample sizes and replicate numbers in all relevant figure legends and included Table S11 summarising all statistical tests, sample sizes (n), and p values.

      3.13

      A schematic summarising the experimental timelines for ChIP-seq, RNA-seq and viability would help orient readers.

      We have added timelines for these experiments as requested.

      3.14

      Minor copyedits: consistent formatting of O₂, gene symbols and reagent catalogue numbers.

      We have standardised oxygen notation throughout the manuscript to use “oxygen” in the main text and “O2” where appropriate (e.g. figures).

      Reagent catalogue numbers have now been standardised for consistency of presentation in the revised manuscript.

      Gene and protein nomenclature were already formatted according to accepted conventions and were verified for consistency.

      3.15

      The manuscript is well referenced. Where you contrast your findings with long-term CoCl₂ hypoxia, a sentence on why acute DMOG and short-term 1% O₂ may reveal different ERα behaviours would help position the novelty.

      We thank the reviewer for this suggestion. We have expanded the manuscript to clarify that acute hypoxia (1% oxygen) and DMOG treatment capture early, dynamic hypoxic responses, in contrast to chronic CoCl2 exposure, which reflects longer-term adaptation. This distinction is relevant to tumour biology, where hypoxia is often transient due to unstable vascularisation. The following statement has been added to the manuscript:

      “In addition to such chronic hypoxic adaptation, tumour hypoxia can also be dynamic, with cells experiencing acute or transient hypoxic exposure due to unstable vascularisation; an established contributor to tumour progression (Liu et al, 2022a; Koh & Powis, 2012). Thus, in contexts where both signalling pathways remain active, the dependence of the hypoxic response on ERα in ER+ cells has not been previously characterised.”

      Primary Limitations

      3.16

      DMOG vs hypoxia in the cistrome experiment,

      To address concerns regarding the use of DMOG, we have validated key ERα binding events from the ChIP-seq dataset by ChIP–qPCR at the TFF1 and GREB1 loci under bona fide hypoxia (1% oxygen) in biological triplicate (Figure 1F). These data demonstrate consistent changes in ERα binding under hypoxia, supporting that the DMOG-induced redistribution reflects hypoxia-driven changes.

      3.17

      the absence of direct HIF or cofactor perturbations

      We acknowledge the absence of direct HIF perturbation. To address this, we will assess the contribution of HIF signalling through stabilisation approaches, including RT-qPCR analysis of SCNN1B and SCNN1G ± IOX5 ± fulvestrant (response 3.2), to determine whether HIF activation is sufficient to support ERα-dependent induction.

      3.18

      and the pleiotropy of amiloride.

      To address the potential pleiotropy of amiloride, we will perform siRNA-mediated knockdown of SCNN1G and SCNN1B to provide independent validation of ENaC-dependent effects (response 3.3).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This study explores how hypoxia reshapes ERα signalling in ER-positive breast cancer and whether this cross-talk exposes targetable vulnerabilities. The authors first map ERα binding in MCF7 cells after dioxygenase inhibition with DMOG and observe a genome-wide redistribution with enrichment of ERE, FOXA1 and AP-1 motifs at gained sites while chromatin accessibility at these loci appears unchanged in public ATAC-seq after hypoxia. They then perform RNA-seq in MCF7 and T47D using a factorial design that combines fulvestrant-mediated ERα degradation with 1% O₂ to define an ERα-dependent hypoxia response (EDHR). A 14-gene consensus EDHR signature includes ENaC regulatory subunits SCNN1B and SCNN1G, whose higher expression is associated with poorer RFS in ER+ cohorts. Functionally, amiloride increases viability in normoxia but reduces viability under hypoxia in MCF7 across a dose range. Spatial transcriptomics from ER+ tumours shows EDHR expression enriched at the margins of hypoxia and estrogen-hallmark regions and adjacent to EMT hotspots. Raw data and code availability are stated for the central datasets and accessions are provided. Together the results argue that ERα helps organise a distinct hypoxic programme and suggest a context-specific sensitivity to ENaC inhibition.

      Major comments

      The paper addresses a timely question with a clear narrative arc and brings together ChIP-seq, RNA-seq, pharmacology, survival analysis and spatial transcriptomics. The EDHR concept is interesting and the ENaC angle is original. The work is already strong and with a few targeted additions and clarifications it can be made more persuasive without becoming a new project.

      1) The DMOG ChIP-seq provides a valuable first look at ERα redistribution. Since DMOG inhibits both HIF hydroxylases and oxygen-dependent demethylases, the driver of the observed changes remains ambiguous. It would help to include either ERα ChIP-seq under bona fide hypoxia or a selective PHD inhibitor condition (for example IOX5, as you discuss) to separate HIF stabilisation from broad demethylase inhibition. If ChIP-seq is not feasible, a brief ATAC validation at a small panel of gained and lost loci would still increase confidence. Estimated time: 6-8 weeks for a focused follow up with two conditions and biological duplicates/triplicates.

      2) The factorial RNA-seq is well designed and the attenuation analyses are clear. The EDHR selection is stringent and reproducible across two ER+ lines. To support the claim of ERα dependence mechanistically, a small number of targeted perturbations would go far. For example, confirm EDHR induction for SCNN1B and SCNN1G in hypoxia with and without fulvestrant by RT-qPCR and test whether short-term ERα knockdown reproduces the effect. A complementary test with a HIF-1α or HIF-2α knockdown at one time point would help position EDHR relative to HIF. Estimated time: 3-4 weeks for qPCR and siRNA validations.

      3) The amiloride result is intriguing and consistent with a hypoxia-specific dependency. Because amiloride is pleiotropic, it would strengthen the conclusion to add one genetic and one pharmacological specificity control. A brief SCNN1B or SCNN1G knockdown in hypoxia should phenocopy the viability effect if ENaC contributes. In parallel, testing benzamil at sub-micromolar doses would provide a more ENaC-selective pharmacological readout. These can be performed in MCF7 and, resources permitting, in T47D. Estimated time: 4-6 weeks.

      4) The RFS associations for SCNN1B and SCNN1G are compelling. It would be helpful to report whether the associations persist in a multivariable model that at least includes ER status, grade and nodal status where available, or to state clearly when this is not possible across merged datasets. Even a sensitivity analysis in TCGA with ER+ cases only would contextualise the hazard ratios. Estimated time: 1-2 weeks.

      5) The spatial association of EDHR with EMT hotspots is a nice piece of the story. A short clarification of how spot-level cell type composition was handled will help readers interpret proximity results. If cell type deconvolution scores are available in the source dataset, adding a sentence on whether EDHR enrichment tracks tumour epithelial content would be useful. Estimated time: 1 week.

      Reproducibility and statistics

      Data processing for ChIP-seq and RNA-seq is documented and accessions are provided. The RNA-seq includes n=3 per condition, which is appropriate, and the correlation and LFC analyses are clearly presented. For the amiloride assay, the two-way ANOVA with interaction is appropriate; please add the exact n and whether experiments were independently repeated, and include the underlying values in a source table for transparency. These are small presentational edits rather than new experiments.

      Optional

      A small, hypothesis-driven mechanistic link from EDHR to ENaC function would substantially elevate impact without becoming a long project. For example, testing whether hypoxia increases amiloride-sensitive Na⁺ current in MCF7 and whether fulvestrant abrogates that increase would directly connect the transcriptional and functional observations. If available, patch-clamp or a simple SBFI-based Na⁺ imaging readout could suffice. Estimated time: 6-8 weeks.

      Minor comments

      1. Please show representative ERα ChIP-seq browser snapshots for at least one gained, one conserved and one lost locus alongside input for both conditions.
      2. In Figure 1D, the ATAC-seq comparison uses 0.1% O₂ for 48 h while the RNA-seq uses 1% O₂. Briefly justify the choice and discuss any expected differences.
      3. In the Methods for spatial analyses, specify the thresholds for hotspot calling and how the neighbourhood radius was chosen.
      4. For the EDHR heatmap, consider marking the 14 consensus genes and indicating which belong to the ENaC module to aid readability.
      5. Please report exact sample sizes and replicate numbers in all figure legends and provide a single table with all statistical tests, n, and p values.
      6. A schematic summarising the experimental timelines for ChIP-seq, RNA-seq and viability would help orient readers.
      7. Minor copyedits: consistent formatting of O₂, gene symbols and reagent catalogue numbers.

      Prior studies

      The manuscript is well referenced. Where you contrast your findings with long-term CoCl₂ hypoxia, a sentence on why acute DMOG and short-term 1% O₂ may reveal different ERα behaviours would help position the novelty.

      Significance

      General assessment

      The strongest aspects are the carefully designed factorial RNA-seq that cleanly separates ERα and hypoxia effects, the discovery of a concise EDHR signature reproducible across two ER+ lines, and the integration with spatial transcriptomics that places EDHR near EMT-rich tumour regions. The ENaC connection is new and potentially actionable, and the context-dependent amiloride response is a practical lead. Limitations are primarily mechanistic: DMOG vs hypoxia in the cistrome experiment, the absence of direct HIF or cofactor perturbations, and the pleiotropy of amiloride.

      Advance

      To my knowledge, this is the first description of a distinct ERα-dependent hypoxic programme in ER+ breast cancer that includes ENaC regulatory subunits and links to an EMT-adjacent spatial niche. The conceptual advance is the positioning of ERα as a coordinator of a subset of hypoxia-induced genes rather than as a parallel pathway, together with an initial functional readout that suggests a therapeutic angle through ENaC modulation. With the targeted additions outlined above, the study would move from strong association to a more mechanistic and translationally relevant model.

      Audience

      The work will interest a specialised audience in nuclear receptor biology, hypoxia signalling, tumour microenvironment, and ion transport in cancer. It has potential relevance for basic researchers studying ERα cistrome dynamics, for groups using spatial transcriptomics to define micro-niches, and for translational researchers exploring metabolic and ionic vulnerabilities in ER+ disease.

      Expertise disclosure

      Keywords: nuclear receptors, chromatin profiling, transcriptomics, spatial transcriptomics, breast cancer biology.

      I am not a domain expert in ion channel electrophysiology; my comments on ENaC pharmacology focus on specificity and study design rather than detailed channel biophysics.

      Tone

      I find the paper well conceived and already compelling. The suggested experiments are focused, realistic in scope, and primarily aim to turn several strong associations into concise mechanistic statements that would further increase confidence and impact.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      ERα drives most luminal breast cancers. However, how hypoxia reshapes ERα activity and how ERα itself might influence the hypoxic response remain unclear. Understanding this interaction is crucial, as hypoxia is strongly linked to endocrine resistance and poor outcomes. In this study, the authors investigated how hypoxia modifies ERα signalling in ER+ breast cancer and whether ERα contributes to the transcriptional response to low oxygen. Using MCF7 and T47D cells, they combined genome-wide profiling of the ERα cistrome under DMOG, hypoxic transcriptomics with or without ERα degradation, and spatial transcriptomics in tumours. This revealed an ERα-dependent hypoxic response (EDHR), prominently involving regulation of epithelial sodium channel (ENaC) subunits, whose expression requires both hypoxia and active ERα signalling. Functionally, ENaC inhibition with amiloride reduced cell viability under hypoxia. Together, these findings uncover a previously unrecognised ERα-dependent layer of the hypoxic transcriptome and identify ENaC as a potential therapeutic vulnerability in hypoxic ER+ breast cancer.

      Although the study is interesting, the manuscript lacks several essential functional and experimental validations. ENaC is proposed as a therapeutic vulnerability based on amiloride sensitivity assays. Additional experiments are required, such as western blot validation of ENaC regulation under hypoxia and loss-of-function approaches to assess its contribution to the phenotype. Fulvestrant is used to dissect ERα dependency. However, as a SERD, it may alter chromatin and transcription independently of a simple loss of ERα. Additional controls would strengthen interpretation. The molecular mechanism by which ERα modulates the hypoxic transcriptome, specifically how the ERα and HIF pathways converge at ENaC loci, should be studied further. In addition, to assess the relevance of this work for luminal breast cancer and ERα expression, specific validation in TNBC should be performed. Finally, the authors should provide RT-qPCR validation of the key EDHR genes, especially since this signature is later used for downstream analyses.

      Significance

      General assessment strengths:

      This study uncovers a previously unrecognised ERα-dependent hypoxic response in breast cancer, revealing that ERα actively shapes the hypoxic transcriptome rather than functioning as an isolated pathway. To me, the main strength of this work is the identification of ENaC as a novel hypoxia-specific therapeutic vulnerability in ER+ breast cancer, suggesting that ion-channel regulation may play a broader and underappreciated role in endocrine resistance.

      Limitation:

      Reprogramming of the ERα cistrome under cellular stress is well documented. The study extends these ideas but does not clearly demonstrate a new mechanistic paradigm, particularly because the EDHR is defined primarily through omics approaches without strong mechanistic validation. In addition, we have to keep in mind that the study uses DMOG to model hypoxia-driven chromatin changes, but DMOG inhibits many 2-oxoglutarate-dependent dioxygenases non-selectively. This makes it difficult to attribute ERα cistrome reprogramming specifically to hypoxia, rather than to broad off-target effects. The transcriptomic dataset is more convincing but needs the validation suggested previously.

      Audience:

      Given its reliance on omics datasets and preliminary functional assays, the paper will likely appeal to a specialized audience in transcriptional regulation, hypoxia signalling, and ER+ breast cancer biology. However, the limited mechanistic depth and uncertain translational relevance due to the lack of in vivo validation, may reduce its impact for broader oncology or therapeutic-development audiences. Without stronger validation, the findings may be perceived as niche and mainly of interest to researchers focused on ERα chromatin dynamics rather than to the wider cancer research community.

      Expertise:

      My evaluation is based on my background in breast cancer, ERα signaling and breast tumorigenesis. However, I have limited expertise in spatial transcriptomic analyses and advanced ChIP-seq bioinformatic analyses, which may affect my assessment of some computational analyses.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Malcom et al. present evidence that, under hypoxic conditions, hypoxia-inducible factors (HIFs) alter the estrogen receptor alpha (ERα) epigenomic landscape in a model of estrogen receptor-positive (ER+) breast cancer (BrCa). The response of ER+ BrCa to estradiol (E2) in MCF7 (ER+) cells, as well as ERα signaling in both primary and metastatic breast cancer, has been well studied, and the epigenomic landscape of ERα+ BrCa is well documented. The differentially expressed genes (DEGs) identified under treatment with the hypoxia mimetic dimethyloxalylglycine (DMOG) revealed a subset of ERα-dependent hypoxic response (EDHR) genes. The outcome was a reprogramming of the basal ERα cistrome, coinciding with sites enriched for estrogen response elements (EREs) and co-transcription factor binding motifs for ERα, including FOXA1 and AP-1. This was demonstrated by ERα ChIP-seq (i.e. DMOG) and ATAC-seq (i.e. 1% O2) performed under different hypoxic conditions. The transcripts identified following DMOG treatment were leveraged and compared to publicly available RNA-seq datasets from various breast cancer subtypes exposed to 1% hypoxic oxygen. Although the comparison methods varied, the results suggested that BrCa cell lines under 1% hypoxic oxygen conditions showed strong similarity to MCF7 cells treated with DMOG. Genes upregulated in response to DMOG correlated with poorer survival outcomes. To demonstrate the requirement for ERα in this model, MCF7 cells were treated with the selective estrogen receptor degrader (SERD) fulvestrant, the only FDA-approved SERD for ER+ BrCa, which dampened the HIF response among EDHR genes. This suggests that ERα is necessary for the expression of DEGs under hypoxic conditions induced by DMOG. Finally, the sodium channel protein ENaC subunits (i.e., SCNN1B and SCNN1G) were further characterized as candidate EDHR genes. Analyses of publicly available datasets indicated that high mRNA expression levels of these subunits were associated with worse survival outcomes, supporting the clinical relevance of EDHR genes SCNN1B and SCNN1G. To further validate clinical relevance, the authors used spatial transcriptomics in a small subset of ER+ BrCa.

      Major:

      1. Overall, the study lacks well-controlled experiments comparing hypoxia induced by DMOG with hypoxia induced by 1% O₂ for assessing ERα occupancy throughout.
      2. Lack of evidence in Fig. 1 for the impact of other co-transcription factors, such as HIFs, under hypoxia.
      3. Lack of evidence that DMOG induces HIF protein expression in MCF7 cells.
      4. Figure 1: ATAC-seq was performed under 1% O₂, whereas ChIP-seq was conducted with DMOG treatment, making these conditions not directly comparable.
      5. Figure S1: ERα ChIP lacks estradiol (E2) treatment in MCF7 cells with or without DMOG. The single-gene examples of DMOG effects shown in Fig. S1A are not significant.
      6. Figures S2 and S3: Fig. S2 lacks 1% O₂ conditions, and Fig. S3 lacks DMOG-induced HIF factor assessments.
      7. Figure S4: Estradiol (E2) treatment is missing from the controls, and the figure labeling is of poor quality. Hypoxic conditions for assessing ER status and appropriate controls are also lacking.
      8. Figure S5: The description of fulvestrant treatments under hypoxic conditions is unclear.
      9. Supplemental legends: These require major revision; they are of poor quality and lack statistical details and references to biological replicates.

      Minor:

      1. Overall comparisons throughout the manuscript are weak; the figures appear sloppy and lack sufficient effort in presentation.

      Significance

      In general, the manuscript in its present form does not contribute greatly beyond published work, as the ERα cistrome is well documented and has been studied for its role in regulating gene expression, particularly in ERα-positive breast cancer. Additionally, the lack of a thorough comparison between DMOG- and 1% O₂-induced hypoxia in the MCF7 ER+ model diminished initial interest in the manuscript. The failure to consider estradiol exposure under hypoxic conditions with either 1% O₂ and/or DMOG also limits relevance to patients with ER+ BrCa. The ERα epigenomic profile has been extensively studied, including work under hypoxic conditions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      We thank all reviewers for their careful consideration of our manuscript. We highly appreciate their thoughtful comments and suggestions, which helped us to improve the quality of our work. We address each comment point-by-point below.

      2. Description of the planned revisions

      Reviewer #1

      Minor comments:

      Figure 5 would be more informative if it included more high-magnification images that would reveal the staining at the cellular level.

      To fulfil the suggestion, we will perform a new round of immunostaining followed by high-resolution confocal imaging. This requires additional time for laboratory work.

      Reviewer #2:

      Major comments

      1d. The authors tried to attribute the minor phenotype to the incomplete depletion of S100A4+ cells. However, it is possible that if the S100A4+ cells only represented a minor population, their function may be compensated by other populations. This might be confirmed by quantification of S100A4+ cells or S100A4-Cre; GFP+ cells in fibroblast or CD45 populations from the images shown in Figure 5.

      We will address this comment by performing required quantifications.

      Moreover, we have now included data on the presence of S100A4+ cells in S100a4-Cre;DTA mice (Figure for Reviewers 5a,b; Supplementary Figure 7a,b in the revised manuscript), which demonstrate incomplete depletion of the S100A4+ cells in the nipple and the mammary gland. This is likely due to ongoing tissue remodeling and continuous S100A4+ replenishment/ supply. Another study using the same S100a4-Cre;DTA mouse model reported an efficient S100A4+ cell depletion in mandibular condyle (Tuwatnawanit et al., 2025), which suggests that the presence of S100A4+ cells in the S100a4-Cre;DTA mammary gland and nipple is due to tissue-specific dynamics rather than lack of depletion efficiency.

              We have included in Discussion: “Notably, we observed incomplete depletion of S100A4+ cells in the mammary gland and nipple. Interestingly, a study using the same S100a4-Cre;DTA mouse model reported complete S100A4+ cell depletion in the superficial layer of mandibular condyle46. This suggests that incomplete depletion of S100A4+ cells in nipple and mammary gland is due to tissue-specific dynamics, rather than lack of depletion efficiency, indicating a compensatory mechanism that can balance the cell loss.”
      

      It is difficult to confirm colocalization from the images in Figure 5 and Figure S4. A higher magnification image would be required for each panel. Furthermore, a precise quantification based on the current images would better support the conclusion regarding the discrepancy in the composition of the S100A4 lineage between epidermis and mammary gland (lines 163-165).

      To address this comment, we will perform a new round of immunostaining and high-resolution confocal imaging and quantifications and include the results in the fully revised manuscript.

      In line 163, the authors hypothesize that the cells are Langerhans cells based on their morphology. The identity of those cells could be confirmed by co-staining with F4/80, in addition to the current form of Fig. 5h.

      To address this comment, we will perform co-staining of GFP and F4/80 (or, eventually, AIF1, depending on antibody availability) and include the results in the fully revised manuscript.


      Reviewer #3

      Minor comments

      Figure 2c: The H&E images are not fully convincing. Immunofluorescence analysis of epithelial architecture would support the authors' interpretation and should be feasible if tissues are already available.

      We will perform immunostaining for epithelial markers, such as keratins, and include the results in the fully revised manuscript.

      Figure 4f: The proliferation data are compelling, but the authors could extend this by examining how cell differentiation and epithelial organisation are affected.

      We will perform immunostaining for epithelial markers (keratins, αSMA) and include the results in the fully revised manuscript.

      Figure 5b: To more convincingly show that GFP+ cells contact endothelial cells, co-labelling with an endothelial marker such as CD31 would be helpful.

      We will perform the requested co-labeling of GFP and CD31 and include the results in the fully revised manuscript.

      Figure 5f-h: The structures referenced in the text (lines 159-163) should be clearly indicated on the immunofluorescence images.

      We will incorporate these explanations into the new, high-resolution/detailed Figure 5 in the fully revised manuscript.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1:

      Major comments

      1. It is rather difficult to conclude whether the observed nipple phenotype reflects an early embryonic/prepubertal defect in establishing the nipple stroma, is caused by a constitutive response to ongoing cell death, or a response to continuous DTA expression (or a combination of some of these).

      The data raise a couple of additional questions: Is there a nipple phenotype at 3 wk of age? It would not be totally surprising if ablation of a major fraction of dermal fibroblasts in the nipple area led to an early embryonic/prepubertal phenotype, but there are no data on this. Hence, is there a "congenital" nipple deformity, as concluded by the authors (line 191)?

      We appreciate the reviewer’s insightful comments. We have now included data on embryonic nipple development. These data demonstrate abundant S100A4-lineage cells in E15.5 and E18.5 skin of S100a4-Cre;mT/mG embryos (Figure for Reviewers 1a, corresponding to Figure S3a in the revised manuscript) and normal appearance of nipple sheath in S100a4-Cre;DTA embryos at E18.5 (Figure for Reviewers 1b, corresponding to Figure S3b in the revised manuscript), suggesting no embryonic defect.

      Unfortunately, we cannot provide data on 3-week-old mice (we did not collect this timepoint previously and we currently do not have this mouse line alive). Instead, we provide in situ pictures of DTA and S100a4-Cre;DTA nipples at 7 weeks of age (Figure for Reviewers 1c; Figure S3c in the revised manuscript), which demonstrate that the phenotype of defective nipple is fully established at this timepoint. Because the late embryonic data did not support the “congenital” establishment of the nipple deformity and we could not provide any more data from early postnatal development, we have corrected the statement “we describe a congenital nipple deformity” in the discussion to “we describe a nipple deformity”.

      Are there S100a4+ cells in the nipple area of pubertal S100a4-Cre/DTA mice? I.e. is there a continuous supply of new S100a4+ cells and thereby continuous cell death and DTA expression as one might expect based on the RNA-seq data?

      The S100A4+ cells are present in the nipple area of S100a4-Cre;DTA mice, suggesting a continuous supply of new S100A4+ cells (Figure for Reviewers 1b, corresponding to Figure S3b in the revised manuscript; and Figure for Reviewers 5a,b, corresponding to Figure S7a,b in the revised manuscript). In the revised manuscript, we comment on this in Discussion: “Notably, we observed incomplete depletion of S100A4+ cells in the mammary gland and nipple. Interestingly, a study using the same S100a4-Cre;DTA mouse model reported complete S100A4+ cell depletion in the superficial layer of mandibular condyle46. This suggests that incomplete depletion of S100A4+ cells in nipple and mammary gland is due to tissue-specific dynamics, rather than lack of depletion efficiency, indicating a compensatory mechanism that can balance the cell loss.”

      Figure for Reviewers 1 (Figure S3 in the revised manuscript): Embryonic and pubertal nipple phenotype. (a) Representative images of cleared whole-mount S100a4-Cre;mT/mG nipple tissue at embryonic developmental time-points: E15.5 and E18.5. Scale bar = 100 µm. (b) Immunofluorescent labeling for S100A4 on embryonic DTA and S100a4-Cre;DTA whole-mount skin (E18.5). Scale bar = 100 µm. (c) Representative in situ photographs of nipples from DTA and S100a4-Cre;DTA pubertal (7-weeks old) mice. Scale bar = 1 mm.

      The subtitle on line 54 implies that S100a4-Cre/DTA mice display a branching phenotype. However, it looks to me as if there is a pubertal outgrowth defect (as is also written in the body text, line 64) rather than a branching phenotype, potentially reflecting the much smaller size of S100a4-Cre/DTA mice (Fig. 2a). Unless there is a change in branch point frequency, I suggest rephrasing the title and discussion. Instead, I suggest the authors discuss the observed outgrowth delay considering the gross overall growth defect (Fig. 2a). If ductal outgrowth was normalized to the overall growth defect, would one still observe 'a delay in branching morphogenesis'?

      We apologize for the confusion caused by the section title. We have analyzed branching frequency in 7-week-old females and observed a reduced total number of branching points in S100a4-Cre;DTA mice (Figure for Reviewers 2a, corresponding to Figure 2f in the revised manuscript). A significant difference in the number of branching points also remained after normalization to body weight (Figure for Reviewers 2c, corresponding to Figure 2h in the revised manuscript). We have now added the new quantifications to the revised manuscript, with accompanying descriptions in the main text: “Analysis of mammary epithelial development using whole-mount carmine staining revealed no significant differences in the prenatal establishment of the mammary epithelial tree but did reveal significantly delayed epithelial outgrowth and reduced branching in pubertal (7 weeks old) S100a4-Cre;DTA mice (Figure 2e,f). Normalization of epithelial outgrowth and branching to body weight indicates that the observed defect represents a mammary-specific impairment rather than a consequence of reduced body growth (Figure 2g,h).”
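
      The normalization described above amounts to dividing each animal's measurement by that animal's body weight before comparing groups. A minimal sketch of that arithmetic follows; the function name and all numbers are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean, stdev


def normalize_to_body_weight(values, body_weights_g):
    """Divide each per-animal measurement by that animal's body weight (g)."""
    if len(values) != len(body_weights_g):
        raise ValueError("one body weight is needed per measurement")
    return [v / w for v, w in zip(values, body_weights_g)]


# Hypothetical example values (NOT data from the study).
branch_points = [120, 135, 128]    # branching points per gland
body_weight_g = [20.0, 22.5, 21.0]  # body weight of the same animals

normalized = normalize_to_body_weight(branch_points, body_weight_g)

# Summary statistics in the form reported in the figure legends (mean and SD).
summary = (mean(normalized), stdev(normalized))
```

      Pairing each measurement with the weight of the same animal, rather than dividing group means, keeps the per-animal variability that a later group comparison depends on.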

      Figure for Reviewers 2 (Figure 2 in the revised manuscript): Pubertal branching morphogenesis is delayed in S100a4-Cre;DTA. (a-c) The plots show total number of branching points (a), epithelial outgrowth [mm] normalized to body weight [g] (b), and total number of the branching points normalized to body weight [g] (c) in 7 weeks old DTA and S100a4-Cre;DTA mice. All plots show the mean ± SD, *p

      Fig. 4e shows Masson's Trichrome and Picrosirius Red staining and the authors report the findings as follows (lines 120-124): "collagen fibers were loosened in the DTA nipples and more densely packed in the S100a4-Cre;DTA nipples". Perhaps the authors could help non-specialists to observe the loosened fibers and if they wish to make quantitative statements ("more densely packed"), such statements should be backed-up by quantifications.

      Picrosirius Red staining viewed under polarized light is a classic way to assess collagen organization, thickness, and packing. Red / orange / yellow color typically marks thicker, more mature, and more tightly packed collagen fibers (often associated with type I collagen), while green color usually marks thinner, less organized, or less densely packed fibers (often associated with type III collagen or immature collagen). We had included this explanation in the Figure legend of the submitted manuscript already: “Typically, thicker collagen fibers exhibit stronger birefringence and appear red or orange, while thinner fibers exhibit weaker birefringence and appear green or yellow.” To help with the quantification, we have extracted the red channel and quantified color intensity. The results are shown in Figure for Reviewers 3, corresponding to Figure S4 in the revised manuscript. Moreover, we will also quantify the differences in pattern of the collagen fibers. The fibers in DTA nipples look shorter and more curved, while the fibers in S100a4-Cre;DTA nipples look longer and straighter, more aligned. The results will be included in the fully revised manuscript.

      Figure for Reviewers 3 (Figure S4 in the revised manuscript): Collagen fibers are densely packed in S100a4-Cre;DTA nipples. (a) Representative pictures of histological sections of DTA and S100a4-Cre;DTA stained for collagen by Picrosirius red. Polarized light images and the red channel (mature/densely packed collagen) are shown alongside detail pictures of selected regions A and B. Scale bar = 200 µm and 100 µm (in detail pictures). (b) Quantification of Intensity Mean Value for the red channel (densely packed collagen), showing a statistically non-significant difference. The plot shows the mean ± SD, ns p > 0.05 (Mann-Whitney test), n = 3 DTA / 4 S100a4-Cre;DTA.
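
      The quantification in panel (b) can be illustrated with a short sketch: take the mean of the red channel per image, then compare the two groups with a Mann-Whitney test. The code below is an assumption-laden illustration in plain Python, not the authors' pipeline; the function names are hypothetical, the pixel and intensity values are made up, and the exact test is computed by enumeration and assumes no tied values.

```python
from itertools import combinations


def red_channel_mean(pixels):
    """Mean of the red component over an iterable of (R, G, B) pixels."""
    reds = [p[0] for p in pixels]
    return sum(reds) / len(reds)


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test by enumeration (tie-free data only)."""
    pooled = sorted(x + y)
    if len(set(pooled)) != len(pooled):
        raise ValueError("ties present; this sketch handles only tie-free data")
    n1, n2 = len(x), len(y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}

    def u_from_ranks(ranks):
        # U statistic of the first group from its rank sum.
        return sum(ranks) - n1 * (n1 + 1) / 2

    u_obs = u_from_ranks([rank[v] for v in x])
    mu = n1 * n2 / 2  # expected U under the null hypothesis
    # Under the null, every assignment of n1 ranks to the first group is equally likely.
    all_u = [u_from_ranks(c) for c in combinations(range(1, n1 + n2 + 1), n1)]
    extreme = sum(1 for u in all_u if abs(u - mu) >= abs(u_obs - mu))
    return u_obs, extreme / len(all_u)


# Hypothetical per-animal mean red intensities (NOT data from the study).
group_a = [1.0, 2.0, 3.0]       # three animals
group_b = [4.0, 5.0, 6.0, 7.0]  # four animals
u_stat, p_value = mann_whitney_exact(group_a, group_b)
```

      In this made-up example the two groups are completely separated, and the exact two-sided p-value is 2/35, because only 2 of the 35 possible rank assignments are as extreme as the observed one.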

      I found the Discussion on the various mouse models somewhat problematic. Overall, the paper is written in a way that it often remains unclear whether it refers to studies addressing the role of S100a4 itself, studies addressing the function of S100a4+ cells via ablation approaches (S100a4-Cre or S100a4-CreERT2 crossed with floxed DTA), or those where S100a4-Cre has been used to delete gene X/Y/Z. These are all very different experimental approaches where one approach is not necessarily informative when trying to understand the results from another one. The authors should make these points clear and consider whether all their discussion points are relevant.

      We apologize for the confusion. We have carefully reviewed the references and their interpretations, and corrected them as necessary.

      The abstract states S100a4 (fibroblast-specific protein 1) is "expressed by mesenchymal cells and has been implicated in the development of eccrine glands, hair follicles, and mammary branching morphogenesis". However, the study on eccrine glands (ref. 19) shows that S100A4+ cells play a role in eccrine gland development but it does not address the role of S100a4 itself, while the study on hair follicles (ref.20) in turn reports the expression pattern of S100a4 in hair follicles but does not address its function, nor the role of S100a4+ cells. Finally, I failed to find references in the paper to studies addressing the role of S100a4, or S100a4+ cells in the mammary gland.

      Instead, the paper had references to studies where S100A4-Cre had been used to delete different genes and these mice had various mammary phenotypes - which, as indicated above, is a very different approach compared to deleting S100a4 or ablating S100a4+ cells.

      Thank you for your comment. We addressed the concern in the Abstract and further in the Discussion. We revisited the cited studies and now present them more carefully, clearly distinguishing the different approaches and particular findings.

      In our literature review, we also considered studies that used the S100a4-Cre mouse model to manipulate gene expression within S100A4+ cells. We believe that these studies provide indirect evidence of S100A4+ cell involvement in the development and/or homeostasis of a tissue, such as the mammary gland. Please find the rephrased part of the Abstract in the text, and below:

      “S100A4 (S100 calcium binding protein A4, also known as fibroblast-specific protein 1) is expressed by mesenchymal cells and has been associated with hair follicle regeneration. S100A4-expressing cells have been implicated in the development of eccrine glands, and studies using S100a4-Cre to manipulate gene function have suggested that S100A4-expressing cells may contribute to mammary branching morphogenesis.”

      In Discussion (lines 197-200), the authors write: "We described significant delay in mammary branching morphogenesis in puberty, confirming an important role for S100A4+ cells in mammary development, as it was previously described (refs 37-39)."

      It should be noted that none of these studies addressed the role of S100A4+ cells:

      • Ref 37 used S100a4-Cre to delete sharpin

      • Ref 38 used the same Cre line to delete Ptch1 and did not address the role of S100a4 or S100a4-expressing cells

      • Likewise ref 39 deleted another gene using S100a4-Cre

      Later on in Discussion, the authors compare the reported phenotype to previous studies (lines 248-255): "...targeting S100A4+ cells through knockout experiments can result in severe phenotypes, such as a reduction in adipose tissue (ref 26), skin phenotypes, a disrupted estrous cycle, reduced fertility (ref. 38), and complete infertility, hypogonadism and defects in pituitary endocrine function (ref. 28)."

      Of these, Ref. 26 used the same approach as the current study (S100a4-Cre; DTA) (Fig. 7A in the paper)

      • these mice were significantly lean, with markedly reduced fat compared with the control mice - also the mice in the current study are very small, so perhaps they could also be described as 'lean'. Yet ref. 26 reports that female mice had comparable food uptake, respiratory exchange ratio and physical activity, and slightly increased energy expenditure

      Ref. 38 (as mentioned above) reports deletion of Ptch1 using S100a4-Cre lines and these mice "displayed a disrupted estrous cycle and dramatically reduced fertility over 6.5 weeks". However, this has nothing to do with the approaches where Fsp1/S100a4+ cells are depleted with DTA. Likewise, reference 28 analyzed the phenotype of S100a4-Cre;Ptch1fl/fl mice. Obviously, deleting Ptch1 using S100a4-Cre mice is quite a different approach than "targeting S100A4+ cells through knockout experiments". Ptch1 deletion leads to a combination of gain-of-function (of Hedgehog activation) and loss-of-function (loss of Hh-independent functions of Ptch1) and hence comparisons with these phenotypes are rather challenging. I suggest the authors focus their phenotype comparisons on ref. 26, where S100a4/Fsp1+ cells were ablated with DTA, i.e. the same approach as in the current study.

      Please find the rephrased part of the Discussion in the text (lines 236-256), and below:

      “A key consideration when interpreting studies involving S100A4 is that fundamentally different experimental approaches have been used to investigate its role. These include descriptive analyses of S100A4 expression, functional studies targeting the S100A4 protein itself, genetic models using S100a4-Cre to manipulate unrelated genes in S100A4-expressing cells, and ablation models such as S100a4-Cre;DTA, which deplete S100A4⁺ cells. These approaches are not equivalent and provide distinct types of information. In the present study, we specifically assess the consequences of ablating S100A4-expressing cells, and comparisons to other studies should therefore be interpreted within this context.

      Studies using S100a4-Cre to manipulate specific signaling pathways (e.g. Wnt or Hedgehog signaling via gene deletion) in S100A4-expressing cells have reported diverse phenotypes, including effects on fertility and endocrine function28,34. However, these phenotypes primarily reflect the consequences of pathway perturbations within S100A4-expressing cells rather than the role of S100A4⁺ cells themselves. This is fundamentally different from the ablation approach used here, which removes the S100A4⁺ cell population.

      In contrast, studies employing S100a4-Cre–driven DTA–mediated ablation represent a directly comparable approach. Such studies have reported systemic phenotypes, including reduced adipose tissue and altered metabolic parameters26, indicating that S100A4-expressing cells contribute to multiple aspects of tissue homeostasis. Consistent with these previous reports, S100a4-Cre;DTA mice used in our study were significantly smaller than their littermates. Our findings extend these observations by identifying a specific and previously unrecognized role for this cell population in nipple morphogenesis.”

      I find the Discussion is somewhat off the topic by starting with WHO recommendations on breastfeeding and linking this to observed mouse phenotype. Overall, the discussion is rather long and from time-to-time more like a literature review. I would recommend keeping the Discussion more succinct and focused.

      To improve the conciseness and focus of Discussion, we have deleted this part of text.

      **Referee cross-commenting**

      I agree with the comments of other reviewers. However, to me it seems that the analysis of S100a4 knockout mice would not be feasible within a reasonable timeframe and would represent a study of its own. My understanding was that the authors were not interested in S100a4 itself. Rather, S100a4-Cre was used as a tool to understand the importance of a certain (fibroblast) cell population for mammary gland morphogenesis.

      Indeed, our goal was to study the role of a specific cell population (S100A4+ cells) in mammary gland morphogenesis, not to study the role of S100A4 protein per se.

      Reviewer #1 (Significance (Required)): General assessment:

      This study reveals the importance of the S100a4+ cell lineage for nipple formation while showing the same cells are dispensable for mammary gland morphogenesis. The main limitation is that it remains unclear whether the observed nipple phenotype is derived from an early embryonic/prepubertal defect in establishing the nipple stroma, is caused by a constitutive response to ongoing cell death, or a response to continuous DTA expression (or a combination of some of these). Hence its relevance as a model of human inverted nipple condition remains rather speculative.

      Thank you for your consideration of our work and your valuable feedback. We did not intend to claim that the S100a4-Cre;DTA mouse represents a model of the human inverted nipple condition. However, considering its morphological features, it might resemble it. We have now rephrased the Discussion so that it is clearer and more concise.

      Reviewer #2

      Major comments:

      1. My key concern is the discussion part. I think the authors need to re-organize/re-phrase the discussion part; it confused me a bit in terms of logic, phrasing and interpretation of the literature.

      We have significantly re-organized and re-phrased the Discussion.

      Here are few examples:

      1. Lines 195-199 contain a lot of repeated information

      We have rephrased the paragraph and removed repeated information. The new text can be found in lines 201-206 in the revised manuscript.

      1. Referring to the studies in refs 26, 28 and 38, the authors state that "targeting S100A4+ cells through knockout experiments can result in severe phenotypes". This is very misleading. Those studies used the same (or a similar, if the origin is different) S100A4-Cre line as the current study but induced the activation of Wnt and sHH signalling pathways, respectively. The observed phenotypes are largely due to the pathway function, rather than to the S100A4 gene or normal S100A4+ cells themselves. This differs significantly from the current study.

      We apologize for the confusion; we have now rephrased our claims (lines 236-256):

      “A key consideration when interpreting studies involving S100A4 is that fundamentally different experimental approaches have been used to investigate its role. These include descriptive analyses of S100A4 expression, functional studies targeting the S100A4 protein itself, genetic models using S100a4-Cre to manipulate unrelated genes in S100A4-expressing cells, and ablation models such as S100a4-Cre;DTA, which deplete S100A4⁺ cells. These approaches are not equivalent and provide distinct types of information. In the present study, we specifically assess the consequences of ablating S100A4-expressing cells, and comparisons to other studies should therefore be interpreted within this context.

      Studies using S100a4-Cre to manipulate specific signaling pathways (e.g. Wnt or Hedgehog signaling via gene deletion) in S100A4-expressing cells have reported diverse phenotypes, including effects on fertility and endocrine function28,34. However, these phenotypes primarily reflect the consequences of pathway perturbations within S100A4-expressing cells rather than the role of S100A4⁺ cells themselves. This is fundamentally different from the ablation approach used here, which removes the S100A4⁺ cell population.

      In contrast, studies employing S100a4-Cre–driven DTA–mediated ablation represent a directly comparable approach. Such studies have reported systemic phenotypes, including reduced adipose tissue and altered metabolic parameters26, indicating that S100A4-expressing cells contribute to multiple aspects of tissue homeostasis. Consistent with these previous reports, S100a4-Cre;DTA mice used in our study were significantly smaller than their littermates. Our findings extend these observations by identifying a specific and previously unrecognized role for this cell population in nipple morphogenesis.”

      1. In lines 253-255, why do the authors believe that complete S100A4+ cell depletion would be fatal to the mouse? Is there a study suggesting that? Or have the authors checked the expression of S100A4 in the S100A4-Cre;DTA model to confirm the efficiency?

      We have now included, also in response to other Reviewers’ comments, data on S100A4 expression in the S100A4-Cre;DTA model (Figure for Reviewers 5, corresponding to Figure S7 in the revised manuscript), and commented on these results in lines 257-262: “Notably, we observed incomplete depletion of S100A4+ cells in the mammary gland and nipple. Interestingly, a study using the same S100a4-Cre;DTA mouse model reported complete S100A4+ cell depletion in the superficial layer of mandibular condyle48. This suggests that incomplete depletion of S100A4+ cells in nipple and mammary gland is due to tissue-specific dynamics, rather than lack of depletion efficiency, indicating a compensatory mechanism that can balance the cell loss.”

      In Fig. 1, the authors described the impaired nursing capacity of S100A4-Cre;DTA dams. However, it seems the litter size is also smaller (Fig 1a). Do the authors have any explanation or hypothesis?

      Thank you for this insightful observation. It is well established that metabolic and nutritional condition directly affect female reproductive functions. Adult S100A4-Cre;DTA mice are generally smaller compared to their litter counterparts, potentially because of lower body fat content or other anatomic/metabolic condition that might negatively influence fecundity, for instance, lowering ovulation rate and/or embryonic survival. In support of this, earlier studies have reported a positive correlation between growth rate/body condition and litter size (Eisen & Durrant, 1980). Unfortunately, in the case of S100A4-Cre;DTA mice, we can only speculate about the possible explanations, as we do not have supporting data which could confirm it.

      In lines 181-184, the authors state "the results showed that the tissue reacted to a foreign chemical or an endogenous compound.....". Which results are being referred to here? I could not find any inflammation-related GO terms in Figure 6b. It would be more accurate to specify them in lines 179-181, which appear to be a technical statement rather than a result in their current form.

      Thank you for this comment. Indeed, there are no GO terms explicitly labeled as “inflammation” and “repair”; however, several GO terms are functionally related to these processes. Our interpretation was based on the broader biological context rather than on explicit annotation. To clarify this, we revisited the text and included GO terms that reflect the tissue response (lines 187-193).

      “The GO terms indicated that the tissue reacted to a foreign chemical or an endogenous compound (xenobiotic metabolic process, cellular response to xenobiotic stimulus, response to xenobiotic stimulus, epoxygenase P450 pathway), and responded to inflammation and repair (actin filament-based process, actin cytoskeleton organization; eicosanoid and lipid metabolic processes) (Figure 6b).”

      Lines 182-184 were not clear. Do the authors refer to the "nipple tissue response" in general as a malfunction of development, or as inflammation and tissue repair as mentioned in the previous sentence? If the latter, the authors should consider that the failure of lactation might mimic involution, which may cause apoptosis and inflammation as well. This might be independent of DTA expression.

      Thank you for raising this point. Indeed, in this line, we refer to ongoing tissue inflammation and repair. We also considered the hypothesis that the ejection incapability (and consecutive milk stasis) triggers involution. However, tissues were collected within a few hours after parturition, when only very early signs of involution, if any, would be detectable; therefore, we expect minimal influence of involution. To reflect this comment, we added new text to the Discussion (lines 272–277): “The observed tissue response can be also associated with hallmarks of mammary involution, the process which is triggered by the milk stasis. However, the tissues were collected within few hours after parturition, when the effect of involution should be minimal53. Rather, we hypothesize that immune cell recruitment, and the upregulation of the lipid skin barrier might be caused in response to the continuous apoptosis of S100A4+ cells and their replacement.”

      Minor comments:

      1. The authors demonstrated in Figure S1 and lines 92-96 that no significant differences were observed in pituitary glands and ovaries in S100a4-Cre;DTA and DTA mice. Have the authors checked S100A4 expression or lineage cells in these organs, or has this been reported by others?

      Yes, we checked the S100A4-lineage cells in the pituitary gland and ovary and have now included the results here (Figure for Reviewers 4a,b, corresponding to Figure S1a,b in the revised manuscript), along with the relevant text description (lines 94-95 in the revised manuscript): “We observed S100A4-lineage traced cells in pituitary gland and ovaries using S100a4-Cre;mT/mG model (Figure S1a,b).” The presence of S100A4+ cells in these organs was also reported previously (Ren et al., 2019).

      Figure for Reviewers 4 (Figure S1 in the revised manuscript): S100A4-lineage cells are abundant in the pituitary gland and ovary. (a) Representative images of a cleared whole-mount pituitary gland from a S100a4-Cre;mT/mG mouse. (b) Representative images of a cleared whole-mount ovary from a S100a4-Cre;mT/mG mouse. Scale bar = 100 µm.

      The authors have performed live imaging to evaluate the contraction of alveoli. It would be better to include a video together with the snapshots shown in Figure S2.

      We have included the videos as supplementary movies, Movie S1 (DTA) and Movie S2 (S100a4-Cre;DTA).

      Since the study is mainly using S100a4, it would be better to avoid using FSP1 in the results, for example Fig 5h.

      We apologize for this oversight; it has now been corrected.

      What does L1 stand for? Lactation Day 1? It should be spelt out in the first instance.

      Yes, indeed, L1 is lactation day 1. Please note that it was already spelled out in the first version of the manuscript, now in line 48.

      Line 150. Figure S4 should be Figure S4a.

      (Please note that, because new Supplementary figures were added, this comment now refers to Figure S6 in the revised manuscript.) Thank you for this comment. In the text, we state “GFP+ cells were spread throughout the fat pad but were also localized in the periepithelial stroma and infiltrated the epithelium”. We show this in both Figure S6a and Figure S6b; therefore, we have changed the reference to cite both panels.

      **Referee cross-commenting**

      I agree with the other reviewers, as well as the Consultation Comments. The manuscript would benefit greatly from a thoroughly optimised Discussion section to address issues raised by all reviewers.

      Reviewer #2 (Significance (Required)):

      • Overall, this study is well designed and the key findings are valid, especially the role of S100A4 during nipple development is novel and interesting.

      • One limitation of the study is that RNA-seq was performed using a mixture of all cell types present in the nipple. While this approach is reasonable, given that depletion of the S100A4+ lineage may exert both direct and indirect effects contributing to nipple dysfunction, it should be more clearly acknowledged and discussed in the manuscript. Additionally, this experimental design may limit the utility of the dataset for other researchers interested in nipple development and the specific functions of S100A4.

      Reviewer #3

      Major comments:

      2) The differential systemic versus mammary-specific effects of DTA-mediated S100A4 cell ablation are intriguing. The authors should address why the mammary fat pad appears unaffected.

      Thank you for this comment. The role of S100A4+ cells in adipose tissue was previously reported (Zhang et al., 2018). The authors reported significantly smaller adipose tissue in S100a4-Cre;DTA mice (males and females), measured as the weight of the dissected fat pad. In our work, we measured the in situ area of the fat pad, which appeared to be unaffected. It is possible that the volume (weight) of the fat pad differs; however, we do not have data to confirm or reject this hypothesis.

      Are S100A4 expressing cells present during embryonic mammary development, or are they mainly postnatal? Would an inducible S100A4CreERT model lead to similar phenotypes, or might the timing of depletion influence the outcome? Discussing these points would reinforce the conclusions regarding the contribution of S100A4-expressing cells to mammary and nipple development and could also clarify the transient nature of the ductal branching phenotype.

      S100A4-expressing cells are also present during embryonic mammary development. Please refer to the embryonic lineage-tracing time-points incorporated in the first version of the manuscript (Figure 5a and Figure S6a). We have now added Figure for Reviewers 1 (corresponding to Figure S3 in the revised manuscript), which focuses on the embryonic nipple phenotype but also provides information on the presence of S100A4+ cells.

      We agree that the use of inducible S100a4-CreERT model could potentially bring new insights toward developmental stage-specific roles of S100A4+ cells, and thus would be interesting to use in a follow-up study. Currently, such experiments are beyond our capacity.

      Therefore, we have included a new subsection on Limitations of the study, where we comment:

      “A major limitation of this study is that the timing of DTA-mediated cell depletion cannot be precisely defined in the constitutive mouse model employing S100a4-Cre because recombination may occur continuously following the initial expression of S100a4 (E8.5; ref. 18). This limitation could be overcome by usage of inducible S100a4-CreERT instead. With this approach, it could be more feasible to determine if the nipple deformity arises as a defect of embryonic development or postnatal morphogenesis.”

      3) Although the authors attribute lactation failure primarily to defects in nipple architecture, the RNA seq data reveal downregulation of key milk production genes and luminal differentiation keratins, strongly suggesting impaired secretory activation. The authors should more explicitly discuss the relative contributions of epithelial functional maturation defects versus nipple structural abnormalities to the lactation failure observed upon S100A4+ cell depletion.

      Thank you for this comment. We believe that performing an immunofluorescence labeling of epithelial architecture (requested in the Minor comment 2) could bring more light into this. However, we deduce that secretory activation is not impaired, as the presence of the milk observed on in situ wholemounts, and H&E-stained alveoli (Figure 3d) implies luminal secretion of milk components. The observed phenotype of the lactating mammary gland strongly suggests there is a structural abnormality inhibiting the milk ejection.

      The downregulation of key milk production genes and luminal keratins in the bulk RNA-seq data may be influenced by differences in tissue composition between samples. In control mice, more fully developed nipples and an extended ductal network likely contribute to a greater representation of differentiated luminal epithelial cells, thereby increasing the expression of these markers.

      Minor comments:

      1. Figure 1: Including an immunohistochemistry or immunofluorescence control confirming depletion of S100A4 expressing cells would strengthen the conclusions.

      We have now included Figure for Reviewers 5 that corresponds to Figure S7 in the revised manuscript and comment on the results in sections Results (lines 169-171) and Discussion (lines 257-262).

      In Results: “Interestingly, S100A4 antibody labeling revealed presence of S100A4+ cells in S100a4-Cre;DTA tissues (Figure S3b, Figure S7a,b).”

      In Discussion: “Notably, we observed incomplete depletion of S100A4+ cells in the mammary gland and nipple. Interestingly, a study using the same S100a4-Cre;DTA mouse model reported complete S100A4+ cell depletion in the superficial layer of mandibular condyle (ref. 48). This suggests that incomplete depletion of S100A4+ cells in nipple and mammary gland is due to tissue-specific dynamics, rather than lack of depletion efficiency, indicating a compensatory mechanism that can balance the cell loss.”

      Figure for Reviewers 5 (Figure S7 in the revised manuscript): S100A4+ cells are found in S100a4-Cre;DTA nipple and mammary tissues. (a) Immunofluorescent labeling for S100A4 and vimentin on FFPE sections of DTA and S100a4-Cre;DTA L1 nipples. (b) Immunofluorescent labeling for S100A4 and smooth muscle actin on FFPE sections of DTA and S100a4-Cre;DTA L1 mammary gland. Scale bar = 100 µm.

      Figure 3c: The histological defects more accurately reflect failure of secretory activation rather than "lactation failure" per se. The terminology should be refined to reflect this more precisely.

      Thank you for this comment. As explained in the response to your major comment 3, we believe our results show that secretory activation is preserved in S100a4-Cre;DTA lactating mice. We understand that “lactation failure” might be misleading terminology, as milk production is preserved as well. We have therefore changed the phrasing to “nursing defect” (lines 51, 73, 83), as this reflects the phenotype more precisely.

      **Referee cross-commenting**

      I agree with the Reviewer, the authors do not need to do knockout experiments in the revised manuscript. However, it would be great if they could address my comment in the discussion.

      Reviewer #3 (Significance (Required)):

      This is an important study for mammary developmental biology, addressing the relatively understudied mechanisms that govern nipple development at the stromal-epithelial interface, and the determinants of lactational performance. A major strength is the elegant integration of DTA-mediated cell ablation, advanced imaging, lineage tracing, and transcriptomics to uncover previously uncharacterised roles for S100A4-expressing stromal populations in shaping nipple morphology and function. The work lays a foundation for future studies into nipple biology and pathologies and mechanisms underlying successful lactation.

      Although the study is already mature, it could be further strengthened by incorporating more specific genetic models, such as inducible S100A4CreERT or S100A4 gene knockout/knockdown approaches.

      Thank you for your appreciation of our work.

      4. Description of analyses that authors prefer not to carry out

      Reviewer #1

      Major Comment 1.

      It is rather difficult to conclude whether the observed nipple phenotype reflects an early embryonic/prepubertal defect in establishing the nipple stroma, is caused by a constitutive response to ongoing cell death, or a response to continuous DTA expression (or a combination of some of these). The data raise a couple of additional questions: Is there a nipple phenotype at 3 wk of age?...

      Unfortunately, we cannot provide data on 3-week-old mice because we did not collect such samples and we had to terminate our mouse colony due to an infection in the animal house (reanimation of the mouse line is possible because we stored its sperm, but it would take a lot of time and resources). Nevertheless, we tried to address this comment by providing other relevant available data (see Figure for Reviewers 1).

      Reviewer #2

      Major Comment 3.

      In Fig S1c, d and lines 93-96, the authors investigated the estrus cycles to determine the potential cause of lactation failure. The data was presented as the number of mice in each stage. A more intuitive approach would be to follow the same mice for two to three cycles and observe the duration of each stage.

      We agree that the suggested approach would be more accurate in determining truly cycling females. Unfortunately, we cannot perform this experiment currently because we do not have these mice alive anymore. Nevertheless, because the S100a4-Cre;DTA females bore pups, they had cycled and were fertile.

      Reviewer #3

      Major comment 1.

      While the S100A4Cre::DTA model is powerful for evaluating the roles of S100A4 expressing cells, the authors should discuss the potential outcomes of using S100A4 knockout or knockdown approaches. If the authors have such data available, this could help distinguish phenotypes caused by loss of S100A4 function itself from those arising due to ablation of S100A4 expressing cell populations and would add mechanistic depth to the study.

      We thank the Reviewer for this insightful suggestion. We agree that genetic approaches targeting S100A4 function (e.g., knockout or knockdown) could, in principle, help disentangle cell-autonomous effects of S100A4 from those resulting from the loss of S100A4-expressing cell populations. However, we would like to clarify that the primary objective of our study is to investigate the functional contribution of S100A4⁺ stromal cells at the population level, rather than to dissect the molecular function of S100A4 protein per se. In this context, the S100A4-Cre;DTA model provides a well-established and appropriate strategy to ablate this cell population and assess its role in tissue development. Importantly, S100A4 is not only a functional protein but also a widely used marker of a heterogeneous stromal cell population. Genetic ablation of S100A4 itself would not eliminate these cells, and may result in relatively subtle or compensable phenotypes due to functional redundancy within the S100 protein family or context-dependent roles of S100A4. Therefore, such approaches would address a distinct biological question and may not directly recapitulate the phenotypes observed upon cell ablation.

      References

      Eisen, E. J., & Durrant, B. S. (1980). Genetic and Maternal Environmental Factors Influencing Litter Size and Reproductive Efficiency in Mice. Journal of Animal Science, 50(3), 428–441. https://doi.org/10.2527/jas1980.503428x

      Ren, Y. A., Monkkonen, T., Lewis, M. T., Bernard, D. J., Christian, H. C., Jorgez, C. J., Moore, J. A., Landua, J. D., Chin, H. M., Chen, W., Singh, S., Kim, I. S., Zhang, X. H. F., Xia, Y., Phillips, K. J., MacKay, H., Waterland, R. A., Cecilia Ljungberg, M., Saha, P. K., … Richards, J. A. S. (2019). S100a4-Cre–mediated deletion of Ptch1 causes hypogonadotropic hypogonadism: Role of pituitary hematopoietic cells in endocrine regulation. JCI Insight, 4(14). https://doi.org/10.1172/jci.insight.126325

      Tuwatnawanit, T., Wessman, W., Belisova, D., Sumbalova Koledova, Z., Tucker, A. S., & Anthwal, N. (2025). FSP1/S100A4-Expressing Stem/Progenitor Cells Are Essential for Temporomandibular Joint Growth and Homeostasis. Journal of Dental Research, 104(5), 551–560. https://doi.org/10.1177/00220345251313795

      Zhang, R., Gao, Y., Zhao, X., Gao, M., Wu, Y., Han, Y., Qiao, Y., Luo, Z., Yang, L., Chen, J., & Ge, G. (2018). FSP1-positive fibroblasts are adipogenic niche and regulate adipose homeostasis. PLoS Biology, 16(8). https://doi.org/10.1371/journal.pbio.2001493

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this pre-print, Belisova et al. investigate the under-explored mechanisms regulating nipple development and its essential role in offspring nourishment, focusing on the contribution of S100A4‑expressing cells in the mouse mammary gland. The authors use an elegant combination of Cre::DTA-mediated cell depletion, lineage tracing, imaging, RNA-seq, and functional assays to reveal roles for S100A4‑expressing fibroblasts and immune cells in nipple morphogenesis and lactation. The manuscript is generally well written, and the experimental design is strong, with appropriate controls supporting the overall conclusions. However, I have several comments and suggestions to improve this initial manuscript.

      Major comments:

      1) While the S100A4Cre::DTA model is powerful for evaluating the roles of S100A4 expressing cells, the authors should discuss the potential outcomes of using S100A4 knockout or knockdown approaches. If the authors have such data available, this could help distinguish phenotypes caused by loss of S100A4 function itself from those arising due to ablation of S100A4 expressing cell populations and would add mechanistic depth to the study.

      2) The differential systemic versus mammary-specific effects of DTA-mediated S100A4 cell ablation are intriguing. The authors should address why the mammary fat pad appears unaffected. Are S100A4 expressing cells present during embryonic mammary development, or are they mainly postnatal? Would an inducible S100A4CreERT model lead to similar phenotypes, or might the timing of depletion influence the outcome? Discussing these points would reinforce the conclusions regarding the contribution of S100A4-expressing cells to mammary and nipple development and could also clarify the transient nature of the ductal branching phenotype.

      3) Although the authors attribute lactation failure primarily to defects in nipple architecture, the RNA seq data reveal downregulation of key milk production genes and luminal differentiation keratins, strongly suggesting impaired secretory activation. The authors should more explicitly discuss the relative contributions of epithelial functional maturation defects versus nipple structural abnormalities to the lactation failure observed upon S100A4+ cell depletion.

      Minor comments:

      1. Figure 1: Including an immunohistochemistry or immunofluorescence control confirming depletion of S100A4 expressing cells would strengthen the conclusions.

      2. Figure 2c: The H&E images are not fully convincing. Immunofluorescence analysis of epithelial architecture would support the authors' interpretation and should be feasible if tissues are already available.

      3. Figure 3c: The histological defects more accurately reflect failure of secretory activation rather than "lactation failure" per se. The terminology should be refined to reflect this more precisely.

      4. Figure 4f: The proliferation data are compelling, but the authors could extend this by examining how cell differentiation and epithelial organisation are affected.

      5. Figure 5b: To more convincingly show that GFP+ cells contact endothelial cells, co-labelling with an endothelial marker such as CD31 would be helpful.

      6. Figure 5f-h: The structures referenced in the text (lines 159-163) should be clearly indicated on the immunofluorescence images.

      Referee cross-commenting

      I agree with the Reviewer, the authors do not need to do knockout experiments in the revised manuscript. However, it would be great if they could address my comment in the discussion.

      Significance

      This is an important study for mammary developmental biology, addressing the relatively understudied mechanisms that govern nipple development at the stromal-epithelial interface, and the determinants of lactational performance. A major strength is the elegant integration of DTA-mediated cell ablation, advanced imaging, lineage tracing, and transcriptomics to uncover previously uncharacterised roles for S100A4-expressing stromal populations in shaping nipple morphology and function. The work lays a foundation for future studies into nipple biology and pathologies and mechanisms underlying successful lactation.

      Although the study is already mature, it could be further strengthened by incorporating more specific genetic models, such as inducible S100A4CreERT or S100A4 gene knockout/knockdown approaches.

      I have expertise in mammary epithelial biology.

      I estimate that revisions would require 3-6 months if new experiments are performed, and 1-3 months if revisions focus on clarifying claims and strengthening the discussion.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, Jaros Belisova et al. systematically investigated the composition and roles of S100A4+ cells during mammary gland development and identified a novel role for S100A4 in nipple development. Depletion of the S100A4+ lineage using the S100a4-Cre;DTA model resulted in delayed pubertal mammary gland development but with normal morphology and milk production capacity during lactation. The authors further investigated the milk-ejection function of the alveoli using an ex vivo culture method combined with live imaging. This revealed that depletion of S100A4+ cells does not interfere with the normal function of alveoli. However, the abnormal development of the nipple, characterised by smaller size, shorter length, lack of protrusion, increased collagen composition and decreased cell proliferation at the onset of lactation, results in milk delivery failure, which is responsible for the lethality of the pups. To further understand the consequences of S100A4+ cell depletion, the authors utilised the S100a4-Cre;mTmG model to trace the cell types depleted in the DTA model across various developmental stages. Immunofluorescent staining revealed that S100A4 lineage cells comprised both fibroblasts and immune cells, consistent with previous studies. Interestingly, some S100A4 lineage (GFP+) cells retain the expression of S100A4. In addition, the RNA-seq data comparing the nipple cells from S100a4-Cre;DTA and DTA lactating mice confirmed their observations at the transcriptional level. Overall, the experiments are well designed and the key findings are valid; in particular, the role of S100A4 in nipple development is novel and interesting.

      Major comments:

      1. My key concern is the Discussion. I think the authors need to reorganize and rephrase it, as I found it somewhat confusing in terms of logic, phrasing and interpretation of the literature. Here are a few examples:

      a. Lines 195-199 contain a lot of repeated information.

      b. The authors cite the studies in refs 26, 28 and 38 to state that "targeting S100A4+ cells through knockout experiments can result in severe phenotypes". This is very misleading. Those studies used the same (or a similar, if the origin is different) S100A4-Cre line as the current study but induced activation of the Wnt and SHH signalling pathways, respectively. The observed phenotypes are largely due to the pathway function, rather than to the S100A4 gene or normal S100A4+ cells themselves. This differs significantly from the current study.

      c. In lines 253-255, why do the authors believe that complete S100A4+ cell depletion would be fatal to the mouse? Is there a study suggesting that? Or have the authors checked the expression of S100A4 in the S100A4-Cre;DTA model to confirm the depletion efficiency?

      d. The authors tried to attribute the minor phenotype to the incomplete depletion of S100A4+ cells. However, it is possible that, if the S100A4+ cells represent only a minor population, their function may be compensated by other populations. This might be confirmed by quantification of S100A4+ cells or S100A4-Cre; GFP+ cells in the fibroblast or CD45 populations from the images shown in Figure 5.

      2. In Fig. 1, the authors described the impaired nursing capacity of S100A4-Cre;DTA dams. However, it seems the litter size is also smaller (Fig 1a). Do the authors have any explanation or hypothesis?

      3. In Fig S1c, d and lines 93-96, the authors investigated the estrus cycles to determine the potential cause of lactation failure. The data were presented as the number of mice in each stage. A more intuitive approach would be to follow the same mice for two to three cycles and observe the duration of each stage.

      4. It is difficult to confirm colocalization in the images in Figure 5 and Figure S4. A higher magnification image would be required for each panel. Furthermore, a precise quantification based on the current images would better support the conclusion regarding the discrepancy in the composition of the S100A4 lineage between epidermis and mammary gland (lines 163-165).

      5. In line 163, the authors hypothesize that the cells are Langerhans cells based on morphology. This could be confirmed by co-staining with F4/80 in addition to the current form of Fig 5h.

      6. In lines 181-184, the authors state "the results showed that the tissue reacted to a foreign chemical or an endogenous compound....."; which results are referred to here? I could not find any inflammation-related GO terms in figure 6b. It would be more accurate to specify them in lines 179-181, which appear to be a technical statement rather than a result in the current form.

      7. Lines 182-184 are not clear. Do the authors refer to the "nipple tissue response" in general as a malfunction of development, or as inflammation and tissue repair as mentioned in the previous sentence? If the latter, the authors should consider that the failure of lactation might mimic involution, which may cause apoptosis and inflammation as well. This might be independent of the DTA expression.

      Minor comments:

      1. The authors demonstrated in Figure S1 and lines 92-96 that no significant differences were observed in pituitary glands and ovaries in S100a4-Cre:DTA and DTA mice. Have the authors checked the S100A4 expression or lineage cells in these organs, or have been reported by others?
      2. The authors have performed live imaging to evaluate the contraction of alveoli. It would be better to include a video together with the snapshots showed in Figure S2.
      3. Since the study is mainly using S100a4, it would be better to avoid using FSP1 in the results, for example Fig 5h.
      4. What does L1 stand for? Lactation Day 1? It should be spelt out in the first instance.
      5. Line 150. Figure S4 should be Figure S4a.

      Referee cross-commenting

      I agree with the other reviewers, as well as the Consultation Comments. The manuscript would benefit greatly from a thoroughly optimised Discussion section to address issues raised by all reviewers.

      Significance

      • Overall, this study is well designed and the key findings are valid, especially the role of S100A4 during nipple development is novel and interesting.
      • One limitation of the study is that RNA-seq was performed using a mixture of all cell types present in the nipple. While this approach is reasonable, given that depletion of the S100A4+ lineage may exert both direct and indirect effects contributing to nipple dysfunction, it should be more clearly acknowledged and discussed in the manuscript. Additionally, this experimental design may limit the utility of the dataset for other researchers interested in nipple development and the specific functions of S100A4.

      My expertise:

      mammary gland development and breast cancer

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      In this study, Belisova et al. investigate the function of S100a4+ (a.k.a. Fsp1) cells in the mammary gland. S100a4 expressing cells were constitutively ablated using the DTA system by crossing S100a4-Cre mice with ROSA26-eGFP-DTA mice. Female mice exhibited a severe nursing defect, leading to whole-litter mortality within 1-2 days postpartum. However, no abnormalities were detected in the morphology of the mammary ductal tree, milk production, or alveolar contractility of S100a4-Cre;DTA mice. Instead, nipples were malformed, likely preventing normal suckling. Analysis of the lineage of S100a4 expressing cells in the mammary gland using the S100a4-Cre mouse in combination with a fluorescent Cre reporter identified S100a4+ cells as fibroblasts and immune cells in the nipple region, while only immune cells were labelled in the mammary gland stroma, findings that agree with previous studies.

      Major comments:

      1. It is rather difficult to conclude whether the observed nipple phenotype reflects an early embryonic/prepubertal defect in establishing the nipple stroma, is caused by a constitutive response to ongoing cell death, or a response to continuous DTA expression (or a combination of some of these). The data raise a couple of additional questions: Is there a nipple phenotype at 3 wk of age? It would not be totally unsurprising if ablation of a major fraction of dermal fibroblasts in the nipple area would lead to an early embryonic/prepubertal phenotype but there is no data on this. Hence, is there a "congenital" nipple deformity, as concluded by the authors (line 191)? Are there S100a4+ cells in the nipple area of pubertal S100a4-Cre/DTA mice? I.e. is there a continuous supply of new S100a4+ cells and thereby continuous cell death and DTA expression as one might expect based on the RNA-seq data?
      2. The subtitle on line 54 implies that S100a4-Cre/DTA mice display a branching phenotype. However, it looks to me as if there is a pubertal outgrowth defect (as is also written in the body text, line 64) rather than a branching phenotype, potentially reflecting the much smaller size of S100a4-Cre/DTA mice (Fig. 2a). Unless there is a change in branch point frequency, I suggest rephrasing the title and discussion. Instead, I suggest the authors discuss the observed outgrowth delay considering the gross overall growth defect (Fig. 2a). If ductal outgrowth was normalized to the overall growth defect, would one still observe 'a delay in branching morphogenesis'?
      3. Fig. 4e shows Masson's Trichrome and Picrosirius Red staining and the authors report the findings as follows (lines 120-124): "collagen fibers were loosened in the DTA nipples and more densely packed in the S100a4-Cre;DTA nipples". Perhaps the authors could help non-specialists to observe the loosened fibers and if they wish to make quantitative statements ("more densely packed"), such statements should be backed-up by quantifications.
      4. I found the Discussion on the various mouse models somewhat problematic. Overall, the paper is written in a way that it often remains unclear whether it refers to studies addressing the role of S100a4 itself, studies addressing the function of S100a4+ cells via ablation approaches (S100a4-Cre or S100a4-CreERT2 crossed with floxed DTA), or those where S100a4-Cre has been used to delete gene X/Y/Z. These are all very different experimental approaches where one approach is not necessarily informative when trying to understand the results from another one. The authors should make these points clear and consider whether all their discussion points are relevant. The abstract states S100a4 (fibroblast-specific protein 1) is "expressed by mesenchymal cells and has been implicated in the development of eccrine glands, hair follicles, and mammary branching morphogenesis". However, the study on eccrine glands (ref. 19) shows that S100A4+ cells play a role in eccrine gland development but it does not address the role of S100a4 itself, while the study on hair follicles (ref. 20) in turn reports the expression pattern of S100a4 in hair follicles but does not address its function, nor the role of S100a4+ cells. Finally, I failed to find references in the paper to studies addressing the role of S100a4, or S100a4+ cells in the mammary gland. Instead, the paper had references to studies where S100A4-Cre had been used to delete different genes and these mice had various mammary phenotypes, which, as indicated above, is a very different approach compared to deleting S100a4 or ablating S100a4+ cells.

      In Discussion (lines 197-200), the authors write: "We described significant delay in mammary branching morphogenesis in puberty, confirming an important role for S100A4+ cells in mammary development, as it was previously described (refs 37-39)." It should be noted that none of these studies addressed the role of S100A4+ cells:

      • Ref 37 used S100a4-Cre to delete sharpin
      • Ref 38 used the same Cre line to delete Ptch1, did not address the role of S100a4 or S100a4 expressing cells
      • Likewise ref 39 deleted another gene using S100a4-Cre

      Later on in Discussion, the authors compare the reported phenotype to previous studies (lines 248-255): "...targeting S100A4+ cells through knockout experiments can result in severe phenotypes, such as a reduction in adipose tissue (ref 26), skin phenotypes, a disrupted estrous cycle, reduced fertility (ref. 38), and complete infertility, hypogonadism and defects in pituitary endocrine function (ref. 28)." Of these, Ref. 26 used the same approach as the current study (S100a4-Cre; DTA) (Fig. 7A in the paper); these mice were significantly lean, with markedly reduced fat compared with the control mice. The mice in the current study are also very small, so perhaps they could also be described as 'lean'. Yet ref. 26 reports that female mice had comparable food uptake, respiratory exchange ratio and physical activity, and slightly increased energy expenditure.

      Ref. 38 (as mentioned above) reports deletion of Ptch1 using S100a4-Cre lines and these mice "displayed a disrupted estrous cycle and dramatically reduced fertility over 6.5 weeks". However, this has nothing to do with the approaches where Fsp1/S100a4+ cells are depleted with DTA. Likewise, reference 28 analyzed the phenotype of S100a4-Cre;Ptch1fl/fl mice. Obviously, deleting Ptch1 using S100a4-Cre mice is quite a different approach than "targeting S100A4+ cells through knockout experiments". Ptch1 deletion leads to a combination of gain-of-function (of Hedgehog activation) and loss-of-function (loss of Hh-independent functions of Ptch1) and hence comparison with these phenotypes is rather challenging. I suggest the authors focus their phenotype comparisons on ref. 26, where S100a4/Fsp1+ cells were ablated with DTA, i.e. the same approach as in the current study.

      5. I find the Discussion is somewhat off the topic by starting with WHO recommendations on breastfeeding and linking this to the observed mouse phenotype. Overall, the discussion is rather long and from time to time more like a literature review. I would recommend keeping the Discussion more succinct and focused.

      Minor comments:

      Figure 5 would be more informative if it included more higher magnification images that would reveal the staining at the cellular level.

      Referee cross-commenting

      I agree with the comments of other reviewers. However, to me it seems that the analysis of S100a4 knockout mice would not be feasible within a reasonable timeframe and would represent a study of its own. My understanding was that the authors were not interested in S100a4 itself. Rather, S100a4-Cre was used as a tool to understand the importance of a certain (fibroblast) cell population for mammary gland morphogenesis.

      Significance

      General assessment:

      This study reveals the importance of the S100a4+ cell lineage for nipple formation while showing the same cells are dispensable for mammary gland morphogenesis. The main limitation is that it remains unclear whether the observed nipple phenotype is derived from an early embryonic/prepubertal defect in establishing the nipple stroma, is caused by a constitutive response to ongoing cell death, or a response to continuous DTA expression (or a combination of some of these). Hence its relevance as a model of human inverted nipple condition remains rather speculative.

      Advance:

      This study provides novel information on nipple morphogenesis, with potential (though with reservation) relevance to the congenital human inverted nipple condition affecting 3-5% of women.

      Audience:

      This work should appeal to mammary gland biologists interested in mammary gland development and nipple formation; those with an interest in fibroblast biology, given that S100a4 was once thought to be a broad marker of fibroblasts; as well as those with an interest in the inverted nipple condition.

      My expertise:

      Mammary gland morphogenesis, developmental biology, cell signaling

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their careful evaluation of our manuscript and their constructive comments. Overall, the reviewers recognize the technical value and potential impact of the pBID2 platform as a unified framework for generating transgenic tools across multiple binary expression systems in Drosophila.

      In response to the reviewers' suggestions, we will strengthen the manuscript in two main directions. First, we will perform additional experiments to further support key claims regarding the QFG4 system, including (i) assessing temporal dynamics of transgene expression across multiple time points, (ii) extending validation to additional tissues, and (iii) generating new driver lines. In addition, we will (iv) confirm co-expression of pBID2-VGMN-GAL4 and pBID2-VGMN-QF2 in the same neurons. These experiments are currently underway and will directly address concerns regarding synchronicity and reproducibility. Second, we will revise the manuscript to improve clarity, accuracy, and context within the existing literature. This includes justifying claims where appropriate, refining terminology, expanding discussion of prior work, and improving data presentation.

      Reviewer # 1 Major Comments

      1 - Synchronous temporal control: multiple time points in Fig. 7

      Synchronous temporal control has not been rigorously demonstrated in Fig. 7. Only a single time point (7 days after temperature shift) was examined. The synchronicity of expression between the two systems remains unclear, and a temporal delay between them is possible. I suggest examining multiple time points to assess true synchronicity.

      We thank Reviewer 1 for the positive assessment of the study and for the constructive suggestions. We agree with this concern. Our current data examine only a single time point after temperature shift, which is insufficient to support claims of synchronous regulation between the GAL4 and QFG4 systems. We plan to repeat the Fig. 7 experiment and collect imaging data at multiple time points spanning approximately one week after temperature shift. The resulting data will be incorporated into a revised Fig. 7, and the text will be updated accordingly.

      2 - Coordinated control of Gal4 and QFG4 activity in additional tissues

      Expression in distinct cells or tissues is demonstrated only in VGMN and muscle. If the claim is intended to be broadly applicable, additional examples would strengthen it. Including other tissues or cell types would provide stronger support.

      We agree that demonstrating QFG4 in additional tissues is important to substantiate the generality of the approach. We will test the coordinated response of an nSyb-QFG4 driver line (in neurons) together with a Repo-GAL4 line (in glia) in the adult brain. This experiment directly addresses the reviewer's request for a second tissue context and provides a biologically meaningful example of intersectional control across two distinct cell types within the same organ.

      3 - New QFG4 driver lines: demonstrating pipeline scalability

      Demonstrating additional QFG4 lines targeting other tissues would highlight the versatility and scalability of the approach and would represent valuable community resources. Given that this is a methodological paper focused on pipeline development, such experiments would directly test the ease and efficiency of the system.

      We plan to use the pBID2 pipeline to generate at least one additional QFG4 driver line targeting another cell type, which we will image to confirm expression patterns. These lines will serve as further proof-of-concept for the scalability of the platform and will be deposited as community resources.

      Minor Comments

      All minor comments from Reviewer 1 will be addressed through text and figure revisions:

      • The typographical error "Kusubira" (line 111) will be corrected to "Kusabira".
      • Figure resolution will be improved: Fig. 3B and 3C panels will be enlarged to better demonstrate nuclear localisation of mKO2, and Fig. 7B will include higher magnification images to illustrate differential localisation between VGMN and muscle.

      Reviewer # 2 Major Comments

      1 - Novelty of this study

      There is very little novelty in this study. The gateway compatible vectors to generate LexA, Gal4 and QF drivers were generated in Janelia years ago and are currently in use. The only observable difference is the use of insulators in this manuscript.

      We thank the reviewer for their critical evaluation of our work. We respectfully disagree with the reviewer’s assessment that the study lacks novelty. While individual components such as Gateway-compatible vectors and binary expression systems have been previously developed, the pBID2 platform provides a unified, modular framework that minimizes transcriptional leakage (through the use of Gypsy insulators and a DSCP promoter) and achieves strong expression (through a p10 UTR terminator, multiple repeats of activator sequences, and a Syn21 element upstream of drivers), integrating the GAL4/UAS, LexA/LexOP, and QF/QUAS systems within a single architecture. This standardization enables the streamlined generation of complex transgenic combinations that would otherwise be fragmented. In addition, the QFG4 system introduces a GAL80-sensitive QF-based activator, enabling coordinated temporal regulation across independent binary systems. We believe this represents a conceptual advance beyond existing implementations. To better reflect this contribution, we will revise the Introduction and Discussion to more clearly position our work relative to existing tools and explicitly acknowledge prior developments, including Janelia-based constructs.

      2 - P2A versus T2A

      T2A is the 2A peptide that has been used in Drosophila research. P2A was shown to work worse than T2A in the Diao and White 2012 paper. This decreases the novelty of this finding.

      We will revise the relevant sections of the Results and Discussion to more accurately reflect the existing literature on 2A peptide performance in Drosophila, including the findings of Diao and White 2012 and their demonstration of T2A efficacy. We will clarify the rationale for our use of P2A in the pBID2 system and discuss this choice in proper context of both the Daniels et al. 2014 and Diao and White 2012 publications, as suggested also by Reviewer 3.

      3 - QFG4 co-regulation and leakiness

      The proof of principle experiment does not show how co-regulation of QF driver and Gal4 driver by Gal80 can be beneficial, and shows that the regulation of QFG4 is not as tight when used in conjunction with a Gal4 driver. The reason for the leakier QFG4 regulation is not clear and not explored.

      The reviewer raises important points regarding the functional advantages and potential limitations of QFG4, including its regulatory tightness and biological utility. To address this, we will (i) expand the Discussion to better articulate the contexts in which coordinated regulation of independent systems is advantageous, (ii) clarify that QFG4 is intended as a flexible tool whose performance may vary depending on experimental context and discuss the observed differences in repression efficiency between QFG4 and GAL4, (iii) moderate our claims where appropriate to reflect the current level of validation. In addition, the new experiments outlined above for Reviewer 1 (multiple time points and additional tissues) will provide further insight into the performance and applicability of the system.

      Minor Comments

      All minor comments from Reviewer 2 will be addressed through text revision:

      • The use of "permissive" and "restrictive" temperatures will be corrected throughout to align with conventional usage in the field (restrictive = 29°C, permissive = 18°C).
      • The discussion of the LexA-GAD strategy will be incorporated into the Results section where relevant, rather than appearing only in the Discussion.
      • The Diao and White 2012 reference will be appropriately cited alongside Daniels et al. 2014 in the P2A/T2A discussion.

      Reviewer # 3 Major Comments

      Reviewer 3 raised no major experimental concerns and found the data sufficient to support the main claims of the paper. All comments from Reviewer 3 will be addressed through text and figure revisions. Nevertheless, we are still planning to perform an additional experiment in response to the remarks about the VGMN dual labelling in the same cells.

      New experiment

      1 - Co-expression of VGMN-GAL4 and VGMN-QF2 in the same neurons

      It is not clear if pBID2-GAL4 and pBID2-QF2 constructs express in exactly the same neurons, e.g., with VGMN. Figure 4 shows independent labelling, but it is not clear if these were validated as the exact same expression pattern. Dual labelling experiments in the same animal would clarify this.

      We thank the reviewer for recognising the relevance of the topic and for this valuable suggestion. We plan to perform dual-labelling experiments using the VGMN enhancer to directly compare the expression patterns driven by pBID2-VGMN-GAL4 and pBID2-VGMN-QF2 within the same cells. Fig. 4 currently shows independent labelling in separate animals, which does not allow direct comparison at single-cell resolution. The dual-labelling data will allow us to confirm whether the two constructs drive expression in the same neurons and will directly support the claim that pBID2 produces equivalent and interchangeable driver lines across binary systems.

      Minor Comments

      All minor comments from Reviewer 3 will be addressed through text revision:

      • Figure 1 will be revised to add arrows indicating that Activator and Responder constructs are inserted at position 0 of pBID2.
      • The name of the MCS variant (pBID2-MCS) will be made explicit in the relevant results section (lines 86-87).
      • The contribution of Diao and White 2012 to the validation of T2A in Drosophila will be more clearly described in the Results section, and Ref #68 will be cited at line 120.
      • The Figure 3 legend labelling errors ("B)" covering panels B and C; "C)" covering panel D) will be corrected.
      • Lines 258-260 will be revised: the discussion of GAL80 binding to the GAL4 activation domain will be clarified to avoid implying a role for the middle domain without supporting experimental data. As correctly noted by Reviewer 3, demonstrating a role for the middle domain would require a QFG4 construct using only the GAL4 activation domain.
      • The typographical error "otor" in the Fig. 7 legend will be corrected to "motor".
      • A comparison of pBID2-UAS constructs with Janelia UAS constructs (e.g., pJFRC7-20XUAS-IVS-mCD8::GFP) will be added to the Discussion, including any direct comparisons we have performed.
      • Dual-labelling experiments to confirm co-expression of pBID2-GAL4 and pBID2-QF2 in the same neurons will be performed (see the new experiment described above).
      • The Acknowledgments will be corrected: "Christopher G. Potter" will be corrected to "Christopher J. Potter".
      • The Materials and Methods section will be corrected: "CsChrismson" will be corrected to "CsChrimson".
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In the manuscript by Ruchti and McCabe, the authors introduce and validate many new constructs for use in generating new transgenic reagents in Drosophila. The authors introduce a number of improvements, including the pBID2 plasmid for generating flies that utilize the binary expression systems GAL4/UAS, LexA/LexOp, and QF2/QUAS. The authors generated two new reporters that introduce a nucleus-membrane marker to simultaneously label both the nucleus of the cell and its processes. The authors also identify a new enhancer derived from the VGlut genomic area that drives specific expression in motor neurons (labeled as VGMN). The authors also introduce a new activator, QFG4, which utilizes the QF DNA binding domain and the Gal4 activation domain. They demonstrate that this new reagent induces robust expression and has the additional benefit of being GAL80 sensitive. Overall, this work represents new and useful additions to the Drosophila toolkit that have the potential to become widely adopted.

      Major Comments:

      I do not have any major concerns regarding the work presented. The authors demonstrate the practical use of their new reagents via a number of experiments with new transgenic flies. As such, the conclusion that these new reagents are an improvement over existing reagents is justified. Additional experiments are not necessary to support the major claims on this paper. The data and methods are presented in a way that allows reproduction as well as utilization of the newly introduced reagents. The figures are well presented and adequately demonstrate the function of the new reagents in transgenic Drosophila.

      Minor Comments:

      In Figure 1, it was not entirely clear that the Activator constructs and the Responder Constructs have been inserted at position 0 of the pBID2 construct. Perhaps adding an arrow onto the lines that point to 0 could make this point clearer.

      Lines 86-87. The authors have a variation of pBID2 that uses an MCS. What is the name of these constructs? Please add this to this section so it's obvious. I assume it is pBID2-MCS, as reflected in Figure S2.

      Regarding T2A (lines ~106-146). T2A was first validated to be useful for transgene expression by Ref #68 (Diao and White 2012). This paper is why many current Drosophila constructs use T2A. This should be better reflected in the results section when reporting on the use of T2A and P2A experiments. As written, it was not clear that T2A was previously validated as a useful method for expression in Drosophila. As one example that could be updated, Ref #68 should also be cited on line 120 "we used ribosomal skipping sequences (63-65)".

      Figure 3 legend. "B)" should be "B) and C)". "C)" should be "D)".

      Lines 258-260. GAL80 binds directly to the activation domain of GAL4 at its C-terminus (~aa 761- 880). The middle domain likely doesn't play a role in GAL80 binding and might just function for structural stability. To make this statement in the discussion, the authors would need to make a QFG4 that uses just the GAL4 activation domain without its middle domain, similar to what was used to make QF2.

      Figure 7 legend. 3rd to last line. "otor" should be "motor".

      To the discussion section, please comment on how pBID2-UAS constructs might compare to Janelia UAS constructs, e.g., pJFRC7-20XUAS-IVS-mCD8::GFP. If the authors have made direct comparisons, it would be helpful to include their observations. The Janelia constructs have similar features, and it would be helpful to include the authors' thoughts on why to choose pBID2-20xUAS vs pJFRC7-20xUAS (for example).

      To the results or discussion section, please comment if the authors have examined if pBID2-GAL4 and pBID2-QF2 constructs express in exactly the same neurons (e.g., with VGMN). For example, by conducting dual labeling experiments in the same animal. Figure 4 shows independent labeling, but it is not clear if these were validated as the exact same expression pattern. As the authors correctly pointed out, the promoter can influence expression (hsp70 or DSCP), but so can sequences from the transcription factor (e.g., GAL4 or QF). It is possible the gypsy insulators have addressed these issues, but if the authors have data demonstrating that the exact same expression patterns are induced by the different constructs, it would be helpful to include.

      Acknowledgments. Bibliography lists papers by a Christopher J. Potter, not a Christopher G. Potter.

      Materials and Methods, page 13, pBID Gateway responder vector series. It should be "CsChrimson", not "CsChrismson".

      Significance

      This work represents a significant technical advance to the Drosophila toolkit. It introduces and validates many new reagents (both activators and reporters) that will prove useful to the Drosophila community. These new reagents will enable both simple and complex experiments to be more efficiently performed in Drosophila, especially those interested in investigating complex tissues such as the brain.

      This work will be of primary interest to those developing new reagents for studying Drosophila biology, as well as those interested in genetic tool development. The reagents developed here could also be applied to other genetic systems, such as other insect models.

      This reviewer's expertise is in developing genetic tools for use in Drosophila and other insects, and in applying these new genetic methods to the field of neuroscience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript Ruchti and McCabe describe incremental technical developments to create Gateway compatible vectors with increased transgene expression or driver expression, a dual reporter that expresses membrane targeted fluorescent protein together with nuclear targeted fluorescent protein and a QF version that can be repressed by Gal80 (QFG4).

      Major issues:

      • There is very little novelty in this study. The gateway compatible vectors to generate LexA, Gal4 and QF drivers were generated in Janelia years ago and are currently in use. The only observable difference is the use of insulators in this manuscript. Leakiness of driver lines inserted in well characterized landing sites is not a great concern.
      • Except for the initial studies that showed P2A can work in Drosophila cells, T2A is the 2A peptide that has been used in Drosophila research. P2A was shown to work worse than T2A in the Diao and White 2012 paper. This decreases the novelty of this finding.
      • The proof of principle experiment in the paper does not show how having co-regulation of QF driver and Gal4 driver by using Gal80 or Gal80ts can be beneficial and, if anything, shows that the regulation of QFG4 is not as tight when used in conjunction with a Gal4 driver. The reason for the leakier QFG4 regulation compared to Gal4 regulation is also not clear and not explored.

      Minor issue:

      • The authors use the terms permissive temperature and restrictive temperature in a manner that is against the conventional use. These terms conventionally refer to functionality of Gal80ts at the given temperature, not the activity of Gal4, which is negatively correlated. Hence, in the literature, restrictive temperature typically refers to 29°C and permissive temperature to 18°C. This is confusing.
      • Although the references are cited, the information about some of the papers is not properly presented. For example, the LexA-GAD strategy is only brought up in the discussion, but in the results the authors make the statement that no Gal80 regulation strategies exist for LexA. When discussing the use of P2A versus T2A, the authors mainly refer to the Daniels et al. 2014 publication, whereas it was also done in Diao and White 2012. The authors do cite both of these papers but discuss the findings in an incomplete manner.

      Significance

      This manuscript has very limited novelty and is better suited for a more specialized journal such as G3.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Ruchti et al. developed new vectors compatible with the GAL4/UAS, LexA/LexOP, and QF/QUAS systems to enable high-throughput construct generation (pBID2). They validated the platform by generating responder constructs (UAS, LexOP, and QUAS lines) and VGMN (glutamatergic motor neuron) driver lines. In addition, they engineered a new hybrid binary system combining QF and GAL4, termed QFG4, and tested its capacity to modulate expression levels. This design permits regulation of QUAS by GAL80, as demonstrated by experiments examining expression of optogenetic proteins and co-expression of transgenes under VGMN and MNC control with GAL80ts. Overall, the experiments are well designed, carefully performed, and quantitatively analyzed. The vectors and fly lines generated will be valuable resources if deposited in Addgene and the Bloomington Drosophila Stock Center. However, several claims appear speculative or overstated, as outlined below.

      Major Comments

      • Are the key conclusions convincing?

      Most experiments are of high quality and generally convincing. However, two conclusions would benefit from further clarification: (i) simultaneous regulation by GAL4 and QFG4, and (ii) coordinated expression of two transgenes in distinct tissues.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The summary states that QFG4 "enables simultaneous coordinate regulation of UAS and QUAS transgenes by Gal80, allowing synchronous temporal control of independent transgene expression in distinct cells or tissues." I have several concerns regarding this claim:

      1. Synchronous temporal control has not been rigorously demonstrated in Fig. 7. Only a single time point (7 days after temperature shift) was examined. The synchronicity of expression between the two systems remains unclear, and a temporal delay between them is possible. I suggest examining multiple time points to assess true synchronicity.
      2. Expression in distinct cells or tissues is demonstrated only in VGMN and muscle. The title also refers to "discrete Drosophila tissues." If the claim is intended to be broadly applicable, additional examples would strengthen it. Including other tissues or cell types would provide stronger support.
      3. Related to point (2), the pBID2 system is presented as a pipeline for generating new lines. Demonstrating additional QFG4 lines targeting other tissues would highlight the versatility and scalability of the approach. Such lines would also represent valuable community resources.

      • Would additional experiments be essential to support the claims of the paper?

      Additional experiments would be necessary unless the authors adopt more conservative language in their claims.

      • Are the suggested experiments realistic in terms of time and resources?

      The proposed experiments appear realistic.

      For point (1), the authors could repeat the existing experiment and collect images at multiple time points over approximately one week, as the fly lines are already available. For points (2) and (3), generating two to three additional QFG4 lines and imaging their expression in distinct tissues would provide meaningful validation. Given that this is a methodological paper focused on pipeline development, such experiments would directly test the ease and efficiency of the system. Generating new lines may require approximately 2-3 months, followed by ~1 month of imaging and analysis, which is a reasonable investment.

      • Are the data and methods presented in a reproducible manner?

      Yes, the presentation is generally clear and detailed enough to ensure reproducibility.

      • Are the experiments adequately replicated and statistically analyzed?

      Overall, replication and statistical analysis appear appropriate. However, inclusion of additional time points for Fig. 7 would strengthen the conclusions.

      Minor Comments

      • Specific experimental issues that are easily addressable

      No major additional concerns beyond those noted above.

      • Are prior studies referenced appropriately?

      A minor typographical issue: Line 111 lists "Kusubira," which should be corrected to "Kusabira."

      • Are the text and figures clear and accurate?

      As noted above, certain phrases (e.g., "simultaneous coordinate regulation," "distinct cells or tissues," "discrete tissues") appear overstated relative to the current data. Clarifying or moderating this language would improve accuracy. Some figures require higher resolution presentation. In Fig. 3B and 3C, the images are too small to clearly demonstrate nuclear localization of mKO2; larger panels would help. In Fig. 7B, higher magnification images would better illustrate differential localization between VGMN and muscle.

      • Do you have suggestions to improve presentation?

      Overall, the data presentation is strong and well organized. Clarifying the scope of the claims and providing higher resolution images where noted would further improve the manuscript.

      Significance

      • Describe the nature and significance of the advance (e.g., conceptual, technical, clinical) for the field.

      Drosophila research relies heavily on binary expression systems for spatial and temporal control of gene function. The unified vector platform developed here, incorporating Gateway compatibility across GAL4/UAS, LexA/LexOP, and QF/QUAS systems, represents a meaningful technical advance. By streamlining construct generation across multiple systems within a single framework (pBID2), the authors lower the technical barrier for complex genetic manipulations. The previously developed pBID system for UAS/GAL4 has been widely adopted and highly cited, underscoring community demand for standardized and scalable tools. The current expansion to additional binary systems is therefore likely to have broad impact. Inclusion of a few additional validation experiments, as noted above, would further strengthen confidence in the robustness and versatility of the platform.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      This work builds directly upon foundational binary expression systems widely used in the Drosophila field, including GAL4/UAS (Brand and Perrimon, 1993), LexA/LexOP (Lai and Lee, 2006), and QF/QUAS (Potter et al., 2010). By providing a unified and modular cloning strategy compatible with these systems, the authors enhance the practicality and interoperability of established genetic tools rather than introducing an entirely new paradigm. This technical consolidation is valuable for laboratories that routinely combine multiple binary systems for intersectional or parallel manipulations.

      • State what audience might be interested in and influenced by the reported findings.

      The primary audience will be researchers working with Drosophila genetics, particularly those employing complex intersectional strategies, circuit mapping, developmental biology, and functional manipulation of defined cell populations. Laboratories developing new driver or responder lines will especially benefit from the streamlined cloning pipeline.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Field of expertise: Drosophila genetics, neurobiology, binary expression systems, and circuit analysis. I have sufficient expertise to evaluate the genetic and technical aspects of the manuscript.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The study 'Excess Met1-ubiquitination leads to solid aggregate formation' by Kaypee et al. suggests a previously unrecognised role for the E3 ligase HOIL-1 in clearing protein aggregates via autophagy (e.g. aggrephagy). In their model, toxic protein aggregates in cells are modified with ubiquitin chains, including M1-linked Ub-chains catalysed by LUBAC (of which HOIL-1 is a component). The HOIL-1 ubiquitin signal is posited to induce trafficking of aggregates to lysosomes for subsequent clearance. However, when HOIL-1 is inactive (catalytic C460A mutation), the pathway is interrupted. As a result, protein aggregates fail to clear, increase in size, and shift their biophysical properties from liquid-like to more rigid, insoluble aggregates. The authors explain their observations by an increasing amount of M1-linked chains on protein aggregates, which occur as a result of 'unrestrained HOIP activity' due to HOIL-1 inactivity (based on previous work). Increasing amounts of M1-chains are posited to promote aggregate formation, aggregate growth, and prevent clearance.

      The major claims made in this manuscript are the following:

      1. Following the induction of protein aggregate formation (e.g. alpha-synuclein, tau, beta-amyloid, p62 bodies), cells that express catalytically inactive HOIL-1 fail to clear protein aggregates and end up with more, larger and more rigid protein aggregates compared to cells that express WT HOIL-1.
      2. The observations made in 1. are due to disruptions in late-stage autophagic flux. While aggregates in cells that express WT HOIL-1 co-localise with autophagy and lysosomal markers, aggregates in cells that express mutant HOIL-1 show autophagic, but not lysosomal markers.
      3. An increase in M1-chains (either due to HOIL-1 inactivity or due to OTULIN knockout) is believed to be the cause for claims 1. and 2.

      Main methodologies used:

      The authors use two cellular systems. The first one is SH-SY5Y cells in which either WT or mutant HOIL-1 is transiently overexpressed (via the pcDNA3.1 plasmid), and physiologically important aggregates (Tau, Abeta, asyn) are induced. The second cellular system is MEF cells in which either WT or mutant HOIL-1 is endogenously expressed; in these cells aggregates are formed crudely through disruption of ribosomal translation. It is questionable whether the two systems can be compared. Aggregate formation is mainly monitored and quantified via fluorescence microscopy in both fixed and live cells, or via sucrose gradient fractionation to separate soluble and insoluble (=aggregate) fractions. The rigidity of protein aggregates is analysed in cells via FRAP, size and circularity measurements and 1,6-HD treatment, or in vitro after aggregate formation assays via size and circularity measurements. The observations are on the whole interesting, though the authors fail to discuss their data in light of previously published work. For example, HOIL-1 KO and KI animals were shown to feature polyglucosan bodies in brain, which is not mentioned. Also, McCrory et al on HOIL-1 chain types is not cited but seems relevant (Figure S4). Yet, the manuscript reports a number of interesting findings, more or less coherently, most useful for scientists embedded in current ubiquitin, autophagy, and LLPS fields. These reviewers believe that this manuscript will make a lot of sense in due course, and be well cited for a first description of the role of HOIL-1 in cellular quality control processes. A number of improvements seem required to consolidate the findings, and improve readability and impact.

      Major:

      1. Figure 1A-C: The authors transiently overexpress either WT HOIL-1L or catalytically inactive (C460A) HOIL-1 in SH-SY5Y cells, then induce and compare the formation of protein aggregates (alpha-synuclein, tau, amyloid-beta) in those cells over 72 h. More cells with aggregates were found in cells that overexpressed mutant HOIL-1L. While these findings are interesting, the cellular system used is artificial due to the transient overexpression of HOIL-1 (in presence of endogenous HOIL-1). Crucial controls are missing:

      a. Adding a condition in which no protein is overexpressed, for example via an empty pcDNA3.1 or GFP only vector. This would help rule out secondary effects due to the transient overexpression. It would also allow monitoring of whether the same amount of aggregates forms in the empty ctr compared to when WT HOIL-1 is overexpressed.

      b. Figure legends and raw data points (?) in graphs do not match. The graphs show dubious statistics from 2-3 grey dots, while the figure legend refers to n=100 cells etc. This needs to be fixed.

      c. Showing Western Blots of HOIL-1, to better understand the levels of endogenous HOIL-1 vs overexpressed HOIL-1 in these cells, and to compare overexpression levels between WT and mutant HOIL-1.

      d. The study would also improve by western blotting and IF staining for other LUBAC components such as HOIP and SHARPIN. Do alpha-synuclein aggregates in both WT and mutant conditions co-localise with the other LUBAC components, and are there any differences between WT and mutants? This would further help strengthen the claims made in Figure S1A: '...suggesting that LUBAC is recruited to or retained within α-Synuclein aggregates.' And in the discussion: 'we found that LUBAC components were sequestered in aggregates, as evidenced by microscopy and gradient fractionation of soluble and insoluble proteins, confirming the direct involvement of LUBAC in aggregate processing.'

      2. Figure 2A-F: The authors change to a genetic-derived system (comparing endogenously expressed HOIL-1 WT with mutant HOIL-1 based on MEF cells from their mouse models). However, they use puromycin to produce aggregates from random protein homeostasis defects, which does lead to aggregates, but is not as nice as the induced generation of neurodegeneration-relevant aggregates. It was observed that after 2 h of puromycin treatment, cells accumulate p62-positive protein aggregates, and during recovery (2 h washout), the aggregates in the HOIL1 mutant cells outgrow the aggregates in the WT HOIL1.

      a. However, the authors claim that: 'While Hoil-1+/+ MEFs efficiently cleared puromycin-induced p62 bodies,...', which is not supported by the data shown here. When comparing WT in panel C with WT in panel E, it becomes evident that the average number of p62 puncta before and after recovery is the same (around 5 puncta/cell in both pre and post washout conditions). A similar observation can be made for the mutant (around 12 puncta/cell in both pre and post washout conditions). Can the authors please amend their claims, or comment and perform a direct statistical comparison between the pre and post recovery conditions to test for clearance of p62 puncta in the WT after puromycin washout.

      b. The authors state that: 'These findings indicate that although HOIL-1 catalytic activity is dispensable for the initial formation of puromycin-induced aggregates, it is essential for their subsequent clearance.'

      As long as clearance of p62 bodies in the WT is not clearly shown, the second part of this sentence should be amended/removed.

      c. The experiments shown would improve by adding a t = 0 condition. How many p62 granules are present before puromycin treatment? Is there already a basal difference between WT and mutant HOIL-1L cells?

      3. Figure S1A: The authors claim that other LUBAC components co-localise to protein aggregates, based on sucrose gradient fractionation and the presence of the respective proteins in the insoluble fractions. Could the authors perform IF and stain for other LUBAC components (SHARPIN and/or HOIP) in their MEF cell system to directly validate this claim?

      4. Figure 3G-H: The authors created a GFP-mCherry-p62 reporter system in both their WT and mutant HOIL-1 MEF cells and performed live cell imaging following puromycin treatment, which allows monitoring of both aggregate formation and loss of GFP signal due to the acidic lysosomal localisation. Excitingly, the ratio of GFP/mCherry in the later timepoints is reduced in the WT compared to mutant HOIL-1, indicating that HOIL-1 activity is required to traffic p62 bodies to lysosomes.

      a. In panel G, a surprisingly large amount of p62 granules are present at t0, which (according to the relevant method section) is the time of puromycin treatment. This observation can be made for both WT and mutant cells. After 80 min of puromycin treatment in the WT, the majority of these puncta are cleared. Can the authors please comment on this high amount of p62 granules at t0 (before the effects of puromycin)? And also on the observation that after 80 min there are now fewer granules than before puromycin? In case that t0 indicates the time of puromycin washout rather than puromycin addition, could this please be clarified in the methods or figure legend?

      b. Panel H would improve by adding the quantifications for t=0 (or ideally for all the time points).

      c. Fig S3C-D: Same comments as before but for GFP-mCherry-LC3.

      5. Fig. S4B and Fig 4A-B: The authors state that circular aggregates are more soluble and have more LLP characteristics, whereas non-circular aggregates are less soluble and have more aggregate-like characteristics. However, the aggregates shown in Fig. S4B are un-circular but easily dissolve in response to 1,6-HD treatment, which seems contradictory. On the other hand, the aggregates shown in Fig 4B in HOIL-1 mutant cells appear much rounder than the ones in S4B, but do not dissolve in response to 1,6-HD treatment. Can the authors please comment on these discrepancies?

      6. Fig. 4E-G. Here the authors suddenly switch to an in vitro aggregate-formation assay using mCherry-p62. In-vitro M1-chain reactions with either WT LUBAC or LUBAC with mutant HOIL-1L, together with the respective M1-chain reaction product, are added. This is not clear from the figure, and a schematic, as well as a gel (Coomassie) should be included to show component purity and indicate the biochemical in vitro nature of the experiment. It is good to have this breadth of methods, but does not help in the presentation if all figures look alike.

      a. The key difference to the cellular situation is p62 aggregates are not directly ubiquitinated here, and instead ubiquitin chains are (non-covalently) added to samples. Can the authors please make this important difference clearer in their text? Why not directly ubiquitinate mCherry-p62 via LUBAC (WT vs mutant HOIL-1) and then perform an aggregation assay on the reaction product?

      b. Can the authors please clarify whether the reaction was inactivated prior to addition to the aggregate-formation assay? If not, the enzymes might still be active at the point of aggregate formation, and the observed effects might be influenced by enzymatic activities and not only the presence of different M1-chain architectures.

      7. Fig. 5B-C: The M1-specific DUB OTULIN is knocked down (again, cells) to increase the overall amount of M1-linked Ub-chains present in cells. P62 aggregate formation is induced and the authors claim that the increase in M1-chains influences aggregate size. This claim would be strengthened if it was directly shown that M1-chains form on p62 aggregates in this assay, for example via IF using an M1-antibody (and potentially a total ubiquitin antibody). This would also enable to directly compare the abundance of M1-chains between conditions (ctr vs Otulin, WT vs HOIL-1L mutant).

      Minor:

      1. Figure 1 D: The authors state that 'Notably, HOIL-1 C460A was detected within these structures, as demonstrated by its colocalization with tau aggregates' and show a co-localisation comparison between WT and mutant HOIL-1L. This sentence implies that WT HOIL-1 was not detected in aggregates, however the chosen image of the WT cells does not show any obvious tau aggregate, even though aggregates were induced in this condition according to 1A-C. The better comparison would be to pick an image that includes a tau aggregate. Moreover, this experiment would benefit from quantification, calculating the percentage of total aggregates that co-localise with WT HOIL-1L vs with mutant HOIL-1L.
      2. Figure S1A: The authors state that 'Cells expressing HOIL-1 C460A displayed a pronounced accumulation of high-molecular-weight α-Synuclein species in the insoluble pellet fraction.' While the difference seen by Western Blot is apparent and seems to match Fig 1 A-C, it is not very strong. Moreover, the relevant comparison (WT vs mutant) is made between two different blots/membranes, and it is difficult to assess equal input solely based on the TCL lane. This experiment could be improved by normalising samples (for example via BCA) and by loading and imaging the two conditions (or at least the TCL and pellet samples) on the same membrane. The authors also state that: 'LUBAC subunits, HOIP, HOIL-1, and SHARPIN, were also enriched in this fraction, suggesting that LUBAC is recruited to or retained within α-Synuclein aggregates.' Both Sharpin and HOIL-1 seem to be present in similar levels in the pellet fractions of WT and mutant HOIL-1. Overall HOIP levels seem to be significantly increased in the mutant over the WT (see TCL lane), and to a similar level in the pellet fraction. It would be great if the authors could include these observations in their interpretation.
      3. Figure 2G: Similar to before, this experiment would improve if the authors could find a way to normalise samples between conditions prior to sucrose gradient fractionation or have the most relevant samples on the same blot. It is challenging to properly interpret the results while the bands in the total cell lysate (TCL) lane do not have similar intensities between samples. A blot in which only the TCL and the pellet samples of all conditions were loaded onto the same gel would solve this, allowing for a better comparison between conditions. Based on what is shown in panel G, the authors should amend their claim: 'Consistent with microscopic observations, denser fractions from Hoil-1C458A/C458A MEFs contained increased signals of p62 specifically during the recovery phase (Figure 2G).' It is not apparent that more p62 is present in the insoluble fractions of mutant HOIL-1 cells after puromycin treatment. The band intensities look very similar (This is different for the recovery condition, which shows a strong difference, as stated).
      4. Figure S2: Similar to before, the authors induce protein aggregate formation and compare cells endogenously expressing WT vs mutant HOIL-1L. The size of aggregates increases in mutant cells under proteotoxic stress. What happens to the number of aggregates per cell in these conditions? Does it also increase, or is it just the size?
      5. Fig. S4A: Here the authors analyse the circularity of p62 aggregates in HOIL-1L mutant cells after recovery from puromycin treatment. This experiment would improve if the same analysis could be performed for the WT cells and for the pre-recovery condition (under the condition that large enough granules are present), allowing to make a comparison between WT and mutant, as well as between pre- and post-recovery.
      6. Fig. 4H-J: The authors use a p62 mutant that is known for its enhanced ubiquitin affinity, repeat Fig. 4E-G and state: 'These aberrant condensates were similar to those observed in a reaction using wild-type p62.' Can the authors please comment on what they conclude from this similarity and why this experiment was performed? A quantitative comparison (condensate size and circularity) between WT p62 and mutant p62 may further be useful here.
      7. Fig. 5A: Here the authors pulldown M1-linked Ub-chains in WT vs mutant HOIL-1 cells, with or without puromycin-induced aggregate formation. More M1-chains are observed in mutant HOIL-1 cells under puromycin treatment, but the difference is very subtle. The conclusions drawn from this experiment could be strengthened by including alternative methods, for example (if available) Ub-AQUA to measure the abundance of M1-chains, or using the M1-antibody for IF analysis.
      8. Fig. S5B: The described differences between puromycin treatment and untreated conditions are extremely subtle on the anti-Ub blot, and absent in the anti-Met1 blot. I recommend that the authors remove this sentence, based on the shown data: 'we observed a modest increase in the signal of Met1-linked ubiquitin chains after puromycin treatment'.
      9. Fig. 5F-G: The authors went back to their more-artificial system from Fig 1, in which HOIL-1L was transiently overexpressed (WT or inactive mutant) in SH-SY5Y cells, alongside alpha-synuclein aggregate formation.

      a. The claims made from this experiment would be stronger in the other cell system. Could OTULIN be transiently overexpressed in the MEF cells, to monitor the effect on aggregate formation and clearance? Again, staining aggregates with the M1-antibody would improve this experiment.

      b. The authors claim that HOIL-1 activity fine-tunes the function of HOIP within LUBAC (from discussion: 'This regulatory mechanism ensures that in the presence of functional HOIL-1, the overall quantity and potentially the architecture of Met1-linked ubiquitin chains are tightly controlled'). What is the quantity and architecture of M1-chains catalysed by LUBAC when HOIL-1 is very highly abundant, as it would be the case in this cellular overexpression system?

      10. Sentence structure: 'A comprehensive understanding of how Met1-linked ubiquitination, particularly through intricate regulation by Linear Ubiquitin Chain Assembly Complex (LUBAC) components, such as HOIL-1, influences aggregate dynamics and clearance; therefore, it is crucial to develop targeted therapeutic strategies against neurodegenerative proteinopathies'.

      Significance

      The conclusions drawn from this study are very intriguing and give LUBAC (and HOIL-1) a so far unrecognised role in the clearance of protein aggregates, which are a hallmark of several neurodegenerative diseases for which there are currently no cures. Some of the findings described in this manuscript have the potential to be of very high impact and interest to a broader community, in particular researchers interested in protein homeostasis, autophagy, ubiquitin biology and neurodegeneration. In fact, those findings might even expand to autophagic pathways that target other cargo than protein aggregates. Both the novelty aspect and the potential for translational/therapeutic applications comprise the major strength of this manuscript. However, multiple of the presented experiments are currently lacking crucial controls, show weak effect sizes or were performed in artificial settings that likely do not represent relevant in vivo conditions, overall weakening or not fully supporting the claims made. Consequently, further experiments, data re-analyses and validations were recommended to fully support all the claims made here.

      This review was written from the perspective of a researcher in the ubiquitin field.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, the authors show that the branching activity of the E3 ligase HOIL-1, a component of the LUBAC complex, contributes to the autophagic clearance of p62 bodies and protein aggregates. This activity is attributed to enhanced linear, unbranched ubiquitin chain formation by the second E3 ligase of LUBAC, called HOIP. The model systems employed are cell lines including MEFs expressing a catalytically dead version of HOIL-1. In addition, the authors perform in vitro reconstitution experiments with purified ubiquitin chains, the LUBAC complex and p62. The main message is that solid p62 bodies are poor substrates for autophagy and that linear, non-branched ubiquitin chains promote solidification. The mechanism remains unclear and some of the effect sizes are rather modest.

      Major comments:

      The key observations mentioned above are convincingly shown. Since the authors don't claim any detailed molecular mechanisms, the number of conclusions in this study is limited.

      Overall, the authors are quite careful regarding their conclusions, and therefore the ones that are made in this manuscript are generally well supported. The data regarding the clearance of the p62 bodies presented in Figure 3 should be backed up with additional data. The authors could add a macroautophagy inhibitor such as VPS34 IN1 and/or perform the clearance experiments in an ATG KO/KD cell line to corroborate the contribution of macroautophagy to the clearance. In addition, a proteasome inhibitor should be used for comparison.

      The expertise and resources for the experiments mentioned above are expected to be well within the authors' capacity and should be doable within a few weeks.

      Some of the effect sizes (e.g. Fig. 5 and S5) are very small and it is possible that some of them are below statistical significance if the number of replicates is increased.

      Minor comments:

      Figure 1D should be quantified, for example using the Pearson correlation coefficient (PCC). Figure S1 should be quantified. Figure S3: It should be explained how the regions shown in the profiles were selected.

      It is suggested to include a scheme of the LUBAC complex and its E3 ligase activities in Figure 1A. This will make it more accessible for readers, who are not so familiar with this complex, in particular as HOIP and HOIL can be easily confused. The authors may also want to clarify this in the abstract.

      Significance

      As mentioned in the summary, the authors report the observation that an excess of linear ubiquitin chains produced by HOIP in the absence of HOIL-1 activity results in the solidification of p62 bodies and reduced clearance by autophagy. This observation is novel and will be interesting for the proteostasis field.

      This reviewer is expert in autophagy and protein degradation, but less so in the LUBAC complex.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Background

      GWAS analyses carried out several years ago identified over 40 genetic loci associated with increased risk of developing Alzheimer's and other types of dementia. They included loci encoding the E3 ubiquitin ligase HOIL-1 and the protein SHARPIN, two of the three components of the Linear Ubiquitin Chain Assembly Complex (LUBAC) (Bellenguez et al, 2022, cited in the paper). The third component of LUBAC, HOIP, is the only E3 ubiquitin ligase known to catalyse the formation of Met1-linked ubiquitin (also known as linear ubiquitin). HOIL-1 is one of the few E3 ubiquitin ligases that attaches ubiquitin to serine and threonine residues in proteins forming ester bonds (Kelsall et al. 2019) and has been reported to restrict the HOIP-catalysed formation of Met1-linked ubiquitin (Kelsall et al. 2019; Fuseya et al. 2020; Rodriguez Carvajal et al. 2021).

      Summary of Paper's findings:

      In this study the authors report that HOIL-1 catalytic activity prevents neurodegenerative protein aggregation (α-synuclein, tau, Aβ) in the human SH-SY5Y neuroblastoma cell line or in mouse embryonic fibroblasts (MEFs) expressing a catalytically inactive mutant of HOIL-1. They argue that this is achieved by maintaining the dynamic, liquid-like properties of protein condensates through regulation of Met1-linked ubiquitin chain levels, thereby facilitating efficient clearance via the aggrephagy pathway. They report that loss of HOIL-1 activity leads to excess Met1-ubiquitylation that drives the transition to rigid, solid-like aggregates resistant to autophagic degradation. In support of this conclusion, they also report that the siRNA knock-down of Otulin, a deubiquitylase that hydrolyses Met1-linked ubiquitin specifically, produces the same effect. The reframing of HOIL-1 as a key factor for fine-tuning ubiquitylation to maintain cellular protein homeostasis is an interesting development and the paper is generally well-written, focused and concise. Further work is required, however, to fully convince these reviewers that the effects observed are entirely attributable to excess Met1-linked ubiquitylation, as claimed.

      Major comments:

      1. The causal link between elevated Met1-linked chains and solid-like aggregates in cells is the central claim of the paper. Throughout the study the authors use inactive HOIL-1 to enhance aggregate formation, which they attribute to increased Met1-linked ubiquitylation, something observed by themselves and others previously (Kelsall et al. 2019; Fuseya et al. 2020; Rodriguez Carvajal et al. 2021). However, the immunoblot for Met1-linked ubiquitin (Fig 5A) is not very convincing. In addition, the authors have not excluded the possibility that the loss of HOIL-1 enzyme activity has other effects on ubiquitylation, such as a change in the architecture of the ubiquitin chains caused by the absence of HOIL-1 catalysed formation of oxyester linkages. Many/most ubiquitin chains formed in cells contain more than one ubiquitin linkage type. It is therefore important for the authors to perform immunoblots for other ubiquitin linkage types, such as Lys63-linked ubiquitin, and to include these results in Fig 5.
      2. The reviewers also think that the authors' claim that the transition in condensate properties is linked to elevated Met1-linked ubiquitin chains would be strengthened by performing the biophysical assays (FRAP and 1,6-hexanediol resistance) after Otulin knockdown/knockout (and ideally also with Otulin rescue). This would provide direct biophysical evidence linking Met1-linked chain elevation to condensate liquidity and 1,6-HD sensitivity.
      3. The authors have not shown any evidence that Met1-linked chains are more enriched at the sites of protein aggregation. Would the authors be able to demonstrate direct spatial colocalization of Met1-Ub with the analysed aggregates?
      4. Do the authors know if the effects that they are seeing are general effects on autophagy? For example, is starvation-induced autophagy similarly impaired in the cells studied? A simple flux-style experiment looking at LC3-II levels and p62 with starvation vs puromycin (-/+ bafilomycin) would be informative here.

      Minor comments:

      1. The authors show that loss of HOIL-1 catalytic activity causes p62 bodies to transition from dynamic liquid-like states to rigid solid-like states and claim this as a more general effect on protein aggregates. But the study does not directly demonstrate a liquid-to-solid transition for the disease-relevant α-synuclein, tau, or Aβ aggregates, limiting the generalisation of the claim beyond p62 bodies. Perhaps the authors should modify the text to better reflect this (or, even better, consider treating α-synuclein/tau/Aβ aggregates with 1,6-hexanediol to measure the response). [optional]
      2. Given that the blots presented in Fig S1A appear to come from different membranes, and high-molecular-weight species of α-synuclein seem to exist in the insoluble pellet fraction of both WT and C460A expressing cells, the reviewers would caution against concluding anything about differences, which can only be assessed if the samples are run side-by-side on the same gel.
      3. The Methods section says that two different total ubiquitin antibodies were purchased, but which one was used in Figure 5 and other figures is not stated. Please clarify.
      4. On page 10 ABIN1 is mentioned, but it is not stated that it is the protein product of the TNIP1 gene that is mentioned in the Introduction. This will confuse many readers.
      5. 1st paragraph of Discussion line 5 from bottom:- change "oof" to "of".

      Significance

      Prior research has established that the components of LUBAC are recruited to, and are components of, protein aggregates. A link between LUBAC and selective autophagy has also been established previously. The significance of this paper is that it identifies the catalytic function of HOIL-1 as a brake on the activity of LUBAC in proteostasis. The reviewer and co-reviewer are not experts in autophagy or aggregate formation in dementia but, if those reviewers who are find the data presented in these areas to be convincing, then this paper may be the first to suggest a molecular mechanism by which polymorphism/mutation of HOIL-1 leads to increased formation of the aggregates observed in Alzheimer's and other dementias. The results presented in the paper also suggested that initial autophagosome recruitment to aggregates is intact but subsequent late-stage autophagy is impaired. Hence, the study begins to identify the specific step that fails. However, as the authors themselves acknowledge, validation of these potentially exciting findings using in vivo models of neurodegeneration should be the aim of future studies. The paper combines the molecular dissection of ubiquitin and autophagy pathways to understand the causes of neurodegenerative disease. The paper will therefore be of interest to a broad audience, encompassing both the basic research and clinical research communities.

      Reviewers' field of expertise: Biochemists and cell biologists with an interest in ubiquitin and cell signalling.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility, and clarity (Required)):

      Summary: In this manuscript, the authors examine how peripherin-2 (PRPH2) contributes to the localization of CNGβ1 within rod outer segment structures. PRPH2 and its homolog ROM1 are structural components of rod discs and are required for disc morphogenesis. In the absence of PRPH2, rod outer segments do not form, and various outer segment materials accumulate and are released as cilia-derived ectosomes. PRPH2 is thought to be transported through an unconventional secretory pathway, whereas cGMP-gated channels follow a conventional trafficking route. Although these components reach the outer segment through distinct pathways, PRPH2 is necessary for the proper delivery of CNGB1, a subunit of the cGMP-gated channel, to its correct destination. It was previously reported that a small fraction of PRPH2 reaches the outer segments through the conventional pathway when it forms a complex with Rom1 in mouse photoreceptors. Using Rom1 KO mice, the authors show that this conventionally trafficked PRPH2 fraction is not required for CNGB1 transport to the outer segment. Using various chimeric constructs, the authors verified that tetraspanin core of PRPH2, delivered to the OS, is sufficient to promote OS localization of CNGB1. Ct and Nt cytoplasmic regions of PRPH2 are dispensable for the role. Overall, the majority of the experiments are well-executed with statistical rigor, written in a way that others can reproduce, and support the major conclusion indicated in the title, "PRPH2 is essential for OS localization of CNGB1".

      Major comments: I believe that the majority of the conclusions are well-supported in this manuscript. Below, I am listing the major points that may need additional experiments or clarifications: 1) CNGA1 subunit is transported to and enriched within ciliary exosomes or the outer segment in PRPH2 deficient mice (Figure 1). The reduced levels of CNGA1 and CNGB1 in rds-/- mice suggest limited stability of these proteins. Their diminished abundance is also influenced by decreased mRNA expression of the corresponding genes. These findings imply that CNGB1 may not be essential for outer segment delivery of cGMP-gated channels if CNGA1 alone contains adequate targeting information. Related to these points, it is unclear whether CNGB1 exhibits a trafficking defect or encounters other problems before leaving the endoplasmic reticulum. Such problems may involve deficiencies in folding, holo-channel assembly, or related quality control processes.

      RESPONSE: We agree with this reviewer and have added additional data and interpretation to address this point. Our new data show that in fact a low level of CNGB1 can reach ectosomes in rds-/- rods, which makes sense since we and others had observed CNGA1 was present and we know that channel assembly occurs in the ER. This suggests that the CNG channel can properly fold and assemble. Furthermore, overexpressing CNGB1 did not restore ciliary localization in rds-/-, leading to our interpretation that in the absence of an outer segment membrane compartment, there is no place to deliver the CNG channel and it is subsequently degraded. Apart from peripherin's binding partner, ROM1, this is unique to the CNG channel. CNG channel subunits are still significantly lower at P21 than other outer segment membrane proteins, such as ABCA4 (shown here), rhodopsin, and PCDH21 (shown elsewhere).

      2) CNGB1 overexpression in rds-/- mice does not result in outer segment localization of CNGB1 channels (Figure 2A). These findings do not clarify whether CNGB1 successfully transits through the Golgi apparatus or associates properly with CNGA1 subunits. Elevating expression levels alone would not compensate for problems in folding or assembly.

      RESPONSE: We recognize that our previous submission lacked clarity on this point. Therefore, we have restructured the order of figures and provided additional controls to improve our manuscript. First, the fact that CNG channel is present at P21 and even increases over time suggests that in rds-/- rods channel processing (folding and assembly) is unaffected. Second, we recognize that channel stoichiometry is important for proper channel assembly, so we added a new supplementary figure that shows endogenous CNGA1 expression increases in rds-/- rods that are overexpressing myc-CNGB1 and FLAG-peripherin-2. This adds credence to our CNGB1 overexpression experiments and shows that CNGB1 being trapped is not due to inefficient channel assembly.

      3) Claims related to Figure 6 (P45 rds-/-) need further evidence. It remains uncertain whether CNGA1 and CNGB1 are delivered to lamellar ciliary membranes or to a distinct plasma membrane compartment comparable to that observed in wild type rod outer segments, or whether they accumulate in ciliary ectosomes. Those lamellar structures could be a part of cone outer segments. The observed GARP signal may originate solely from soluble GARP proteins. It is also unclear if CNGA1 and ROM1 colocalize in P45 rds-/- mice. Clarifying these points would strengthen the conclusion that lamellar formation, rather than specific function of PRPH2, is sufficient for CNGB1 delivery to the cilium or outer segment plasma membrane.

      RESPONSE: CNGA1/B1 are not expressed in cones, so the elevated outer segment localization observed at P45 must be coming from rods. In mouse retina, cones make up only 3% of the photoreceptor population. The SEM data clearly show that the lamellar ciliary protrusions are present on the majority of the photoreceptors. We now include CNGB1 staining from Rds-/- P45 sections that corroborate these data and show that CNGB1 is present at P45 and not P21 (Supplemental Figure 2).

      Below are minor comments: 1) The study does not establish whether a direct interaction between PRPH2 and CNGB1 is required for CNGB1 delivery to rod outer segments. Prior work by the senior author (ref 13) suggests that this interaction is not essential, since the PRPH2 binding site within the GARP domain is distinct from outer segment transport signal of CNGB1. Including a discussion of the PRPH2-GARP (or CNGB1) interaction and its relevance to CNGB1 trafficking would help readers interpret the findings more fully.

      RESPONSE: We have included this in our discussion.

      2) The authors propose that the ROM1 core is sufficient for outer segment delivery of CNGB1 based on experiments with chimeric constructs. However, in Figure 1, ROM1 is present in the outer segments (or ciliary ectosomes) of rds-/- mice even though CNGB1 is not delivered to these structures.

      RESPONSE: Our new data, including MS analysis and Western analysis from an enriched ectosome preparation, reveal that, along with ROM1, low levels of the CNG channel are delivered to ciliary ectosomes in Rds-/- mice. However, at this early timepoint photoreceptor cilia do not produce a membrane protrusion, which we observe is required to augment CNG delivery. We expressed a FLAG-ROM1 construct to try to drive earlier creation of these membrane protrusions, but this was unsuccessful, as we observed ROM1 was primarily localized to the inner segment. This suggests that overexpression of ROM1 did not increase ROM1 delivery to the cilia. Luckily, we were able to overcome this bottleneck with several of our chimeric ROM1/Prph2 constructs that did localize to the cilia and restore CNG localization. All of these new results have been included in the revised manuscript.

      3) Line 80: "Theouter" A space shall be inserted between "The" and "outer".

      RESPONSE: Done

      **Referee cross-commenting**

      Both reviewer #2 and reviewer #3 express views that align with mine. They clearly described the study's limitations, and their comments are highly valuable.

      Reviewer #1 (Significance (Required)):

      Prior studies showed that CNGB1 is not present in cilia-derived ectosomes of rds-/- mice, indicating that PRPH2 is necessary for ciliary or outer segment localization of CNGB1 in rods. Building on these earlier findings, I consider this study significant for the following reasons: 1) Using detailed analysis of different PRPH2 domains and chimeric constructs, it clarifies that PRPH2 core region, delivered to OSs, is essential and sufficient for OS localization of CNGB1. 2) PRPH2 and CNGB1 are thought to travel through different post-ER transport routes, with one pathway bypassing Golgi regions and the other passing through them. This study shows that CNGB1 depends on PRPH2, which suggests that these two routes may converge or interact at later stages and opens new directions for future investigation. 3) The study is relevant to basic scientists and biologists investigating how membrane structures acquire specialized functions in neurons, and its implications extend beyond photoreceptor biology.

      Limitation of the study: I believe that clarifying these points will make the manuscript more significant. 1) It is not clear, as mentioned above, how PRPH2 contributes to the delivery of CNGB1 to the OSs in the different secretory pathways.

      RESPONSE: In the absence of ROM1, Prph2 only travels through the unconventional secretory pathway directly from the ER. By looking at CNG trafficking and localization in ROM1-/- mice, we rule out the possibility that the small portion of PRPH2/ROM1 complexes that traffic conventionally through the Golgi are required for channel localization (Figure 3). Further, our Rho-Prph2 chimera that includes the trafficking signal from Prph2 did not rescue CNGB1 localization (Figure 4). These findings suggest that it is unlikely that these proteins engage during secretory transport to the outer segment.

      2) The prior study using a fluorescence complementation approach (Ritter et al, 2011) suggests that PRPH2 and CNGB1 can associate within rod ISs, likely before their delivery to OSs. However, it remains unclear whether this interaction supports the potential cotransport of CNGB1 and PRPH2 or whether the authors view these proteins as being transported independently.

      RESPONSE: As described above, our experiments rule out the notion that co-transport through the Golgi is driving CNG channel ciliary localization. We now note in our discussion that this data does not rule out the possibility of an earlier association between these proteins. However, the bulk of our data supports that any early interaction is not required for ciliary delivery.

      3) At the end of the result section (Figure 6, rds-/- P45), the authors suggest that lamellar formation (evaginations?) is required for CNGB1 transport. However, CNGB1 is normally not seen in evaginations or lamellar structures, and thus the assumption is not consistent with prior findings.

      RESPONSE: Absolutely, we agree that the CNG channel does not enter newly forming disc membranes, which has been shown by multiple groups. We included this in our discussion and have now added a clearer statement of our hypothesis: “Together, these data suggest that the partitioning of disc membranes from the plasma membrane by tetraspanin proteins is a key step for localizing the CNG channel and could play a role in segregating other proteins into the plasma membrane.”

      Overall, the manuscript is insightful and has the potential to advance our field and related disciplines.

      RESPONSE: Thanks!

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Cyclic nucleotide gated channels (CNG) localize to the plasma membrane of the rod photoreceptor outer segments, and are a key component of the phototransduction cascade. Understanding how outer segment proteins are trafficked and sequestered to the outer segments is an important field of investigation as it addresses both a fundamental aspect of cell biology and mechanism of disease, many of which have trafficking defects at the core of the pathogenic process. Using primarily IHC analysis of rodent models in combination with introduction of various expression constructs to the retina (through electroporation), this study finds that two rod outer segment structural proteins, peripherin-2 and ROM1, facilitate CNG channel localization to the outer segment.

      While this conclusion is interesting, a major concern that tempers enthusiasm is that in peripherin-2 null photoreceptors, there are no bona fide outer segments. In lieu of outer segments, there are rudimentary membranous protrusions and vesicles distal to the connecting cilia where outer segments should be. So the basis for concluding that peripherin-2 is required for CNG localization to the outer segment seems a bit wobbly. It is understood that the authors assumed the membranous materials distal to cilia as proxy for outer segments in their analysis and narrative. This assumption may have some merits. However, it is well known that when outer segment morphogenesis is severely compromised, all normally outer segment-bound proteins are ectopically localized or largely absent due to increased degradation. This could be simply due to the loss of their destination compartment, among other things. It is not clear how the authors could distinguish between a direct causal relationship where loss of one protein leads to the mislocalization of another, from secondary outcomes due to loss of the outer segments. The last sentence of the Abstract is telling. "Interestingly, this notion is supported by endogenous staining of CNGB1, which reappears in aged Rds-/- rods that have produced ciliary membrane protrusions." So in aged mice CNGB1 did localize to the OS, but what changed? There was more OS like material to house the CNGB1 protein in the aged mice.

      RESPONSE: We agree that the loss of the OS compartment is likely driving downregulation of all OS proteins and have included a statement as such in our manuscript. We also performed additional qRT-PCR analysis on ROM1 and ABCA4 to show global downregulation at the mRNA level – consistent with the notion that there are reduced outer segment proteins when morphogenesis is compromised. However, our Westerns and IHC (as well as published data) clearly find a specific decrease in the CNG channel at the protein level, suggesting that not all proteins behave similarly when the outer segment is not formed. We included additional discussion on this point as well. While not directly examined in our manuscript, previous reports have shown the reverse effect: some outer segment proteins (e.g. PCDH21, Prom1) are upregulated in rds-/- retinas (Rattner et al JBC 2004). Therefore, it is an oversimplification to state that all outer segment proteins behave the same when outer segments are not formed properly. Other models of outer segment dysmorphia (e.g. RhoKO, PCDH21KO, Prom1KO, or WASF3) localize the CNG channel properly. We have added this to the discussion and hope that by restructuring our manuscript, we clearly outline that we do think that membrane retention at the tip of the cilia is driving CNG channel localization and that molecularly the tetraspanin proteins play a role in organizing these membranes.

      Reviewer #2 (Significance (Required)):

      Trafficking of nascent proteins to the outer segment in support of its renewal is an important subject, which has significant impact in understanding the mechanisms of retinal degeneration. The conclusion from this study, that peripherin-2 and ROM1 have a direct role in supporting CNG subunit trafficking may well be meritorious. However the data presented are less than fully convincing, and specifically the question of a direct vs secondary effect needs to be better addressed.

      RESPONSE: We appreciate this reviewer’s enthusiasm for investigating this process. The initial premise of our study was to investigate whether a direct effect of peripherin-2 on CNG delivery was possible, which was meritorious based on previously published data. However, we now find no direct trafficking link between CNG and peripherin-2; instead, our data largely find that CNG delivery is dependent on the presence of retained membranes at the ciliary tip – either through natural mechanisms or by driving “rudimentary” outer segment membrane lamination by overexpression of tetraspanin domains. We have restructured the manuscript to help guide the discussion.

      The following quote underpins some of the reasoning in the study. Lines 139-144, "(Figure 2A). This localization pattern suggests that the CNGB1 subunit is trapped in the biosynthetic pathway. In contrast, when FLAG-tagged rhodopsin is overexpressed in Rds-/- rods it traffics properly to outer segment ectosomes (Figure 2B, (19)). We posit that without proper exit from the biosynthetic pathway, the endogenous CNGB1 protein is rapidly degraded to undetectable levels, which we circumvent through overexpression. These data suggest the localization defect of CNGB1 in Rds-/- rods is in the trafficking of CNGB1." This in my view is an over-interpretation of limited data. The statement implies that rhodopsin and CNGB1 qualitatively differ in their fate but I would argue that both proteins are heavily degraded intracellularly except more of rhodopsin escaped to the "OS" and shows up in IHC. In many rhodopsin mutant transgenic mice, mutant rhodopsin appeared in OS even though intracellular degradation (gumming up the system) is a major factor in the disease process. The claim "rhodopsin trafficked properly to outer segment ectosomes" is not grounded in solid data.

      RESPONSE: We do fundamentally agree that the endogenous CNG channel is heavily degraded, which we confirm by overexpressing an exogenous CNGB1-myc and finding it trapped in the biosynthetic pathway. As stated by the reviewer, this localization pattern is in contrast to what we and others have observed for endogenous rhodopsin, and now show for overexpressed FLAG-rhodopsin – that rhodopsin does traffic to the OS ectosomes. By comparing the localization of both endogenous and overexpressed constructs (using the same promoter), we feel that our conclusion is well supported. We appreciate that our wording of “rhodopsin trafficked properly to the outer segment” is misleading, as traffic of membrane proteins in Rds-/- rods is generally affected and not “proper”. Importantly, we follow up this “limited data” with additional experiments showing that at high expression levels, we are unable to drive CNGB1 localization to OS ectosomes unless we co-express with a tetraspanin domain.

      A further minor comment is that the scope of the study appears limited, with no attempted experiments on how these proteins might interact to effect facilitation of trafficking.

      RESPONSE: Our approach was to be agnostic to the outcome of our hypothesis that peripherin-2 was directly involved in CNG channel trafficking. The experiments we performed to test this (ROM1-/- analysis and Prph2 C-terminal chimeras) did not support a role for peripherin-2 in CNG trafficking. Instead, our data support a model in which membrane retention and organization at the ciliary tip drives CNG channel delivery. We feel that our approach was not limited.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      in the gene encoding tetraspanin protein peripherin 2 (Prph2), i.e., Rds-/-, examining the requirements for various portions of the Prph2 protein in the context of an assortment of chimeric constructs expressed via transfection into photoreceptor cells, to restore localization of the beta subunit of the cyclic nucleotide-gated channel (CNGbeta1) to photoreceptor outer segments (OS) (in a small number of experiments) or, in the majority of experiments, to do so for a recombinant tagged version of this protein also overexpressed by transfection.

      The concluding sentences of the Discussion, which summarize the major conclusions, are as follows: "Our data clearly show that localization of the CNG channel is dependent upon peripherin-2 after biosynthetic exit, further suggesting that the necessary action is at the ciliary base. Supporting evidence for this comes from analysis of Rhodopsin knockout outer segments which have internal disc-like structures and localize CNG channel properly. Therefore, in the absence of a fully elaborated outer segment, peripherin-2's ability to delineate a disc is sufficient to drive CNG channel delivery. Together, these data suggest that the partitioning of disc membranes from the plasma membrane by tetraspanin proteins is a key step for trafficking the CNG channel and could play a role in segregating other proteins into the plasma membrane."

      The first sentence contains both reasonable conclusions and phrases whose meaning is unclear or not supported by the results presented. The statement, "localization of the CNG channel is dependent upon peripherin-2," is supported by the data but, of course, has long been known from previous studies of Rds-/- mice. What is meant by "...after biosynthetic exit..." is unclear. If, by this term, apparently newly invented, the authors mean "after synthesis of the protein is complete," the statement is accurate, but also a truism.

      RESPONSE: The absence of CNGB1 was reported in previous studies, but the mechanism driving its absence has not been investigated. In our resubmission, we have added additional data that now shows CNGB1 is present at very low levels in Rds-/- ectosomes but remains undetectable by IHC, which is consistent with previous studies mentioned by the reviewer, but is also a novel finding. Importantly, we find specific downregulation of CNG channel subunits in Rds-/- retinas compared to ABCA4, supported by Western blot analysis (Figure 1), and we investigate the mechanism driving this result.

      We appreciate the reviewer pointing out that “biosynthetic exit” is a niche term not broadly understood. We have removed this statement.

      The statement, "the necessary action is at the ciliary base," is NOT supported by the data presented, as the effect of the "successful" Prph2 constructs on CNGbeta1 localization is primarily to increase its levels at the distal end of cilia and at the base of OS-related structures formed in response to the presence of the Prph2 constructs. The restoration of these membranes, which, as the authors note, has been previously reported, is overwhelmingly the biggest effect of these constructs, and it could be argued that the restored localization, rather than degradation, of CNGbeta1 is merely a downstream consequence of the formation of these structures, with perhaps, an element of stabilization of CNGbeta1 toward degradation from direct binding to Prph2, which has also been previously reported.

      RESPONSE: We agree with the reviewer. Our interpretation of our data is that the presence of Prph2 (or its variants) at the distal end of the cilia localizes CNGB1, likely due to the formation of outer segment membrane structures. Previous to this work, there was a possibility that targeting information of Prph2 was required for CNGB1. That had never been explored. We definitively rule this possibility out when we express the C-terminal tail of Prph2, which is unable to rescue CNGB1 localization. Because the tetraspanin domain of Prph2 (or ROM1) can localize CNGB1, we do agree that the definition of an outer segment structure is the driving force for CNGB1 delivery – these are new findings. We’ve restructured and added additional discussion to the manuscript to clarify this point.

      The next suggested conclusion is, "Therefore, in the absence of a fully elaborated outer segment, peripherin-2's ability to delineate a disc is sufficient to drive CNG channel delivery," is partly accurate and partly misleading. If the word "localization" were to replace the term, "delivery," concerning which there are no data (aside from those confirming that Prph2 and CNGbeta1 pass through distinct secretory pathways), this statement would be an accurate summary.

      RESPONSE: We have updated to “localization”, but the fact that we confirm these two proteins do not traffic together through the Golgi would suggest that delivery is independent of trafficking.

      The final sentence, "Together, these data suggest that the partitioning of disc membranes from the plasma membrane by tetraspanin proteins is a key step for trafficking the CNG channel and could play a role in segregating other proteins into the plasma membrane," would also be accurate if the word "localization" were to replace the term "trafficking." The key point for these qualifications is that the experiments presented measure steady state levels of CNGbeta1 constructs at certain locations, which are determined not only by rates of trafficking, but also rates of synthesis and degradation, and the data presented confirm that total levels of CNGbeta1 are greatly diminished in the absence of functional Prph2, rendering any conclusions about the relative roles of trafficking kinetics and degradation kinetics speculative in nature.

      RESPONSE: We agree and have revised.

      Aside from these major conceptual issues, there is one overriding technical question: why are almost all the experiments presented carried out with a highly over-expressed engineered version of CNGb1 with a tag, which is clearly a context far from the physiological one, as opposed to examining redistribution of the endogenous CNGbeta1, which is of much greater interest? In some results relegated to a Supplemental figure (Supp. Fig. 2), the authors clearly demonstrate that sufficient signal can be obtained from immunofluorescence staining of the endogenous proteins for such experiments to be readily interpretable. If the concern was cross-reactivity with non-covalently attached GARP proteins, a few experiments showing that similar results are obtained for immunostaining of the endogenous protein or of the tagged construct would have been sufficient, and the paper could have had more physiological relevance and impact.

      RESPONSE: We agree that endogenous CNG staining is important and valuable, which is why we included it in our manuscript. We were able to confirm that overexpressed CNG recapitulated the endogenous staining. We proceeded with analyzing overexpressed, tagged CNG for the reasons stated by the reviewer. Yes, cross-reactivity with soluble GARP proteins was one consideration, as was the fact that the GARP antibody is a mouse monoclonal antibody. Increased IgG due to inflammation in the RDS-/- model can obscure the outer segment region in these retinas, confounding our quantification. The tagged versions of CNGB1 and corresponding quantification offered the most clarity and continuity for the reader; therefore, we relegate the endogenous staining to the supplement.

      The remaining concerns are generally of less significance and mostly conceptual or quite minor technical concerns. Technically, the imaging data and their quantification are of good quality and analyzed with reasonable rigor.

      RESPONSE: Thanks!

      Abstract: "In this study, we investigate how peripherin-2 is engaged in CNG channel delivery to the outer segment." Might this not be more a question of how the absence of properly formed discs impacts the formation of outer segments with plasma membranes surrounding the discs? Is this really a question of "delivery" or "lack of address to make the delivery"?

      RESPONSE: Our interpretation of this comment is that it boils down to semantics. Delivery is inclusive of both trafficking and localization, which we investigate in our manuscript.

      Page 3, "fluorescence complementation between peripherin-2 and CNGb1 in the inner segment of transgenic Xenopus rods (23) ". The wording is unclear. It should be stated clearly that they are describing results of "bimolecular fluorescence complementation assays" of highly overexpressed recombinant proteins expressed from transgenes.

      RESPONSE: We have revised.

      Page 4, "...trapped in the biosynthetic pathway," It is unclear what the authors mean by this phrase. Obviously, "biosynthesis," i.e., translation is indeed complete, but biochemical pathways are not places. Is the intention to suggest that post-translational processing, such as addition and editing of carbohydrate chains or assembly with the alpha subunit has not been completed? If so, it would be better just to say so clearly. Or, is it meant to imply that it is physically "trapped" in the ER and/or Golgi apparatus? In any case the meaning should be made clear. Co-staining with ER and Golgi markers would have been very informative with respect to the compartments in which the highly overexpressed recombinant protein is trapped.

      RESPONSE: We acknowledge that our phrasing here was indirect. We have revised. Co-staining with Calnexin (an ER-marker) was attempted, but proved to be uninformative.

      It should also be noted that accumulation of highly overexpressed membrane proteins within internal membranes and membrane aggregates is a very commonly observed experimental phenomenon, and not restricted to the highly specialized trafficking routes in photoreceptors.

      RESPONSE: We agree that exogenous expression of membrane proteins can lead to increased presence within internal membranes of the inner segment, which we routinely see in our experiments. Importantly, our analysis is restricted to the ability of these exogenously expressed proteins to reach the ciliary compartment in Rds mice. We also conduct these experiments in wild-type retinas to ensure that our constructs are expressed, and the proteins reach the ciliary outer segment under normal conditions.

      Page 4, " peripherin-2 facilitates trafficking of the CNGb1 subunit to the outer segment " The data presented to this point do not demonstrate an enhancement of transport, but only of steady-state levels. There is nothing to rule out the possibility that some beta subunit is trafficked in Rds-/-, but is unstable to degradation in the region near the cilium when peripherin-2 and outer segments are not available. An increase in transport is certainly a possible explanation for the results, but should not be taken as an unambiguous conclusion.

      RESPONSE: We have altered the description of these results to allow for more interpretation of our data, which show that CNGB1 delivery to the outer segment is reduced in Rds-/- mice and enhanced when peripherin-2 is re-expressed.

      Page 4, "We confirmed that the fraction of peripherin-2 that traffics conventionally through the Golgi is indeed absent in Rom1-/- retinas and found that trafficking of the CNG channel via the conventional pathway is unaffected (Figure 3A)." This is one of the stronger and more interesting results in this manuscript, and tilts the argument against trafficking as being the mechanism for enhancement by overexpressed peripherin-2 of beta subunit levels in the distal region of the photoreceptor layer.

      RESPONSE: We agree.

      Page 5, "Our finding that secretory trafficking of peripherin-2 and CNGb1 is distinct" Clumsy syntax - needs to be rewritten for clarity.

      RESPONSE: Revised

      Page 5, "two previously characterized fusion proteins... have been shown to localize to the outer segment and build a rudimentary membrane structure (19) " This previous result, which is critical to interpretation of the results in this manuscript, should be introduced early, before any experimental results using related constructs are presented, in order to avoid confusion.

      RESPONSE: Prior to these experiments, we used only full-length peripherin-2, rhodopsin, or CNGB1. This paragraph is the first introduction of any chimeric protein, and we explain these two constructs thoroughly. We believe this satisfies this reviewer’s request.

      Page 5, " We confirmed these data by staining for endogenous CNGb1 in Rds-/- rods electroporated with each construct (Supplemental Figure 2B,C) " This is the most informative result in this manuscript with regard to the ability of these constructs to restore proper localization of CNGB1- it is not clear that the overexpression constructs for CNGB1 present any advantage beyond stronger signal and they may not be assumed, a priori, to be faithfully reporting on interactions of Prph2 with endogenous CNGB1, which is the biologically significant question. A big problem with Supp. Fig. 2 is that there is no real control, i.e., one without any Prph2 construct electroporated. Even the Rho-Prph2CT construct has some ROS-related structures and some CNGB1 localized to the one shown at higher magnification. The Prph2-RhoCT construct seems to lead to a substantial increase in endogenous CNGB1 in inner segment membranes. This looks like a phenomenon that is potentially very interesting, although it doesn't fit with any of the models put forth in the manuscript.

      RESPONSE: We agree that endogenous staining (shown in Supplemental Figure 3 of our revised manuscript) is informative, but it was technically challenging. Once we verified that our overexpression system recapitulated results for endogenous CNGB1, we went forward with the epitope-tagged CNGB1, which was clearer when quantifying CNGB1 localization to rudimentary outer segments.

      Our electroporation method provides an excellent internal control, as all of the non-electroporated cells show no endogenous CNGB1 localization without peripherin expression (Sup Fig 3A).

      Page 5, " cytosolic N- and C-termini of peripherin-2 are dispensable for CNGb1 outer segment localization " No- if you could simply remove them and get proper localization, that would show they are "dispensable." In these experiments they are always replaced with the corresponding region of some other protein that is localized to OS, or in one case, with 3 copies of the FLAG tag at the N-terminus. There are also clear differences in the efficacy of the different "successful" constructs, but these results and their implications are not really discussed.

      RESPONSE: We make this statement in the context of these termini being dispensable to CNGB1 localization, not to peripherin-2’s stability, function, or localization. A complete truncation of either domain results in a non-functioning protein. Our supplemental data shows reduced expression with a truncated N-terminus, preventing analysis (Sup Fig 5C). The 3X-FLAG has no known function in the cell, and we believe it serves as a proxy for removing the N-terminus altogether. Removing the C-terminus would prevent proper outer segment targeting, which is key to determining how peripherin-2 impacts CNGB1 ciliary delivery. Replacing this C-terminus with an outer segment targeting domain from another protein is an established method of investigation.

      Page 6, " We then wanted to determine whether the ROM1 tetraspanin region was sufficient to facilitate CNGb1 delivery by further replacing ROM1's cytoplasmic N-terminus with that of peripherin-2 (Prph2NT/CT-ROM1) . " This experiment obviously does NOT test "sufficiency" of the TM segments, as the construct has the termini replaced with the corresponding regions of Prph2, which might functionally substitute for the missing ROM1 regions.

      RESPONSE: Our previous results had already ruled out a role for these termini in CNGB1 localization.

      Page 6, " We show a dramatic increase in GARP staining in the aged Rds-/- retinal sections " The age dependence of this phenomenon is quite interesting and puzzling. Any thoughts on the mechanism?

      RESPONSE: We agree that this natural process is very interesting. We have restructured the order of our figures and provided additional controls to support this finding. We have added this to the discussion and hope that by restructuring our manuscript, we clearly outline that we do think that membrane retention at the tip of the cilia is driving CNG channel localization and that molecularly the tetraspanin proteins play a role in organizing these membranes.

      Page 6, " Although CNGα1, known to form homotetramers, can localize to the extracellular vesicles released into the outer segment area. " Not a sentence.

      RESPONSE: Revised

      Page 6, " Our data now shows that the population of peripherin-2 in complex with ROM1 that travels through the conventional trafficking pathway does not play a role in CNGb1 localization to the outer segment. " This is an oddly accurate, albeit somewhat contradictory sentence. Yes, you have failed to answer the question you claim this work was designed to address. Apart from this negative result, nothing is learned about trafficking, per se, from the experiments in this manuscript.

      RESPONSE: Please see our response to the reviewer’s comment above that clarifies our thinking regarding our results on trafficking.

      Page 7, " anticipated " Hopefully, the authors mean to say, "hypothesized," here.

      RESPONSE: Revised

      **Referee cross-commenting**

      My impression from reading the reviewers' comments is that there is general agreement on both the strengths and the limitations of this work. In my opinion, the issues raised by the reviewers could be addressed by editing the manuscript to be more circumspect in drawing definite conclusions from data that are not fully conclusive, without necessarily adding new experiments.

      Reviewer #3 (Significance (Required)):

      This study addresses a problem of great interest in the photoreceptor field and in cell biology more generally: the trafficking and localization of specialized membrane proteins to specialized ciliary membranes. The strengths are technical quality of data with good controls, in most cases. The limitations are largely conceptual in nature and derive from the rather simplistic approach to the experimental design, as described above. The rather dated "mix and match" approach, based on chimeric constructs with pieces of sequence removed and replaced at will, does not properly account for the conclusion reached many times from many experiments, including some in this manuscript, that the "roles" of stretches of amino acid sequence depend exquisitely on the multidimensional context in which they are tested, not simply on their position in the linear sequence. The paper presents interesting and convincing results with respect to functional requirements for formation of disc-like membranes, but very little with respect to "trafficking."

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      in the gene encoding tetraspanin protein peripherin 2 (Prph2), i.e., Rds-/-, examining the requirements for various portions of the Prph2 protein in the context of an assortment of chimeric constructs expressed via transfection into photoreceptor cells, to restore localization of the beta subunit of the cyclic nucleotide-gated channel (CNGbeta1) to photoreceptor outer segments (OS) (in a small number of experiments) or, in the majority of experiments, to do so for a recombinant tagged version of this protein also overexpressed by transfection.

      The concluding sentences of the Discussion, which summarize the major conclusions, are as follows: "Our data clearly show that localization of the CNG channel is dependent upon peripherin-2 after biosynthetic exit, further suggesting that the necessary action is at the ciliary base. Supporting evidence for this comes from analysis of Rhodopsin knockout outer segments which have internal disc-like structures and localize CNG channel properly. Therefore, in the absence of a fully elaborated outer segment, peripherin-2's ability to delineate a disc is sufficient to drive CNG channel delivery. Together, these data suggest that the partitioning of disc membranes from the plasma membrane by tetraspanin proteins is a key step for trafficking the CNG channel and could play a role in segregating other proteins into the plasma membrane."

      The first sentence contains both reasonable conclusions and phrases whose meaning is unclear or not supported by the results presented. The statement, "localization of the CNG channel is dependent upon peripherin-2," is supported by the data but, of course, has long been known from previous studies of Rds-/- mice. What is meant by "...after biosynthetic exit..." is unclear. If, by this term, apparently newly invented, the authors mean "after synthesis of the protein is complete," the statement is accurate, but also a truism. The statement, "the necessary action is at the ciliary base," is NOT supported by the data presented, as the effect of the "successful" Prph2 constructs on CNGbeta1 localization is primarily to increase its levels at the distal end of cilia and at the base of OS-related structures formed in response to the presence of the Prph2 constructs. The restoration of these membranes, which, as the authors note, has been previously reported, is overwhelmingly the biggest effect of these constructs, and it could be argued that the restored localization, rather than degradation, of CNGbeta1 is merely a downstream consequence of the formation of these structures, with, perhaps, an element of stabilization of CNGbeta1 against degradation from direct binding to Prph2, which has also been previously reported.

      The next suggested conclusion, "Therefore, in the absence of a fully elaborated outer segment, peripherin-2's ability to delineate a disc is sufficient to drive CNG channel delivery," is partly accurate and partly misleading. If the word "localization" were to replace the term "delivery," concerning which there are no data (aside from those confirming that Prph2 and CNGbeta1 pass through distinct secretory pathways), this statement would be an accurate summary. The final sentence, "Together, these data suggest that the partitioning of disc membranes from the plasma membrane by tetraspanin proteins is a key step for trafficking the CNG channel and could play a role in segregating other proteins into the plasma membrane," would also be accurate if the word "localization" were to replace the term "trafficking." The key point for these qualifications is that the experiments presented measure steady-state levels of CNGbeta1 constructs at certain locations, which are determined not only by rates of trafficking, but also by rates of synthesis and degradation, and the data presented confirm that total levels of CNGbeta1 are greatly diminished in the absence of functional Prph2, rendering any conclusions about the relative roles of trafficking kinetics and degradation kinetics speculative in nature.

      Aside from these major conceptual issues, there is one overriding technical question: why are almost all the experiments presented carried out with a highly over-expressed engineered version of CNGb1 with a tag, which is clearly a context far from the physiological one, as opposed to examining redistribution of the endogenous CNGbeta1, which is of much greater interest? In some results relegated to a Supplemental figure (Supp. Fig. 2), the authors clearly demonstrate that sufficient signal can be obtained from immunofluorescence staining of the endogenous proteins for such experiments to be readily interpretable. If the concern was cross-reactivity with non-covalently attached GARP proteins, a few experiments showing that similar results are obtained for immunostaining of the endogenous protein or of the tagged construct would have been sufficient, and the paper could have had more physiological relevance and impact.

      The remaining concerns are generally of less significance and mostly conceptual or quite minor technical concerns. Technically, the imaging data and their quantification are of good quality and analyzed with reasonable rigor.

      Abstract: "In this study, we investigate how peripherin-2 is engaged in CNG channel delivery to the outer segment." Might this not be more a question of how the absence of properly formed discs impacts the formation of outer segments with plasma membranes surrounding the disks? Is this really a question of "delivery" or "lack of address to make the delivery"?

      Page 3, "fluorescence complementation between peripherin-2 and CNGβ1 in the inner segment of transgenic Xenopus rods (23)". The wording is unclear. It should be stated clearly that they are describing results of "bimolecular fluorescence complementation assays" of highly overexpressed recombinant proteins expressed from transgenes.

      Page 4, "...trapped in the biosynthetic pathway," It is unclear what the authors mean by this phrase. Obviously, "biosynthesis," i.e., translation is indeed complete, but biochemical pathways are not places. Is the intention to suggest that post-translational processing, such as addition and editing of carbohydrate chains or assembly with the alpha subunit has not been completed? If so, it would be better just to say so clearly. Or, is it meant to imply that it is physically "trapped" in the ER and/or Golgi apparatus? In any case the meaning should be made clear. Co-staining with ER and Golgi markers would have been very informative with respect to the compartments in which the highly overexpressed recombinant protein is trapped. It should also be noted that accumulation of highly overexpressed membrane proteins within internal membranes and membrane aggregates is a very commonly observed experimental phenomenon, and not restricted to the highly specialized trafficking routes in photoreceptors.

      Page 4, "peripherin-2 facilitates trafficking of the CNGβ1 subunit to the outer segment" The data presented to this point do not demonstrate an enhancement of transport, but only of steady-state levels. There is nothing to rule out the possibility that some beta subunit is trafficked in Rds-/-, but is unstable to degradation in the region near the cilium when peripherin-2 and outer segments are not available. An increase in transport is certainly a possible explanation for the results, but should not be taken as an unambiguous conclusion.

      Page 4, "We confirmed that the fraction of peripherin-2 that traffics conventionally through the Golgi is indeed absent in Rom1-/- retinas and found that trafficking of the CNG channel via the conventional pathway is unaffected (Figure 3A)." This is one of the stronger and more interesting results in this manuscript, and tilts the argument against trafficking as being the mechanism for enhancement by overexpressed peripherin-2 of beta subunit levels in the distal region of the photoreceptor layer.

      Page 5, "Our finding that secretory trafficking of peripherin-2 and CNGβ1 is distinct." Clumsy syntax; needs to be rewritten for clarity.

      Page 5, "two previously characterized fusion proteins... have been shown to localize to the outer segment and build a rudimentary membrane structure (19) " This previous result, which is critical to interpretation of the results in this manuscript, should be introduced early, before any experimental results using related constructs are presented, in order to avoid confusion.

      Page 5, "We confirmed these data by staining for endogenous CNGβ1 in Rds-/- rods electroporated with each construct (Supplemental Figure 2B,C)" This is the most informative result in this manuscript with regard to the ability of these constructs to restore proper localization of CNGB1; it is not clear that the overexpression constructs for CNGB1 present any advantage beyond stronger signal, and they may not be assumed, a priori, to be faithfully reporting on interactions of Prph2 with endogenous CNGB1, which is the biologically significant question. A big problem with Supp. Fig. 2 is that there is no real control, i.e., one without any Prph2 construct electroporated. Even the Rho-Prph2CT construct has some ROS-related structures and some CNGB1 localized to the one shown at higher magnification. The Prph2-RhoCT construct seems to lead to a substantial increase in endogenous CNGB1 in inner segment membranes. This looks like a phenomenon that is potentially very interesting, although it doesn't fit with any of the models put forth in the manuscript.

      Page 5, "cytosolic N- and C-termini of peripherin-2 are dispensable for CNGβ1 outer segment localization" No: if you could simply remove them and get proper localization, that would show they are "dispensable." In these experiments they are always replaced with the corresponding region of some other protein that is localized to OS, or in one case, with 3 copies of the FLAG tag at the N-terminus. There are also clear differences in the efficacy of the different "successful" constructs, but these results and their implications are not really discussed.

      Page 6, "We then wanted to determine whether the ROM1 tetraspanin region was sufficient to facilitate CNGβ1 delivery by further replacing ROM1's cytoplasmic N-terminus with that of peripherin-2 (Prph2NT/CT-ROM1)." This experiment obviously does NOT test "sufficiency" of the TM segments, as the construct has the termini replaced with the corresponding regions of Prph2, which might functionally substitute for the missing ROM1 regions.

      Page 6, " We show a dramatic increase in GARP staining in the aged Rds-/- retinal sections " The age dependence of this phenomenon is quite interesting and puzzling. Any thoughts on the mechanism?

      Page 6, " Although CNGα1, known to form homotetramers, can localize to the extracellular vesicles released into the outer segment area. " Not a sentence.

      Page 6, "Our data now shows that the population of peripherin-2 in complex with ROM1 that travels through the conventional trafficking pathway does not play a role in CNGβ1 localization to the outer segment." This is an oddly accurate, albeit somewhat contradictory sentence. Yes, you have failed to answer the question you claim this work was designed to address. Apart from this negative result, nothing is learned about trafficking, per se, from the experiments in this manuscript.

      Page 7, " anticipated " Hopefully, the authors mean to say, "hypothesized," here.

      Referee cross-commenting

      My impression from reading the reviewers' comments is that there is general agreement on both the strengths and the limitations of this work. In my opinion, the issues raised by the reviewers could be addressed by editing the manuscript to be more circumspect in drawing definite conclusions from data that are not fully conclusive, without necessarily adding new experiments.

      Significance

      This study addresses a problem of great interest in the photoreceptor field and in cell biology more generally: the trafficking and localization of specialized membrane proteins to specialized ciliary membranes. The strengths are technical quality of data with good controls, in most cases. The limitations are largely conceptual in nature and derive from the rather simplistic approach to the experimental design, as described above. The rather dated "mix and match" approach, based on chimeric constructs with pieces of sequence removed and replaced at will, does not properly account for the conclusion reached many times from many experiments, including some in this manuscript, that the "roles" of stretches of amino acid sequence depend exquisitely on the multidimensional context in which they are tested, not simply on their position in the linear sequence. The paper presents interesting and convincing results with respect to functional requirements for formation of disc-like membranes, but very little with respect to "trafficking."

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Cyclic nucleotide-gated (CNG) channels localize to the plasma membrane of the rod photoreceptor outer segments, and are a key component of the phototransduction cascade. Understanding how outer segment proteins are trafficked and sequestered to the outer segments is an important field of investigation as it addresses both a fundamental aspect of cell biology and mechanisms of disease, many of which have trafficking defects at the core of the pathogenic process. Using primarily IHC analysis of rodent models in combination with introduction of various expression constructs to the retina (through electroporation), this study finds that two rod outer segment structural proteins, peripherin-2 and ROM1, facilitate CNG channel localization to the outer segment.

      While this conclusion is interesting, a major concern that tempers enthusiasm is that in peripherin-2 null photoreceptors, there are no bona fide outer segments. In lieu of outer segments, there are rudimentary membranous protrusions and vesicles distal to the connecting cilia where outer segments should be. So the basis for concluding that peripherin-2 is required for CNG localization to the outer segment seems a bit wobbly. It is understood that the authors took the membranous materials distal to cilia as a proxy for outer segments in their analysis and narrative. This assumption may have some merit. However, it is well known that when outer segment morphogenesis is severely compromised, all normally outer segment-bound proteins are ectopically localized or largely absent due to increased degradation. This could be simply due to the loss of their destination compartment, among other things. It is not clear how the authors could distinguish between a direct causal relationship, where loss of one protein leads to the mislocalization of another, and secondary outcomes due to loss of the outer segments. The last sentence of the Abstract is telling: "Interestingly, this notion is supported by endogenous staining of CNGB1, which reappears in aged Rds-/- rods that have produced ciliary membrane protrusions." So in aged mice CNGB1 did localize to the OS, but what changed? There was more OS-like material to house the CNGB1 protein in the aged mice.

      Significance

      Trafficking of nascent proteins to the outer segment in support of its renewal is an important subject, which has significant impact on understanding the mechanisms of retinal degeneration. The conclusion from this study, that peripherin-2 and ROM1 have a direct role in supporting CNG subunit trafficking, may well be meritorious. However, the data presented are less than fully convincing, and specifically the question of a direct vs. secondary effect needs to be better addressed.

      The following quote underpins some of the reasoning in the study. Lines 139-144, "(Figure 2A). This localization pattern suggests that the CNGB1 subunit is trapped in the biosynthetic pathway. In contrast, when FLAG-tagged rhodopsin is overexpressed in Rds-/- rods it traffics properly to outer segment ectosomes (Figure 2B, (19)). We posit that without proper exit from the biosynthetic pathway, the endogenous CNGB1 protein is rapidly degraded to undetectable levels, which we circumvent through overexpression. These data suggest the localization defect of CNGB1 in Rds-/- rods is in the trafficking of CNGB1." This in my view is an over-interpretation of limited data. The statement implies that rhodopsin and CNGB1 qualitatively differ in their fate but I would argue that both proteins are heavily degraded intracellularly except more of rhodopsin escaped to the "OS" and shows up in IHC. In many rhodopsin mutant transgenic mice, mutant rhodopsin appeared in OS even though intracellular degradation (gumming up the system) is a major factor in the disease process. The claim "rhodopsin trafficked properly to outer segment ectosomes" is not grounded in solid data.

      A further minor comment is that the scope of the study appears limited, with no attempted experiments on how these proteins might interact to facilitate trafficking.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: In this manuscript, the authors examine how peripherin-2 (PRPH2) contributes to the localization of CNGβ1 within rod outer segment structures. PRPH2 and its homolog ROM1 are structural components of rod discs and are required for disc morphogenesis. In the absence of PRPH2, rod outer segments do not form, and various outer segment materials accumulate and are released as cilia-derived ectosomes. PRPH2 is thought to be transported through an unconventional secretory pathway, whereas cGMP-gated channels follow a conventional trafficking route. Although these components reach the outer segment through distinct pathways, PRPH2 is necessary for the proper delivery of CNGB1, a subunit of the cGMP-gated channel, to its correct destination.

      It was previously reported that a small fraction of PRPH2 reaches the outer segments through the conventional pathway when it forms a complex with Rom1 in mouse photoreceptors. Using Rom1 KO mice, the authors show that this conventionally trafficked PRPH2 fraction is not required for CNGB1 transport to the outer segment. Using various chimeric constructs, the authors verified that the tetraspanin core of PRPH2, delivered to the OS, is sufficient to promote OS localization of CNGB1. The Ct and Nt cytoplasmic regions of PRPH2 are dispensable for this role. Overall, the majority of the experiments are well-executed with statistical rigor, written in a way that others can reproduce, and support the major conclusion indicated in the title, "PRPH2 is essential for OS localization of CNGB1".

      Major comments: I believe that the majority of the conclusions are well-supported in this manuscript. Below, I am listing the major points that may need additional experiments or clarifications:

      1) CNGA1 subunit is transported to and enriched within ciliary exosomes or the outer segment in PRPH2 deficient mice (Figure 1). The reduced levels of CNGA1 and CNGB1 in rds-/- mice suggest limited stability of these proteins. Their diminished abundance is also influenced by decreased mRNA expression of the corresponding genes. These findings imply that CNGB1 may not be essential for outer segment delivery of cGMP-gated channels if CNGA1 alone contains adequate targeting information. Related to these points, it is unclear whether CNGB1 exhibits a trafficking defect or encounters other problems before leaving the endoplasmic reticulum. Such problems may involve deficiencies in folding, holo-channel assembly, or related quality control processes.

      2) CNGB1 overexpression in rds-/- mice does not result in outer segment localization of CNGB1 channels (Figure 2A). These findings do not clarify whether CNGB1 successfully transits through the Golgi apparatus or associates properly with CNGA1 subunits. Elevating expression levels alone would not compensate for problems in folding or assembly.

      3) Claims related to Figure 6 (P45 rds-/-) need further evidence. It remains uncertain whether CNGA1 and CNGB1 are delivered to lamellar ciliary membranes or to a distinct plasma membrane compartment comparable to that observed in wild type rod outer segments, or whether they accumulate in ciliary ectosomes. Those lamellar structures could be a part of cone outer segments. The observed GARP signal may originate solely from soluble GARP proteins. It is also unclear if CNGA1 and ROM1 colocalize in P45 rds-/- mice. Clarifying these points would strengthen the conclusion that lamellar formation, rather than specific function of PRPH2, is sufficient for CNGB1 delivery to the cilium or outer segment plasma membrane.

      Below are minor comments:

      1. The study does not establish whether a direct interaction between PRPH2 and CNGB1 is required for CNGB1 delivery to rod outer segments. Prior work by the senior author (ref 13) suggests that this interaction is not essential, since the PRPH2 binding site within the GARP domain is distinct from outer segment transport signal of CNGB1. Including a discussion of the PRPH2-GARP (or CNGB1) interaction and its relevance to CNGB1 trafficking would help readers interpret the findings more fully.
      2. The authors propose that the ROM1 core is sufficient for outer segment delivery of CNGB1 based on experiments with chimeric constructs. However, in Figure 1, ROM1 is present in the outer segments (or ciliary ectosomes) of rds-/- mice even though CNGB1 is not delivered to these structures.
      3. Line 80: "Theouter" A space shall be inserted between "The" and "outer".

      Referee cross-commenting

      Both reviewer #2 and reviewer #3 express views that align with mine. They clearly described the study's limitations, and their comments are highly valuable.

      Significance

      Prior studies showed that CNGB1 is not present in cilia-derived ectosomes of rds-/- mice, indicating that PRPH2 is necessary for ciliary or outer segment localization of CNGB1 in rods. Building on these earlier findings, I consider this study significant for the following reasons:

      1) Using detailed analysis of different PRPH2 domains and chimeric constructs, it clarifies that PRPH2 core region, delivered to OSs, is essential and sufficient for OS localization of CNGB1.

      2) PRPH2 and CNGB1 are thought to travel through different post-ER transport routes, with one pathway bypassing Golgi regions and the other passing through them. This study shows that CNGB1 depends on PRPH2, which suggests that these two routes may converge or interact at later stages and opens new directions for future investigation.

      3) The study is relevant to basic scientists and biologists investigating how membrane structures acquire specialized functions in neurons, and its implications extend beyond photoreceptor biology.

      Limitation of the study:

      I believe that clarifying these points will make the manuscript more significant.

      1) It is not clear, as mentioned above, how PRPH2 contributes to the delivery of CNGB1 to the OSs in the different secretory pathways.

      2) The prior study using a fluorescence complementation approach (Ritter et al, 2011) suggests that PRPH2 and CNGB1 can associate within rod ISs, likely before their delivery to OSs. However, it remains unclear whether this interaction supports the potential cotransport of CNGB1 and PRPH2 or whether the authors view these proteins as being transported independently.

      3) At the end of the result section (Figure 6, rds-/- P45), the authors suggest that lamellar formation (evaginations?) is required for CNGB1 transport. However, CNGB1 is normally not seen in evaginations or lamellar structures, and thus the assumption is not consistent with prior findings.

      Overall, the manuscript is insightful and has the potential to advance our field and related disciplines.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The study by Amal et al. investigates how signaling cues regulate epithelial permeability using Drosophila oogenesis as a model system. During mid-oogenesis, a process known as patency occurs, in which tricellular junctions within the follicular epithelium transiently open, allowing yolk proteins to be transported from the hemolymph to the oocyte. The authors demonstrate that the spatial pattern of patency along the anterior-posterior axis of the egg chamber is inversely correlated with the activity gradient of TGF-β signaling. They further show that TGF-β signaling inhibits vertex opening and influences both actomyosin contractility and DE-cadherin levels. Importantly, although DE-cadherin is required for the TGF-β-dependent suppression of vertex opening, elevated actomyosin contractility itself does not appear to be required for this effect. Overall, this is a well-executed study that links a tissue patterning signal to the regulation of epithelial permeability. The experiments are clearly presented, and the quantification and statistical analyses are rigorous. I nevertheless have several points that should be addressed, either through additional experiments or through further discussion in the manuscript.

      Main Points

      1. Suppressing the effect of activated Tkv (TkvQD) by mad depletion is indeed good yet indirect evidence for the involvement of canonical (Mad-dependent) TGF-β signaling. I believe a more direct way to reach this conclusion would be the generation of anterior mad loss-of-function clones, which should mimic the tkv8 phenotypes.
      2. On a more general note, most of the results of the paper are based on the hyperactivation of the pathway using TkvQD overexpression. I find this limiting for two reasons: First, the levels of TGF-β signaling are abnormally high under these conditions. In this context, the interpretation of the contribution of TGF-β-induced MyoII and MyoII activity is unclear. The authors find that TGF-β signaling activates MyoII activity; however, inhibiting actomyosin contractility by various means did not restore vertex opening. This is, however, at levels of Tkv activity that are far beyond normal (TkvQD). At the same time, the same manipulations are sufficient to open vertices in cells that experience peak, endogenous levels of Tkv activity (anterior cells). Does endogenous Tkv signaling induce MyoII, MyoII activity, Rho1 in anterior cells? Addressing this in tkv8 mosaics would be helpful. I can imagine that, unlike Cadherin, which seems to be epistatic to TkvQD, it is very difficult to exclude a contribution of TGF-β-mediated actomyosin contractility and there is probably not a good experiment to address this. However, I do not agree with the statement of line 174 "Although.... MyoII activity is dispensable for TGF-β-mediated inhibition of vertex opening..." I think it would be more appropriate to state that "MyoII is dispensable for the abnormally/experimentally high TGF-β signaling-mediated inhibition of vertex opening...". The explanation would be that under these conditions the exceptionally high TGF-β signaling bypasses the need for MyoII (maybe through exceptionally high adhesion). This is apparently not the case at physiological levels of TGF-β signaling in anterior cells. Second, high levels of TkvQD, a protein that has been found to localize at junctions in other systems, might have secondary effects on vertex opening, for example by affecting their structural integrity or even by affecting endocytosis.
      3. The effects of clonal manipulation of TGF-β signaling within the clones are clear and solid. Although this would not affect the statements of this paper, it would be good if the authors could comment on the effects at clone boundaries. What happens to "hybrid" TCJ when wild-type cells (at the respective position and patency status) meet a clone with elevated or reduced TGF-β signaling?
      4. From a TGF-β signaling-centric point of view: In this and other tissues, most of the TGF-β signaling effects are mediated through the transcriptional repressor Brinker. The pattern of Brk expression at the patency stage is inverse to the pMad/TGF-β signaling activity (pMad represses brk transcription) and would in principle be identical in its graded profile to the pattern of vertex opening. Did the authors try to manipulate levels of Brk? Is it possible to restore tkv8 phenotypes by simultaneously depleting brk?

      Minor points

      • Contrary to what is stated, not all egg chambers seem to be at stage 10A in Fig. 1. Are the eggs shown in C older?
      • The box in 2A is very hard to see
      • It is hard to correlate the dad::GFP-nls staining of 2A with the intensity profile of 2B. Is the quantification really at the sub-apical region as stated in the legend?

      Significance

      The findings of this study are highly significant and likely to be of broad interest, as they establish a strong link between a signaling pathway (TGF-β signaling), best known for its role in gene expression and tissue patterning, and a highly dynamic cellular process-the remodeling of epithelial junctions that regulates epithelial permeability. While the involvement of TGF-β signaling in this process is not entirely new (see Row et al., iScience, 2021), the present study provides a more detailed analysis and offers a molecular explanation linking TGF-β signaling to epithelial junction patency.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Amal et al. investigate how canonical TGF-β signaling regulates tricellular junction (TCJ) remodeling during follicular patency in the Drosophila ovarian follicular epithelium. Using genetic mosaics, quantitative imaging, and perturbations of signaling and cytoskeletal pathways, the authors show that TGF-β signaling suppresses patency in a cell-autonomous manner.

      The authors convincingly show that TGF-β signaling prevents remodeling of tricellular junctions (TCJs) during patency. The figures and quantitative analyses are of an excellent standard, and I commend the authors on the clarity of their data presentation. Previous work from this laboratory demonstrated that patency is regulated by actomyosin activity. In the present study, the authors show that although TGF-β signaling increases actomyosin contractility, perturbation of downstream effectors of actomyosin contractility does not rescue the patency defect caused by constitutively active TGF-β signaling. This is a surprising and interesting result.

      The authors then show that TGF-β regulates patency through effects on E-Cadherin. However, the mechanism by which TGF-β signaling regulates E-Cad remains somewhat unclear. Although the authors show that E-Cad levels appear elevated when TGF-β signaling is activated, E-Cad overexpression alone does not affect patency. The authors also test whether the effect reflects a broader change in adhesion proteins by examining Fas2 and N-Cad, which appear unchanged, suggesting that the effect is specific to E-Cad.

      The introduction and discussion are scholarly and cite the appropriate literature. Overall, the manuscript is rigorous, clearly presented, and ready for publication.

      The experimental approaches are described in sufficient detail to allow reproduction, and the statistical analysis and quantification appear appropriate. The experiments appear adequately replicated, and the presentation of the quantitative data is clear.

      Major comments:

      N numbers (cells/egg chambers) appear to be missing for the experiments. Please add these details.

      The single images in the supplementary figures showing no change in the localization of Fas2 and N-Cad are not convincing. The authors should quantify these data.

      Minor comments:

      Figure 2A: Instead of sagittal sections through egg chambers, it may be more informative to show the imaging plane that highlights the surrounding follicular epithelium, which would better illustrate the spatial organization of the follicle cells.

      Lines 73-85: Consider referring the reader to Figure 1A earlier in the text to help orient the reader to the architecture of the egg chamber.

      It would also be helpful to include the abbreviation CPFC in the schematic in Figure 1A to make the terminology consistent with the text.

      Significance

      This is an exceptionally well-written and well-presented manuscript. The story presented is logical and the work is carefully executed with top-level figures and quantification. The manuscript is logically organized and controls and statistical tests are appropriate. The authors provide convincing evidence through careful genetic manipulations that TGF-β signaling suppresses vertex opening primarily by reinforcing E-Cad-dependent adhesion rather than through actomyosin contractility.

      A particular strength of the study is the clear dissection of two potential pathways downstream of TGF-β signaling that could regulate patency, actomyosin contractility and E-Cad-mediated adhesion, and the demonstration that the suppression of patency depends primarily on E-Cad function. The manuscript represents a conceptual advance over the lab's previous work by demonstrating that patency is regulated by an upstream signaling pathway. Whereas earlier studies from this group established the cell biological mechanism of patency, this work shows that TGF-β signaling acts as a regulatory input controlling this process.

      The main limitation of the study is that the downstream molecular mechanism linking TGF-β signaling to stabilization of E-Cad at tricellular vertices remains only partially defined. While the authors show that TGF-β signaling increases E-Cad levels and promotes its retention at vertices, many questions remain as to how this is achieved. The data implicate p120-catenin as a possible contributor, but it does not appear to be required, leaving the mechanistic basis of E-Cad stabilization incompletely resolved.

      The primary advance of the study is conceptual and mechanistic, showing that morphogen signaling can control TCJ integrity by stabilizing cadherin-based adhesion independently of actomyosin contractility. The work therefore advances our understanding of how epithelial junction remodeling is regulated during development in the common model system of the Drosophila ovary.

      In my opinion, the manuscript is exceptionally well presented and appropriate for publication essentially as-is.

      The primary audience for this work will be researchers studying epithelial biology, morphogenesis and developmental cell biology, primarily those working in Drosophila. The manuscript will also be of interest to the broader cell and developmental biology community because it provides evidence for how signaling pathways and morphogen patterning regulate epithelial architecture and barrier function.

      My expertise lies in epithelial morphogenesis, cell-cell adhesion, junction dynamics, and developmental cell biology and I use the Drosophila ovary as a model system. I reviewed the previous paper from this lab that went to Current Biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors explore how TGF-β signaling inhibits patency in the follicular epithelia of the Drosophila ovary. In this setting, patency is the opening of the tricellular junctions within the follicular epithelium (FE) covering the ovary to allow the transfer of yolk proteins into the underlying ovary. The authors first demonstrate that there is an inverse correlation between levels of Dpp signaling (based on a Dad-GFP reporter) and both the vertex (tricellular junction) opening size and the "circularity" of the FE cells, with Dpp signaling being highest at the anterior end. They show that activated Dpp signaling (Dad-GFP signal) is highest in the most anterior FE, as are the levels of F-actin and MyoII (mCherry reporter), and that ectopic activation of Dpp signaling (using an activated receptor) in posterior FE cells is sufficient to induce higher levels of Rho1, junctional F-actin and MyoII at the tricellular junctions. However, neither knockdown of Rho1 nor expression of a dominant negative form of MyoII has any impact on whether Dpp signaling blocks patency. Thus, although activated by Dpp signaling, MyoII activation is not required for Dpp to block patency. They show that Ecad is not present in the patent tricellular junctions, although it is present earlier, and that Dpp signaling is required for enhanced levels of Ecad in anterior FEs and is sufficient to induce Ecad transcription (based on a lacZ reporter in the Ecad gene) and to increase Ecad protein levels. They show that Ecad is required to block patency regardless of Dpp signaling. They show that MyoII activity is not required for Dpp enhancement of Ecad protein levels. They show that Dpp signaling can increase p120cat levels and that p120cat can increase Ecad levels. However, knockdown of p120cat has no effect on patency in either WT or Tkv-activated FEs.

      The experiments are nicely done and illustrated, and the paper is well written.

      I think the authors are overstating what they can conclude in both the title and abstract.

      Significance

      I think some of the conclusions cannot be made with the data in hand. Overall, the authors have shown that Dpp signaling enhances levels of several proteins that would be thought to block patency (Rho1, MyoII, F-actin, p120cat, and Ecad (transcriptionally)). They have shown that, except for Ecad, knockdown of most of these does not affect Dpp-dependent patency. However, showing that patency is severely enhanced in both WT and Dpp-activated cells with loss of Ecad is not sufficient evidence that Dpp signaling works through Ecad. Taking away Ecad is going to cause near or complete loss of AJs; thus, it is no surprise that patency is enormously increased everywhere. Importantly, overexpression of Ecad (or of p120cat, which increases Ecad levels) did not block patency. Indeed, it seems that the only manipulation that mimics the effects of Dpp activation on patency is blocking endocytosis, so this seems a likely mechanism (it could also explain the higher levels of p120cat and/or Ecad at junctions). Overall, I agree that the authors can conclude that the Rho1 activation of MyoII observed downstream of Dpp signaling does not impact repression of patency. However, since overexpression of Ecad had no impact on patency, I think they can only conclude that Ecad expression is enhanced downstream of Dpp signaling but that this increase in Ecad expression is insufficient to block patency on its own. Thus, the title and abstract should be modified to more accurately reflect the conclusions that can be made.

      Minor suggestions

      Figure 1G. Please clearly indicate where the clone of tkv8 null cells is located within the follicular epithelium.

      In my opinion, both supplemental figures should be included in the main body of the paper. They make important points relevant to the conclusion. Figure S1 should be included as part of Figure 3. Figure S2 should be included as a stand-alone figure, as there are currently only six figures in the manuscript and the panel in that figure showing that blocking endocytosis blocks patency is an interesting and potentially relevant finding.

      In its current state, the paper is most appropriate for a specialized reader in the field of Drosophila oogenesis. If the authors were to follow up on a potential link between Dpp signaling and endocytosis and find such a link, then I think it would be of more general interest.

      The time estimate below is based on not doing major experiments. If the authors were to follow up on the observation regarding endocytosis, it would be more in the 6-month range.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      To improve the quality of this study, consider implementing strategic improvements that might enhance the significance of your intriguing findings. The results showing that pyruvate can circumvent IFT88 reliance represent a substantial change in our understanding of ciliary assembly; however, the paper would benefit from a more thorough examination of the mechanisms behind this physical development. Since IFT88 is usually seen as the main "elevator" for ciliary parts, figuring out if other proteins like IFT81 or IFT52 are being reused or increased when pyruvate is present will provide a clearer understanding of how this bypass works.

      While you have successfully outlined the signaling pathways linked to tubulin acetylation and detyrosination, the connection between histone acetylation and MAPK signaling poses a complex question. Figuring out whether EP300-mediated acetylation starts the MAPK cascade or works as a feedback loop, possibly through specific inhibition tests, would provide the clarity needed for scientific publication. Furthermore, given the pronounced impact shown in colonic fibroblasts, it would be prudent to investigate whether this pyruvate-induced ciliogenesis is a ubiquitous biological phenomenon by doing the same experiment in a conventional model, such as RPE1 cells. This would help determine whether you have discovered a fundamental metabolic principle of biology or a specific adaptation of the gastrointestinal system.

      Concerning the findings on tubulin detyrosination, there is a minor discrepancy: VASH inhibition influences ciliary length at elevated pyruvate concentrations, but the Western blots do not clearly show the predominant alterations in detyrosination at the same concentrations. To address this discrepancy, one could use high-resolution immunofluorescence to assess detyrosination selectively within the ciliary axoneme, rather than examining the entire cell. This would likely reveal the localized alterations indicated by your functional data. In the discussion of the DSS-induced colitis model, explaining how pyruvate works as both an energy source for colon cells and an antioxidant, along with its effects on cilia, would strengthen the case for its potential as a treatment. Adding this detail and clarifying which cell types are involved will elevate the paper from a niche discovery to an important addition to cell biology and mucosal immunology.

      Other prospective areas for improvement

      1. Analyzing the MAPK/histone acetylation feedback loop: The findings indicate that histone acetylation and MAPK signaling both play a role in pyruvate-induced ciliogenesis. Comment: As noted above, it is still unknown whether histone acetylation triggers MAPK signaling, the reverse, or whether the two form a feedback loop. Incorporating specific tests, such as assessing MAPK activity while blocking EP300 and vice versa, might elucidate this hierarchy.
      2. The article suggests that pyruvate's capability to bypass IFT88 may be exclusive to colonic fibroblasts or certain cell types. Comment: Evaluating this effect in a widely used ciliary model such as RPE1 or IMCD3 cells would substantially enhance the paper's significance by establishing whether this is a universal or specialized biological process.
      3. The work demonstrates that PC form in the absence of IFT88 when pyruvate is available, although it does not explain how the structure is assembled without this essential transport protein. Comment: Examining whether additional IFT proteins (such as IFT81 or IFT52) or alternative transport pathways are elevated or repurposed in the presence of pyruvate would significantly enhance the understanding of the "bypass" discovery.
      4. The authors noted that VASH inhibition (LV80) decreased PC length at both 2 mM and 10 mM pyruvate; however, bulk detyrosination alterations were only observable at 2 mM. Comment: Although the authors attribute this to the "higher sensitivity" of PC length measurements, high-resolution immunofluorescence quantification of the ciliary axoneme, rather than of overall cell levels, might furnish the necessary visual proof of detyrosination alterations at 10 mM.
      5. The authors appropriately recognize that pyruvate may have effects on colitis that are independent of PC. Comment: To give a more comprehensive picture of pyruvate's therapeutic advantages, it would be helpful to broaden the discussion to briefly clarify how its ciliary effects could work in conjunction with its recognized functions in antioxidant defense or epithelial energy metabolism.

      Significance

      The study identifies pyruvate as a distinctive environmental regulator of ciliary length and ciliogenesis in colonic fibroblasts (CF).

      A major discovery is that pyruvate can help produce primary cilia in cells lacking IFT88, challenging the earlier belief that IFT88 is essential for making primary cilia.

      The authors clearly explain the signaling pathways, showing that pyruvate affects the amount of primary cilia by changing tubulin acetylation (which involves acetyl-CoA and ATAT1) and influences the length of primary cilia by altering tubulin

      Evidence from experiments with Col6a1cre-Ift88flx/flx mice in a DSS-induced colitis model strongly supports the importance of these findings for both biology and potential treatments.